Paper Group ANR 1169
Wasserstein Distance based Deep Adversarial Transfer Learning for Intelligent Fault Diagnosis
Title | Wasserstein Distance based Deep Adversarial Transfer Learning for Intelligent Fault Diagnosis |
Authors | Cheng Cheng, Beitong Zhou, Guijun Ma, Dongrui Wu, Ye Yuan |
Abstract | The demand for adopting artificial intelligence in condition-based maintenance strategies has increased markedly over the past few years. Intelligent fault diagnosis is one critical topic in maintenance solutions for mechanical systems. Deep learning models, such as convolutional neural networks (CNNs), have been successfully applied to fault diagnosis tasks for mechanical systems and achieved promising results. However, under the diverse working conditions found in industry, deep learning suffers from two difficulties: first, the well-defined (source domain) and new (target domain) datasets have different feature distributions; second, insufficient or no labelled data in the target domain significantly reduces the accuracy of fault diagnosis. Deep transfer learning (DTL) was developed to perform learning in the target domain by leveraging information from the relevant source domain. Inspired by the Wasserstein distance of optimal transport, in this paper we propose a novel DTL approach to intelligent fault diagnosis, namely Wasserstein Distance based Deep Transfer Learning (WD-DTL), to learn domain feature representations (generated by a CNN-based feature extractor) and to minimize the distance between the source and target domain distributions through adversarial training. The effectiveness of the proposed WD-DTL is verified through 3 transfer scenarios and 16 transfer fault diagnosis experiments covering both unsupervised and supervised (with insufficient labelled data) learning. We also provide a comprehensive analysis of the network visualization of those transfer tasks. |
Tasks | Transfer Learning |
Published | 2019-03-02 |
URL | http://arxiv.org/abs/1903.06753v1 |
http://arxiv.org/pdf/1903.06753v1.pdf | |
PWC | https://paperswithcode.com/paper/wasserstein-distance-based-deep-adversarial |
Repo | |
Framework | |
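The quantity this abstract builds on can be illustrated in one dimension. The sketch below is not the WD-DTL method, which per the abstract works on CNN features through adversarial training; it only computes the empirical 1-Wasserstein distance between two equal-size samples of a single scalar feature, where the optimal transport plan simply pairs sorted values. The function name and the sample values are hypothetical.

```python
def wasserstein_1d(xs, ys):
    """Empirical 1-Wasserstein distance between two equal-size 1-D samples.

    Assumption of this sketch: both samples have the same size and uniform
    weights, so the optimal coupling matches the i-th smallest value of one
    sample with the i-th smallest value of the other.
    """
    if not xs or len(xs) != len(ys):
        raise ValueError("this sketch needs two non-empty, equal-size samples")
    xs_sorted, ys_sorted = sorted(xs), sorted(ys)
    return sum(abs(a - b) for a, b in zip(xs_sorted, ys_sorted)) / len(xs)


# Hypothetical scalar feature observed in a source and a target domain.
source_features = [0.0, 1.0, 2.0, 3.0]
target_features = [1.0, 2.0, 3.0, 4.0]
shift = wasserstein_1d(source_features, target_features)
print(shift)  # 1.0: every source value has to move by one unit
```

A domain-adaptation method that drives such a distance toward zero is making the source and target feature distributions harder to tell apart, which is the intuition behind the objective described above.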
AdversarialNAS: Adversarial Neural Architecture Search for GANs
Title | AdversarialNAS: Adversarial Neural Architecture Search for GANs |
Authors | Chen Gao, Yunpeng Chen, Si Liu, Zhenxiong Tan, Shuicheng Yan |
Abstract | Neural Architecture Search (NAS) that aims to automate the procedure of architecture design has achieved promising results in many computer vision fields. In this paper, we propose an AdversarialNAS method specially tailored for Generative Adversarial Networks (GANs) to search for a superior generative model on the task of unconditional image generation. The proposed method leverages an adversarial searching mechanism to search for the architectures of generator and discriminator simultaneously in a differentiable manner. Therefore, the searching algorithm considers the relevance and balance between the two networks leading to search for a superior generative model. Besides, AdversarialNAS does not need any extra evaluation metric to evaluate the performance of the architecture in each searching iteration, which is very efficient and can take only 1 GPU day to search for an optimal network architecture in a large search space ($10^{38}$). Experiments demonstrate the effectiveness and superiority of our method. The discovered generative model sets a new state-of-the-art FID score of $10.87$ and highly competitive Inception Score of $8.74$ on CIFAR-10. Its transferability is also proven by setting new state-of-the-art FID score of $26.98$ and Inception score of $9.63$ on STL-10. Our code will be released to facilitate the related academic and industrial study. |
Tasks | Image Generation, Neural Architecture Search |
Published | 2019-12-04 |
URL | https://arxiv.org/abs/1912.02037v1 |
https://arxiv.org/pdf/1912.02037v1.pdf | |
PWC | https://paperswithcode.com/paper/adversarialnas-adversarial-neural |
Repo | |
Framework | |
Red Dragon AI at TextGraphs 2019 Shared Task: Language Model Assisted Explanation Generation
Title | Red Dragon AI at TextGraphs 2019 Shared Task: Language Model Assisted Explanation Generation |
Authors | Yew Ken Chia, Sam Witteveen, Martin Andrews |
Abstract | The TextGraphs-13 Shared Task on Explanation Regeneration asked participants to develop methods to reconstruct gold explanations for elementary science questions. Red Dragon AI’s entries used the language of the questions and explanation text directly, rather than constructing a separate graph-like representation. Our leaderboard submission placed us 3rd in the competition, but we present here three methods of increasing sophistication, each of which scored successively higher on the test set after the competition closed. |
Tasks | Language Modelling |
Published | 2019-11-20 |
URL | https://arxiv.org/abs/1911.08976v1 |
https://arxiv.org/pdf/1911.08976v1.pdf | |
PWC | https://paperswithcode.com/paper/red-dragon-ai-at-textgraphs-2019-shared-task-1 |
Repo | |
Framework | |
Online information of vaccines: information quality is an ethical responsibility of search engines
Title | Online information of vaccines: information quality is an ethical responsibility of search engines |
Authors | Pietro Ghezzi, Peter G Bannister, Gonzalo Casino, Alessia Catalani, Michel Goldman, Jessica Morley, Marie Neunez, Andreu Prados, Mariarosaria Taddeo, Tania Vanzolini, Luciano Floridi |
Abstract | The fact that internet companies may record our personal data and track our online behavior for commercial or political purposes has emphasized aspects related to online privacy. This has also led to the development of search engines that promise privacy and no tracking. Search engines also have a major role in spreading low-quality health information such as that of anti-vaccine websites. This study investigates the relationship between search engines’ approach to privacy and the scientific quality of the information they return. We analyzed the first 30 webpages returned when searching ‘vaccines autism’ in English, Spanish, Italian and French. The results show that alternative search engines (Duckduckgo, Ecosia, Qwant, Swisscows and Mojeek) may return more anti-vaccine pages (10 to 53 percent) than Google.com (zero). Some localized versions of Google, however, returned more anti-vaccine webpages (up to 10 percent) than Google.com. Our study suggests that designing a search engine that is privacy savvy and avoids issues with filter bubbles that can result from user tracking is necessary but insufficient; instead, mechanisms should be developed to test search engines from the perspective of information quality (particularly for health-related webpages), before they can be deemed trustworthy providers of public health information. |
Tasks | |
Published | 2019-12-02 |
URL | https://arxiv.org/abs/1912.00898v1 |
https://arxiv.org/pdf/1912.00898v1.pdf | |
PWC | https://paperswithcode.com/paper/online-information-of-vaccines-information |
Repo | |
Framework | |
A multi-label, dual-output deep neural network for automated bug triaging
Title | A multi-label, dual-output deep neural network for automated bug triaging |
Authors | Christopher A. Choquette-Choo, David Sheldon, Jonny Proppe, John Alphonso-Gibbs, Harsha Gupta |
Abstract | Bug tracking enables the monitoring and resolution of issues and bugs within organizations. Bug triaging, or assigning bugs to the owner(s) who will resolve them, is a critical component of this process because there are many incorrect assignments that waste developer time and reduce bug resolution throughput. In this work, we explore the use of a novel two-output deep neural network architecture (Dual DNN) for triaging a bug to both an individual team and developer, simultaneously. Dual DNN leverages this simultaneous prediction by exploiting its own guess of the team classes to aid in developer assignment. A multi-label classification approach is used for each of the two outputs to learn from all interim owners, not just the last one who closed the bug. We make use of a heuristic combination of the interim owners (owner-importance-weighted labeling) which is converted into a probability mass function (pmf). We employ a two-stage learning scheme, whereby the team portion of the model is trained first and then held static to train the team–developer and bug–developer relationships. The scheme employed to encode the team–developer relationships is based on an organizational chart (org chart), which renders the model robust to organizational changes as it can adapt to role changes within an organization. There is an observed average lift (with respect to both team and developer assignment) of 13%-points in 11-fold incremental-learning cross-validation (IL-CV) accuracy for Dual DNN utilizing owner-weighted labels compared with the traditional multi-class classification approach. Furthermore, Dual DNN with owner-weighted labels achieves average 11-fold IL-CV accuracies of 76% (team assignment) and 55% (developer assignment), outperforming reference models by 14%- and 25%-points, respectively, on a proprietary dataset with 236,865 entries. |
Tasks | Multi-Label Classification |
Published | 2019-10-13 |
URL | https://arxiv.org/abs/1910.05835v1 |
https://arxiv.org/pdf/1910.05835v1.pdf | |
PWC | https://paperswithcode.com/paper/a-multi-label-dual-output-deep-neural-network |
Repo | |
Framework | |
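The idea of turning a bug's ownership history into a soft label can be sketched as follows. The abstract does not spell out the owner-importance heuristic, so the weighting here (every interim owner counts once, the final owner counts `final_owner_weight` times) is purely an assumption for illustration, as are the function and owner names.

```python
from collections import defaultdict


def owner_pmf(history, final_owner_weight=2.0):
    """Convert an ordered list of bug owners into a probability mass function.

    Hypothetical weighting: interim owners get weight 1.0 and the last owner
    (the one who closed the bug) gets `final_owner_weight`. The paper's actual
    heuristic may differ.
    """
    if not history:
        raise ValueError("history must contain at least one owner")
    if final_owner_weight <= 0:
        raise ValueError("final_owner_weight must be positive")
    scores = defaultdict(float)
    for position, owner in enumerate(history):
        is_last = position == len(history) - 1
        scores[owner] += final_owner_weight if is_last else 1.0
    total = sum(scores.values())
    # Normalise so the weights sum to 1 and can serve as a training target.
    return {owner: score / total for owner, score in scores.items()}


pmf = owner_pmf(["alice", "bob", "carol"])
print(pmf)  # {'alice': 0.25, 'bob': 0.25, 'carol': 0.5}
```

A target of this shape lets a classifier learn from every owner who touched the bug rather than from a single one-hot label for the person who closed it.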
Evolution of Hierarchical Structure & Reuse in iGEM Synthetic DNA Sequences
Title | Evolution of Hierarchical Structure & Reuse in iGEM Synthetic DNA Sequences |
Authors | Payam Siyari, Bistra Dilkina, Constantine Dovrolis |
Abstract | Many complex systems, both in technology and nature, exhibit hierarchical modularity: smaller modules, each of them providing a certain function, are used within larger modules that perform more complex functions. Previously, we have proposed a modeling framework, referred to as Evo-Lexis, that provides insight into some fundamental questions about evolving hierarchical systems. The predictions of the Evo-Lexis model should be tested using real data from evolving systems in which the outputs can be well represented by sequences. In this paper, we investigate the time series of iGEM synthetic DNA dataset sequences, and whether the resulting iGEM hierarchies exhibit the qualitative properties predicted by the Evo-Lexis framework. Contrary to Evo-Lexis, in iGEM the amount of reuse decreases during the timeline of the dataset. Although this results in the development of less cost-efficient and less deep Lexis-DAGs, the dataset exhibits a bias in reusing specific nodes more often than others. As a result, the Lexis-DAGs take the shape of an hourglass with relatively high H-score values and a stable set of core nodes. Despite the reuse bias and stability of the core set, the dataset presents a high amount of diversity among the targets, which is in line with the modeling of Evo-Lexis. |
Tasks | Time Series |
Published | 2019-06-06 |
URL | https://arxiv.org/abs/1906.02446v1 |
https://arxiv.org/pdf/1906.02446v1.pdf | |
PWC | https://paperswithcode.com/paper/evolution-of-hierarchical-structure-reuse-in |
Repo | |
Framework | |
Optimization under Uncertainty in the Era of Big Data and Deep Learning: When Machine Learning Meets Mathematical Programming
Title | Optimization under Uncertainty in the Era of Big Data and Deep Learning: When Machine Learning Meets Mathematical Programming |
Authors | Chao Ning, Fengqi You |
Abstract | This paper reviews recent advances in the field of optimization under uncertainty via a modern data lens, highlights key research challenges and promise of data-driven optimization that organically integrates machine learning and mathematical programming for decision-making under uncertainty, and identifies potential research opportunities. A brief review of classical mathematical programming techniques for hedging against uncertainty is first presented, along with their wide spectrum of applications in Process Systems Engineering. A comprehensive review and classification of the relevant publications on data-driven distributionally robust optimization, data-driven chance constrained program, data-driven robust optimization, and data-driven scenario-based optimization is then presented. This paper also identifies fertile avenues for future research that focuses on a closed-loop data-driven optimization framework, which allows the feedback from mathematical programming to machine learning, as well as scenario-based optimization leveraging the power of deep learning techniques. Perspectives on online learning-based data-driven multistage optimization with a learning-while-optimizing scheme are presented. |
Tasks | Decision Making, Decision Making Under Uncertainty |
Published | 2019-04-03 |
URL | http://arxiv.org/abs/1904.01934v1 |
http://arxiv.org/pdf/1904.01934v1.pdf | |
PWC | https://paperswithcode.com/paper/optimization-under-uncertainty-in-the-era-of |
Repo | |
Framework | |
Extreme Multi-Label Legal Text Classification: A case study in EU Legislation
Title | Extreme Multi-Label Legal Text Classification: A case study in EU Legislation |
Authors | Ilias Chalkidis, Manos Fergadiotis, Prodromos Malakasiotis, Nikolaos Aletras, Ion Androutsopoulos |
Abstract | We consider the task of Extreme Multi-Label Text Classification (XMTC) in the legal domain. We release a new dataset of 57k legislative documents from EURLEX, the European Union’s public document database, annotated with concepts from EUROVOC, a multidisciplinary thesaurus. The dataset is substantially larger than previous EURLEX datasets and suitable for XMTC, few-shot and zero-shot learning. Experimenting with several neural classifiers, we show that BIGRUs with self-attention outperform the current multi-label state-of-the-art methods, which employ label-wise attention. Replacing CNNs with BIGRUs in label-wise attention networks leads to the best overall performance. |
Tasks | Multi-Label Text Classification, Text Classification, Zero-Shot Learning |
Published | 2019-05-26 |
URL | https://arxiv.org/abs/1905.10892v1 |
https://arxiv.org/pdf/1905.10892v1.pdf | |
PWC | https://paperswithcode.com/paper/extreme-multi-label-legal-text-classification |
Repo | |
Framework | |
Learning Visually Consistent Label Embeddings for Zero-Shot Learning
Title | Learning Visually Consistent Label Embeddings for Zero-Shot Learning |
Authors | Berkan Demirel, Ramazan Gokberk Cinbis, Nazli Ikizler-Cinbis |
Abstract | In this work, we propose a zero-shot learning method to effectively model knowledge transfer between classes via jointly learning visually consistent word vectors and a label embedding model in an end-to-end manner. The main idea is to project the vector space word vectors of attributes and classes into the visual space such that word representations of semantically related classes become closer, and use the projected vectors in the proposed embedding model to identify unseen classes. We evaluate the proposed approach on two benchmark datasets and the experimental results show that our method yields significant improvements in recognition accuracy. |
Tasks | Transfer Learning, Zero-Shot Learning |
Published | 2019-05-16 |
URL | https://arxiv.org/abs/1905.06764v1 |
https://arxiv.org/pdf/1905.06764v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-visually-consistent-label-embeddings |
Repo | |
Framework | |
Unified Generator-Classifier for Efficient Zero-Shot Learning
Title | Unified Generator-Classifier for Efficient Zero-Shot Learning |
Authors | Ayyappa Kumar Pambala, Titir Dutta, Soma Biswas |
Abstract | Generative models have achieved state-of-the-art performance for the zero-shot learning problem, but they require re-training the classifier every time a new object category is encountered. The traditional semantic embedding approaches, though very elegant, usually do not perform on par with their generative counterparts. In this work, we propose a unified framework termed GenClass, which integrates the generator with the classifier for efficient zero-shot learning, thus combining the representative power of the generative approaches and the elegance of the embedding approaches. End-to-end training of the unified framework not only eliminates the requirement of an additional classifier for new object categories as in the generative approaches, but also facilitates the generation of more discriminative and useful features. Extensive evaluation on three standard zero-shot object classification datasets, namely AWA, CUB and SUN, shows the effectiveness of the proposed approach. Without any modification, the approach also gives state-of-the-art performance for zero-shot action classification, thus showing its generalizability to other domains. |
Tasks | Action Classification, Object Classification, Zero-Shot Learning |
Published | 2019-05-11 |
URL | https://arxiv.org/abs/1905.04511v1 |
https://arxiv.org/pdf/1905.04511v1.pdf | |
PWC | https://paperswithcode.com/paper/unified-generator-classifier-for-efficient |
Repo | |
Framework | |
Notes on Latent Structure Models and SPIGOT
Title | Notes on Latent Structure Models and SPIGOT |
Authors | André F. T. Martins, Vlad Niculae |
Abstract | These notes aim to shed light on the recently proposed structured projected intermediate gradient optimization technique (SPIGOT, Peng et al., 2018). SPIGOT is a variant of the straight-through estimator (Bengio et al., 2013) which bypasses gradients of the argmax function by back-propagating a surrogate “gradient.” We provide a new interpretation to the proposed gradient and put this technique into perspective, linking it to other methods for training neural networks with discrete latent variables. As a by-product, we suggest alternate variants of SPIGOT which will be further explored in future work. |
Tasks | |
Published | 2019-07-24 |
URL | https://arxiv.org/abs/1907.10348v1 |
https://arxiv.org/pdf/1907.10348v1.pdf | |
PWC | https://paperswithcode.com/paper/notes-on-latent-structure-models-and-spigot |
Repo | |
Framework | |
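The straight-through estimator that these notes take as their starting point can be sketched in a few lines. This is only the plain straight-through idea (Bengio et al., 2013) for an unstructured argmax, not SPIGOT itself; the function names and the toy downstream loss are hypothetical.

```python
def argmax_one_hot(scores):
    """Forward pass: pick the highest-scoring option as a one-hot vector."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return [1.0 if i == best else 0.0 for i in range(len(scores))]


def straight_through_backward(grad_wrt_one_hot):
    """Backward pass: pretend argmax was the identity function.

    The true gradient of argmax is zero almost everywhere, so the estimator
    copies the downstream gradient onto the scores unchanged.
    """
    return list(grad_wrt_one_hot)


# Toy downstream loss: loss = sum(w_i * z_i), so d(loss)/dz = w.
scores = [0.1, 2.0, -1.0]
downstream_weights = [0.5, -0.2, 0.3]

z = argmax_one_hot(scores)
loss = sum(w * zi for w, zi in zip(downstream_weights, z))
surrogate_grad_scores = straight_through_backward(downstream_weights)
print(z, loss, surrogate_grad_scores)
```

The surrogate gradient is biased, since it ignores how the discrete choice actually depends on the scores, which is why variants that modify this backward step are of interest.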
Map Enhanced Route Travel Time Prediction using Deep Neural Networks
Title | Map Enhanced Route Travel Time Prediction using Deep Neural Networks |
Authors | Soumi Das, Rajath Nandan Kalava, Kolli Kiran Kumar, Akhil Kandregula, Kalpam Suhaas, Sourangshu Bhattacharya, Niloy Ganguly |
Abstract | Travel time estimation is a fundamental problem in transportation science with extensive literature. The study of these techniques has intensified due to the availability of many large public trip datasets. Recently developed deep learning based models have improved the generality and performance and have focused on estimating times for individual sub-trajectories and aggregating them to predict the travel time of the entire trajectory. However, these techniques ignore the road network information. In this work, we propose and study techniques for incorporating road networks along with historical trips’ data into travel time prediction. We incorporate both node embeddings as well as road distance into the existing model. Experiments on large real-world benchmark datasets suggest improved performance, especially when the training data is small. As expected, the proposed method performs better than the baseline when there is a larger difference between road distance and Vincenty distance between start and end points. |
Tasks | |
Published | 2019-11-06 |
URL | https://arxiv.org/abs/1911.02623v1 |
https://arxiv.org/pdf/1911.02623v1.pdf | |
PWC | https://paperswithcode.com/paper/map-enhanced-route-travel-time-prediction |
Repo | |
Framework | |
Amortized Object and Scene Perception for Long-term Robot Manipulation
Title | Amortized Object and Scene Perception for Long-term Robot Manipulation |
Authors | Ferenc Balint-Benczedi, Michael Beetz |
Abstract | Mobile robots, performing long-term manipulation activities in human environments, have to perceive a wide variety of objects possessing very different visual characteristics and need to reliably keep track of these throughout the execution of a task. In order to be efficient, robot perception capabilities need to go beyond what is currently perceivable and should be able to answer queries about both current and past scenes. In this paper we investigate a perception system for long-term robot manipulation that keeps track of the changing environment and builds a representation of the perceived world. Specifically we introduce an amortized component that spreads perception tasks throughout the execution cycle. The resulting query driven perception system asynchronously integrates results from logged images into a symbolic and numeric (what we call sub-symbolic) representation that forms the perceptual belief state of the robot. |
Tasks | |
Published | 2019-03-28 |
URL | http://arxiv.org/abs/1903.12302v1 |
http://arxiv.org/pdf/1903.12302v1.pdf | |
PWC | https://paperswithcode.com/paper/amortized-object-and-scene-perception-for |
Repo | |
Framework | |
Essential Sentences for Navigating Stack Overflow Answers
Title | Essential Sentences for Navigating Stack Overflow Answers |
Authors | Sarah Nadi, Christoph Treude |
Abstract | Stack Overflow (SO) has become an essential resource for software development. Despite its success and prevalence, navigating SO remains a challenge. Ideally, SO users could benefit from highlighted navigational cues that help them decide if an answer is relevant to their task and context. Such navigational cues could be in the form of essential sentences that help the searcher decide whether they want to read the answer or skip over it. In this paper, we compare four potential approaches for identifying essential sentences. We adopt two existing approaches and develop two new approaches based on the idea that contextual information in a sentence (e.g., “if using windows”) could help identify essential sentences. We compare the four techniques using a survey of 43 participants. Our participants indicate that it is not always easy to figure out what the best solution for their specific problem is, given the options, and that they would indeed like to easily spot contextual information that may narrow down the search. Our quantitative comparison of the techniques shows that there is no single technique sufficient for identifying essential sentences that can serve as navigational cues, while our qualitative analysis shows that participants valued explanations and specific conditions, and did not value filler sentences or speculations. Our work sheds light on the importance of navigational cues, and our findings can be used to guide future research to find the best combination of techniques to identify such cues. |
Tasks | |
Published | 2019-12-31 |
URL | https://arxiv.org/abs/1912.13455v1 |
https://arxiv.org/pdf/1912.13455v1.pdf | |
PWC | https://paperswithcode.com/paper/essential-sentences-for-navigating-stack |
Repo | |
Framework | |
Busca de melhor caminho entre múltiplas origens e múltiplos destinos em redes complexas que representam cidades
Title | Busca de melhor caminho entre múltiplas origens e múltiplos destinos em redes complexas que representam cidades |
Authors | Daniel Aragão Abreu Filho |
Abstract | This paper investigates the use of a search strategy for the problem of finding the best path between multiple origins and multiple destinations. In this kind of problem, one must decide, among a large number of combinations, which origin and which destination are best, as well as the best path between the two. A notable difficulty in this sort of problem is performing the search in a short time. This monograph extends previous research in which the problem was studied only on a bus network in the city of Fortaleza. The extension explores the search strategy on graphs that represent public roads in cities such as Fortaleza, Mumbai and Tokyo. Using this strategy with a heuristic based on the Haversine distance, we observed that the search time can be reduced substantially, at the cost of introducing an error because the heuristic function applied is no longer admissible. |
Tasks | |
Published | 2019-12-18 |
URL | https://arxiv.org/abs/1912.09987v1 |
https://arxiv.org/pdf/1912.09987v1.pdf | |
PWC | https://paperswithcode.com/paper/busca-de-melhor-caminho-entre-multiplas |
Repo | |
Framework | |
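The Haversine distance mentioned in this abstract is the great-circle distance between two latitude/longitude points, which is a common straight-line estimate for heuristic search on road graphs. The sketch below shows only that formula; how the author combines it with the search strategy is not described here. The function name and the mean Earth radius of 6371 km are assumptions of this example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # min() guards against rounding pushing the value slightly above 1.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


# Two points on the equator, 90 degrees of longitude apart:
# a quarter of a great circle, about 10,007.5 km with this radius.
quarter_circle = haversine_km(0.0, 0.0, 0.0, 90.0)
print(round(quarter_circle, 1))
```

Because a great-circle distance never exceeds the length of a real road path between the same two points, it is admissible when used directly as a distance estimate; the loss of admissibility reported in the abstract comes from how the heuristic is applied in that work.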