Paper Group ANR 1737
Deep convolutional autoencoder for cryptocurrency market analysis
Title | Deep convolutional autoencoder for cryptocurrency market analysis |
Authors | Vladimir Puzyrev |
Abstract | This study attempts to analyze patterns in cryptocurrency markets using a special type of deep neural network, namely a convolutional autoencoder. The method extracts the dominant features of market behavior and classifies the 40 studied cryptocurrencies into several classes for twelve 6-month periods starting from 15th May 2013. Transitions from one class to another over time are related to the maturation of cryptocurrencies. In speculative cryptocurrency markets, these findings have potential implications for investment and trading strategies. |
Tasks | |
Published | 2019-10-27 |
URL | https://arxiv.org/abs/1910.12281v1 |
https://arxiv.org/pdf/1910.12281v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-convolutional-autoencoder-for |
Repo | |
Framework | |
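A minimal sketch of the autoencoder idea behind the entry above, assuming hypothetical data shapes (40 coins, 8 market features) and substituting a tiny dense autoencoder trained by plain gradient descent for the paper's convolutional one; the latent codes would then be clustered into classes per period.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-in data: 40 coins x 8 market features for one period
X = rng.normal(size=(40, 8))
X -= X.mean(axis=0)

d_in, d_hid = 8, 3
W_enc = rng.normal(scale=0.1, size=(d_in, d_hid))
W_dec = rng.normal(scale=0.1, size=(d_hid, d_in))

def forward(X, W_enc, W_dec):
    H = X @ W_enc    # encoder: compress 8 features to 3 latent ones
    R = H @ W_dec    # decoder: reconstruct the 8 inputs
    return H, R

lr, losses = 0.01, []
for _ in range(500):
    H, R = forward(X, W_enc, W_dec)
    err = R - X
    losses.append(float((err ** 2).mean()))
    g_dec = H.T @ err / len(X)             # gradient of MSE w.r.t. decoder
    g_enc = X.T @ (err @ W_dec.T) / len(X) # gradient w.r.t. encoder
    W_dec -= lr * g_dec
    W_enc -= lr * g_enc

codes, _ = forward(X, W_enc, W_dec)  # latent features, ready for clustering
print(losses[0], losses[-1])         # reconstruction loss should drop
```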
Clinical Text Generation through Leveraging Medical Concept and Relations
Title | Clinical Text Generation through Leveraging Medical Concept and Relations |
Authors | Wangjin Lee, Hyeryun Park, Jooyoung Yoon, Kyeongmo Kim, Jinwook Choi |
Abstract | With a neural sequence generation model, this study aims to develop a method of writing patient clinical texts given a brief medical history. As a proof of concept, we have demonstrated that it is feasible to use medical concept embeddings in clinical text generation. Our model was based on the Sequence-to-Sequence architecture and trained with a large set of de-identified clinical text data. The quantitative results show that our concept embedding method decreased the perplexity of the baseline architecture. We also discuss the results of a human evaluation performed by medical doctors. |
Tasks | Text Generation |
Published | 2019-10-02 |
URL | https://arxiv.org/abs/1910.00861v1 |
https://arxiv.org/pdf/1910.00861v1.pdf | |
PWC | https://paperswithcode.com/paper/clinical-text-generation-through-leveraging |
Repo | |
Framework | |
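The entry above reports perplexity as its quantitative metric. A minimal sketch of how perplexity is computed from a model's per-token probabilities (standard definition, not code from the paper):

```python
import math

def perplexity(token_probs):
    """Perplexity = exp of the average negative log-likelihood of the
    probabilities the model assigned to the reference tokens."""
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)

# A model uniformly unsure over 4 candidate tokens has perplexity ~4;
# more confident (correct) predictions lower the perplexity.
print(perplexity([0.25, 0.25, 0.25, 0.25]))  # ~4.0
print(perplexity([0.5, 0.5]))                # ~2.0
```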
Privacy-Enhancing Context Authentication from Location-Sensitive Data
Title | Privacy-Enhancing Context Authentication from Location-Sensitive Data |
Authors | Pradip Mainali, Carlton Shepherd, Fabien A. P. Petitcolas |
Abstract | This paper proposes a new privacy-enhancing, context-aware user authentication system, ConSec, which transforms general location-sensitive data, such as GPS location, barometric altitude and noise levels, collected from the user’s device, into a representation based on locality-sensitive hashing (LSH). The resulting hashes provide a dimensionality reduction of the underlying data, which we leverage to model users’ behaviour for authentication using machine learning. We describe how ConSec supports learning from categorical and numerical data while addressing a number of on-device and network-based threats. ConSec is subsequently implemented for the Android platform and evaluated using data collected from 35 users, followed by a security and privacy analysis. We demonstrate that LSH is a useful approach for context authentication from location-sensitive data without directly using plain measurements. |
Tasks | Dimensionality Reduction |
Published | 2019-04-18 |
URL | https://arxiv.org/abs/1904.08800v2 |
https://arxiv.org/pdf/1904.08800v2.pdf | |
PWC | https://paperswithcode.com/paper/privacy-enhancing-context-authentication-from |
Repo | |
Framework | |
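A toy sketch of the LSH idea in the entry above, using sign-random-projection hashing over a hypothetical 3-D context vector (latitude, longitude, noise level); the real ConSec feature set and LSH family are more elaborate. Nearby contexts collide on most hash bits, so Hamming distance between hashes can stand in for context similarity without exposing plain measurements.

```python
import numpy as np

rng = np.random.default_rng(1)
planes = rng.normal(size=(16, 3))  # 16 random hyperplanes -> 16-bit hash

def lsh_bits(v):
    """Sign-random-projection LSH: one bit per hyperplane."""
    return (planes @ v > 0).astype(int)

home = np.array([52.1, 4.3, 40.0])                 # hypothetical context
near_home = home + rng.normal(scale=0.01, size=3)  # same place, sensor noise
elsewhere = np.array([-33.9, 151.2, 70.0])         # very different context

def hamming(a, b):
    return int(np.sum(a != b))

d_near = hamming(lsh_bits(home), lsh_bits(near_home))
d_far = hamming(lsh_bits(home), lsh_bits(elsewhere))
print(d_near, d_far)  # the nearby context differs on far fewer bits
```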
Convolutional Neural Network for Intrusion Detection System In Cyber Physical Systems
Title | Convolutional Neural Network for Intrusion Detection System In Cyber Physical Systems |
Authors | Gael Kamdem De Teyou, Junior Ziazet |
Abstract | The extensive use of Information and Communication Technology in critical infrastructures such as Industrial Control Systems makes them vulnerable to cyber-attacks. One particular class of cyber-attacks is advanced persistent threats, where highly skilled attackers can steal user authentication information and move through the network from host to host until a valuable target is reached. The attacker should be detected as soon as possible so that an appropriate response can be taken; otherwise the attacker will have enough time to reach sensitive assets. When facing intelligent threats, intelligent solutions have to be designed. Therefore, in this paper, we take advantage of recent progress in deep learning to build a convolutional neural network that can detect intrusions in cyber-physical systems. The Intrusion Detection System is applied to the NSL-KDD dataset, and the performance of the proposed approach is presented and compared with the state of the art. Results show the effectiveness of the technique. |
Tasks | Intrusion Detection |
Published | 2019-05-08 |
URL | https://arxiv.org/abs/1905.03168v2 |
https://arxiv.org/pdf/1905.03168v2.pdf | |
PWC | https://paperswithcode.com/paper/convolutional-neural-network-for-intrusion |
Repo | |
Framework | |
Learning Single Camera Depth Estimation using Dual-Pixels
Title | Learning Single Camera Depth Estimation using Dual-Pixels |
Authors | Rahul Garg, Neal Wadhwa, Sameer Ansari, Jonathan T. Barron |
Abstract | Deep learning techniques have enabled rapid progress in monocular depth estimation, but their quality is limited by the ill-posed nature of the problem and the scarcity of high quality datasets. We estimate depth from a single camera by leveraging the dual-pixel auto-focus hardware that is increasingly common on modern camera sensors. Classic stereo algorithms and prior learning-based depth estimation techniques under-perform when applied to this dual-pixel data, the former due to too-strong assumptions about RGB image matching, and the latter due to not leveraging the optics of dual-pixel image formation. To allow learning-based methods to work well on dual-pixel imagery, we identify an inherent ambiguity in the depth estimated from dual-pixel cues, and develop an approach to estimate depth up to this ambiguity. Using our approach, existing monocular depth estimation techniques can be effectively applied to dual-pixel data, and much smaller models can be constructed that still infer high quality depth. To demonstrate this, we capture a large dataset of in-the-wild 5-viewpoint RGB images paired with corresponding dual-pixel data, and show how view supervision with this data can be used to learn depth up to the unknown ambiguities. On our new task, our model is 30% more accurate than any prior work on learning-based monocular or stereoscopic depth estimation. |
Tasks | Depth Estimation, Monocular Depth Estimation |
Published | 2019-04-11 |
URL | https://arxiv.org/abs/1904.05822v3 |
https://arxiv.org/pdf/1904.05822v3.pdf | |
PWC | https://paperswithcode.com/paper/learning-single-camera-depth-estimation-using |
Repo | |
Framework | |
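The entry above estimates depth only "up to an ambiguity". Assuming, as in related dual-pixel work, that the ambiguity is affine in inverse depth, an evaluation metric must not penalize predictions that differ from ground truth by an unknown affine transform. A minimal sketch of such an affine-invariant error:

```python
import numpy as np

def affine_invariant_error(pred, gt):
    """Fit pred ~ a*gt + b by least squares, then measure the residual:
    predictions that are any affine transform of the ground truth
    score (near) zero error."""
    A = np.stack([gt, np.ones_like(gt)], axis=1)
    (a, b), *_ = np.linalg.lstsq(A, pred, rcond=None)
    return float(np.abs(a * gt + b - pred).mean())

gt = np.linspace(0.1, 1.0, 50)           # toy inverse-depth values
pred = 2.5 * gt - 0.3                    # correct up to the affine ambiguity
print(affine_invariant_error(pred, gt))  # ~0.0
```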
Deep Radiomics for Brain Tumor Detection and Classification from Multi-Sequence MRI
Title | Deep Radiomics for Brain Tumor Detection and Classification from Multi-Sequence MRI |
Authors | Subhashis Banerjee, Sushmita Mitra, Francesco Masulli, Stefano Rovetta |
Abstract | Glioma constitutes 80% of malignant primary brain tumors and is usually classified as HGG or LGG. LGG tumors are less aggressive, with a slower growth rate than HGG, and are responsive to therapy. Since tumor biopsy is challenging for brain tumor patients, noninvasive imaging techniques like Magnetic Resonance Imaging (MRI) have been extensively employed in diagnosing brain tumors. Automated systems for the detection and grading of tumors from MRI data therefore become necessary for assisting doctors in the framework of augmented intelligence. In this paper, we thoroughly investigate the power of deep ConvNets for the classification of brain tumors using multi-sequence MR images. We propose novel ConvNet models, which are trained from scratch on MRI patches, slices, and multi-planar volumetric slices. The suitability of transfer learning for the task is next studied by applying two existing ConvNet models (VGGNet and ResNet) trained on the ImageNet dataset, through fine-tuning of the last few layers. LOPO testing and testing on the holdout dataset are used to evaluate the performance of the ConvNets. Results demonstrate that the proposed ConvNets achieve better accuracy in all cases where the model is trained on the multi-planar volumetric dataset. Unlike conventional models, it obtains a testing accuracy of 95% for the low/high grade glioma classification problem. A score of 97% is achieved for the classification of LGG with/without 1p/19q codeletion, without any additional effort towards the extraction and selection of features. We study the properties of self-learned kernels/filters in different layers through visualization of the intermediate layer outputs. We also compare our results with those of state-of-the-art methods, demonstrating a maximum improvement of 7% on the grading performance of ConvNets and 9% on the prediction of 1p/19q codeletion status. |
Tasks | Transfer Learning |
Published | 2019-03-21 |
URL | http://arxiv.org/abs/1903.09240v1 |
http://arxiv.org/pdf/1903.09240v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-radiomics-for-brain-tumor-detection-and |
Repo | |
Framework | |
Fine-Grained Static Detection of Obfuscation Transforms Using Ensemble-Learning and Semantic Reasoning
Title | Fine-Grained Static Detection of Obfuscation Transforms Using Ensemble-Learning and Semantic Reasoning |
Authors | Ramtine Tofighi-Shirazi, Irina Mariuca Asavoae, Philippe Elbaz-Vincent |
Abstract | The ability to efficiently detect the software protections in use is at a premium when selecting and applying adequate deobfuscation techniques. We present a novel approach that combines semantic reasoning techniques with ensemble-learning classification to provide a static detection framework for obfuscation transformations. In contrast to existing work, we provide a methodology that can detect multiple layers of obfuscation without depending on knowledge of the underlying functionality of the training set used. We also extend our work to detect constructions of obfuscation transformations, thus providing a fine-grained methodology. To that end, we provide several studies on best practices for using machine learning techniques in a scalable and efficient model. According to our experimental results and evaluations on obfuscators such as Tigress and OLLVM, our models achieve up to 91% accuracy on state-of-the-art obfuscation transformations, and up to 100% overall accuracy on their constructions. |
Tasks | |
Published | 2019-11-18 |
URL | https://arxiv.org/abs/1911.07523v1 |
https://arxiv.org/pdf/1911.07523v1.pdf | |
PWC | https://paperswithcode.com/paper/fine-grained-static-detection-of-obfuscation |
Repo | |
Framework | |
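The combining step in the ensemble-learning framework above can be illustrated with a simple majority vote over base classifiers; the labels and classifiers here are hypothetical, and the paper's actual combination scheme and semantic features are richer than this.

```python
from collections import Counter

def majority_vote(per_classifier_labels):
    """Combine labels predicted by several base classifiers for one
    sample; the most common label wins (insertion order breaks ties)."""
    return Counter(per_classifier_labels).most_common(1)[0][0]

# Hypothetical per-sample predictions from three base classifiers:
# two see control-flow flattening ("cff"), one sees opaque predicates.
votes = ["cff", "cff", "opaque-predicates"]
print(majority_vote(votes))  # -> cff
```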
CLaRO: a Data-driven CNL for Specifying Competency Questions
Title | CLaRO: a Data-driven CNL for Specifying Competency Questions |
Authors | C. Maria Keet, Zola Mahlaza, Mary-Jane Antia |
Abstract | Competency Questions (CQs) for an ontology and similar artefacts aim to provide insights into the contents of an ontology and to demarcate its scope. The absence of a controlled natural language, tooling and automation to support the authoring of CQs has hampered their effective use in ontology development and evaluation. The few question templates that exist are based on informal analyses of a small number of CQs and have limited coverage of question types and sentence constructions. We aim to fill this gap by proposing a template-based CNL for authoring CQs, called CLaRO. For its design, we exploited a new dataset of 234 CQs that had been processed automatically into 106 patterns, which we analysed and used to design a template-based CNL, with an additional CNL model and XML serialisation. The CNL was evaluated with a subset of questions from the original dataset and with two sets of newly sourced CQs. The coverage of CLaRO, with its 93 main templates and 41 linguistic variants, is about 90% for unseen questions. CLaRO has the potential to streamline the formalisation of ontology content requirements and, given that about one third of the competency questions in the test sets turned out to be invalid questions, to assist in writing good questions. |
Tasks | |
Published | 2019-07-17 |
URL | https://arxiv.org/abs/1907.07378v1 |
https://arxiv.org/pdf/1907.07378v1.pdf | |
PWC | https://paperswithcode.com/paper/claro-a-data-driven-cnl-for-specifying |
Repo | |
Framework | |
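A template-based CNL like the one above can be checked mechanically: a question is covered if it instantiates some template. A toy sketch with two invented templates (the real CLaRO has 93 main templates plus 41 variants, and its slot syntax may differ; `[CE]` here stands for a content-element slot):

```python
import re

# Hypothetical CLaRO-style templates, not the actual template set
TEMPLATES = [
    "Which [CE] has [CE]?",
    "Is [CE] a [CE]?",
]

def template_to_regex(t):
    """Turn a template into a regex where each [CE] slot matches text."""
    return "^" + re.escape(t).replace(re.escape("[CE]"), r"(.+?)") + "$"

def matches_cnl(cq):
    return any(re.match(template_to_regex(t), cq) for t in TEMPLATES)

print(matches_cnl("Which algorithm has quadratic complexity?"))  # True
print(matches_cnl("Tell me everything about pizza"))             # False
```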
End-to-end Text-to-speech for Low-resource Languages by Cross-Lingual Transfer Learning
Title | End-to-end Text-to-speech for Low-resource Languages by Cross-Lingual Transfer Learning |
Authors | Tao Tu, Yuan-Jui Chen, Cheng-chieh Yeh, Hung-yi Lee |
Abstract | End-to-end text-to-speech (TTS) has shown great success given large quantities of paired text and speech data. However, laborious data collection remains difficult for at least 95% of the world’s languages, which hinders the development of TTS in those languages. In this paper, we aim to build TTS systems for such low-resource (target) languages where only very limited paired data are available. We show that such TTS systems can be effectively constructed by transferring knowledge from a high-resource (source) language. Since a model trained on the source language cannot be directly applied to the target language due to input space mismatch, we propose a method to learn a mapping between source and target linguistic symbols. Benefiting from this learned mapping, pronunciation information can be preserved throughout the transfer procedure. Preliminary experiments show that we only need around 15 minutes of paired data to obtain a relatively good TTS system. Furthermore, analytic studies demonstrate that the automatically discovered mapping correlates well with phonetic expertise. |
Tasks | Cross-Lingual Transfer, Transfer Learning |
Published | 2019-04-13 |
URL | https://arxiv.org/abs/1904.06508v2 |
https://arxiv.org/pdf/1904.06508v2.pdf | |
PWC | https://paperswithcode.com/paper/end-to-end-text-to-speech-for-low-resource |
Repo | |
Framework | |
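The symbol-mapping idea in the entry above can be sketched as nearest-neighbor matching between symbol embeddings; the paper learns its mapping jointly with the model, so this cosine-similarity lookup over hypothetical embeddings is a simplified stand-in.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical embeddings: 5 source-language symbols, 4 target symbols.
# The first three target symbols are copies of source symbols 1, 3, 0,
# simulating phonetically identical sounds across the two languages.
src_emb = rng.normal(size=(5, 8))
tgt_emb = np.vstack([src_emb[[1, 3, 0]], rng.normal(size=(1, 8))])

def map_symbols(tgt, src):
    """Map each target symbol to its nearest source symbol by cosine
    similarity, so target text can reuse the pre-trained source inputs."""
    tn = tgt / np.linalg.norm(tgt, axis=1, keepdims=True)
    sn = src / np.linalg.norm(src, axis=1, keepdims=True)
    return (tn @ sn.T).argmax(axis=1)

print(map_symbols(tgt_emb, src_emb))  # first three entries: 1, 3, 0
```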
Cross-Domain Ambiguity Detection using Linear Transformation of Word Embedding Spaces
Title | Cross-Domain Ambiguity Detection using Linear Transformation of Word Embedding Spaces |
Authors | Vaibhav Jain, Ruchika Malhotra, Sanskar Jain, Nishant Tanwar |
Abstract | The requirements engineering process is a crucial stage of the software development life cycle. It involves various stakeholders from different professional backgrounds, particularly in the requirements elicitation phase. Each stakeholder carries distinct domain knowledge, causing them to interpret certain words differently and leading to cross-domain ambiguity. This can result in misunderstandings among them and jeopardize the entire project. This paper proposes a natural language processing approach to find potentially ambiguous words for a given set of domains. The idea is to apply linear transformations to word embedding models trained on different domain corpora, bringing them into a unified embedding space. The approach then finds words with divergent embeddings, as they signify a variation in meaning across the domains. It can help a requirements analyst prevent misunderstandings during elicitation interviews and meetings by defining a set of potentially ambiguous terms in advance. The paper also discusses certain problems with existing approaches and how the proposed approach resolves them. |
Tasks | |
Published | 2019-10-28 |
URL | https://arxiv.org/abs/1910.12956v3 |
https://arxiv.org/pdf/1910.12956v3.pdf | |
PWC | https://paperswithcode.com/paper/cross-domain-ambiguity-detection-using-linear |
Repo | |
Framework | |
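The "linear transformation into a unified embedding space" step above is commonly solved via orthogonal Procrustes; a toy sketch under assumed data (a hypothetical shared vocabulary embedded in two domains, with one word whose meaning drifts):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4
# Hypothetical shared vocabulary: 10 words embedded in two domain corpora
A = rng.normal(size=(10, d))                      # domain-1 embeddings
R_true = np.linalg.qr(rng.normal(size=(d, d)))[0]
B = A @ R_true                                    # domain 2: a rotated copy...
B[0] += 2.0                                       # ...except word 0, which drifts

# Orthogonal Procrustes: W = argmin ||A W - B||_F over orthogonal W
U, _, Vt = np.linalg.svd(A.T @ B)
W = U @ Vt

drift = np.linalg.norm(A @ W - B, axis=1)  # per-word cross-domain divergence
ambiguous = int(drift.argmax())
print(ambiguous)  # word 0 is flagged as potentially ambiguous
```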
Weight Normalization based Quantization for Deep Neural Network Compression
Title | Weight Normalization based Quantization for Deep Neural Network Compression |
Authors | Wen-Pu Cai, Wu-Jun Li |
Abstract | With the development of deep neural networks, network models are becoming larger and larger. Model compression has become an urgent need for deploying these models on mobile or embedded devices. Model quantization is a representative model compression technique. Although many quantization methods have been proposed, many of them suffer from high quantization error caused by the long-tail distribution of network weights. In this paper, we propose a novel quantization method, called weight normalization based quantization (WNQ), for model compression. WNQ adopts weight normalization to avoid the long-tail distribution of network weights and thereby reduces the quantization error. Experiments on CIFAR-100 and ImageNet show that WNQ outperforms other baselines and achieves state-of-the-art performance. |
Tasks | Model Compression, Neural Network Compression, Quantization |
Published | 2019-07-01 |
URL | https://arxiv.org/abs/1907.00593v1 |
https://arxiv.org/pdf/1907.00593v1.pdf | |
PWC | https://paperswithcode.com/paper/weight-normalization-based-quantization-for |
Repo | |
Framework | |
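A sketch of the normalize-quantize-rescale pipeline suggested by the entry above. This is one interpretation, not the paper's exact scheme: standardise the weights, apply uniform quantization in the normalised space, then rescale back.

```python
import numpy as np

def quantize_normalized(w, bits=3):
    """Normalise weights, uniformly quantize to 2**bits levels in the
    normalised space, then de-normalise (an illustrative WNQ-like
    pipeline, not the paper's exact method)."""
    mu, sigma = w.mean(), w.std()
    z = (w - mu) / sigma                      # normalised weights
    levels = 2 ** bits
    lo, hi = z.min(), z.max()
    step = (hi - lo) / (levels - 1)
    zq = np.round((z - lo) / step) * step + lo
    return zq * sigma + mu                    # quantized weights

rng = np.random.default_rng(0)
w = rng.normal(size=256)
w_hat = quantize_normalized(w)
print(len(np.unique(w_hat)))  # at most 8 distinct values for 3 bits
```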
Simple, Scalable Adaptation for Neural Machine Translation
Title | Simple, Scalable Adaptation for Neural Machine Translation |
Authors | Ankur Bapna, Naveen Arivazhagan, Orhan Firat |
Abstract | Fine-tuning pre-trained Neural Machine Translation (NMT) models is the dominant approach for adapting to new languages and domains. However, fine-tuning requires adapting and maintaining a separate model for each target task. We propose a simple yet efficient approach for adaptation in NMT. Our approach consists of injecting tiny task-specific adapter layers into a pre-trained model. These lightweight adapters, with just a small fraction of the original model size, adapt the model to multiple individual tasks simultaneously. We evaluate our approach on two tasks: (i) Domain Adaptation and (ii) Massively Multilingual NMT. Experiments on domain adaptation demonstrate that our proposed approach is on par with full fine-tuning on various domains, dataset sizes and model capacities. On a massively multilingual dataset of 103 languages, our adaptation approach bridges the gap between individual bilingual models and one massively multilingual model for most language pairs, paving the way towards universal machine translation. |
Tasks | Domain Adaptation, Machine Translation |
Published | 2019-09-18 |
URL | https://arxiv.org/abs/1909.08478v1 |
https://arxiv.org/pdf/1909.08478v1.pdf | |
PWC | https://paperswithcode.com/paper/simple-scalable-adaptation-for-neural-machine |
Repo | |
Framework | |
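A minimal sketch of the residual bottleneck adapter described above, with hypothetical layer sizes. Zero-initialising the up-projection is a common trick that makes the untrained adapter an exact no-op, so injecting it does not perturb the pre-trained model; the paper's adapters also normalise the input, which is omitted here for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_bottleneck = 16, 4  # hypothetical sizes; real adapters are larger

W_down = rng.normal(scale=0.1, size=(d_model, d_bottleneck))
W_up = np.zeros((d_bottleneck, d_model))  # zero init -> identity at start

def adapter(h):
    """Residual bottleneck adapter: project down, ReLU, project up,
    add back the input. Only W_down/W_up are trained per task."""
    return h + np.maximum(h @ W_down, 0.0) @ W_up

h = rng.normal(size=(3, d_model))          # a batch of hidden states
print(np.allclose(adapter(h), h))          # True: a no-op until trained
```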
Crowd Sourced Data Analysis: Mapping of Programming Concepts to Syntactical Patterns
Title | Crowd Sourced Data Analysis: Mapping of Programming Concepts to Syntactical Patterns |
Authors | Deepak Thukral, Darvesh Punia |
Abstract | Because programming concepts do not match their syntactic representations, code search is a very tedious task. For instance, in Java or C, “array” does not match [], so using “array” as a query one cannot find what one is looking for. Developers often have to search code to understand it, to reuse some part of it, or just to read it; without natural language search, they often have to scroll back and forth or use variable names as their queries. In our work, we have used Stack Overflow (SO) questions and answers to build a mapping of programming concepts to their respective natural language keywords, and then tag these natural language terms to every line of code, which can further be used for searching with natural language keywords. |
Tasks | Code Search |
Published | 2019-03-28 |
URL | http://arxiv.org/abs/1903.12495v1 |
http://arxiv.org/pdf/1903.12495v1.pdf | |
PWC | https://paperswithcode.com/paper/crowd-sourced-data-analysis-mapping-of |
Repo | |
Framework | |
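The tagging step described above can be sketched with a small concept-to-syntax dictionary; the pairs here are invented for illustration, whereas the paper mines them automatically from Stack Overflow Q&A data.

```python
# Hypothetical concept-to-syntax pairs (the paper derives these from SO)
CONCEPT_SYNTAX = {
    "array": "[]",
    "ternary operator": "?",
    "lambda": "->",
}

def tag_lines(code_lines):
    """Attach natural-language concept tags to each line of code so the
    code becomes searchable by concept keywords instead of syntax."""
    tagged = []
    for line in code_lines:
        tags = [c for c, syn in CONCEPT_SYNTAX.items() if syn in line]
        tagged.append((line, tags))
    return tagged

code = ["int[] xs = new int[3];", "int y = cond ? a : b;"]
for line, tags in tag_lines(code):
    print(line, "->", tags)
# searching for "array" would now surface the first line
```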
Annotated Guidelines and Building Reference Corpus for Myanmar-English Word Alignment
Title | Annotated Guidelines and Building Reference Corpus for Myanmar-English Word Alignment |
Authors | Nway Nway Han, Aye Thida |
Abstract | A reference corpus for word alignment is an important resource for developing and evaluating word alignment methods. For the Myanmar-English language pair, there is no reference corpus for evaluating word alignment tasks. Therefore, we created guidelines for Myanmar-English word alignment annotation, based on a contrastive analysis of the two languages, and built a Myanmar-English reference corpus consisting of verified alignments from the Myanmar portion of the Asian Language Treebank (ALT). This reference corpus contains the confidence labels sure (S) and possible (P) for word alignments, which are used for evaluating word alignment tasks. We discuss the most common linking ambiguities in order to define consistent and systematic instructions for manual word alignment. We evaluated annotator agreement on our reference corpus in terms of alignment error rate (AER) in word alignment tasks and discuss word relationships in terms of BLEU scores. |
Tasks | Word Alignment |
Published | 2019-09-25 |
URL | https://arxiv.org/abs/1909.11288v1 |
https://arxiv.org/pdf/1909.11288v1.pdf | |
PWC | https://paperswithcode.com/paper/annotated-guidelines-and-building-reference |
Repo | |
Framework | |
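The sure/possible labels above feed directly into the standard AER metric. A minimal sketch of its usual definition (Och & Ney's formulation, with alignments as sets of source-target index pairs):

```python
def aer(predicted, sure, possible):
    """Alignment Error Rate over sets of (src_idx, tgt_idx) pairs:
    AER = 1 - (|A & S| + |A & P|) / (|A| + |S|),
    where P conventionally includes S."""
    a, s, p = set(predicted), set(sure), set(possible)
    return 1.0 - (len(a & s) + len(a & p)) / (len(a) + len(s))

sure = {(0, 0), (1, 2)}
possible = sure | {(2, 1)}        # possible links include the sure ones
pred = {(0, 0), (1, 2), (2, 1)}
print(aer(pred, sure, possible))  # 0.0: every link is sure or possible
```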
Jointly Learning to Align and Translate with Transformer Models
Title | Jointly Learning to Align and Translate with Transformer Models |
Authors | Sarthak Garg, Stephan Peitz, Udhyakumar Nallasamy, Matthias Paulik |
Abstract | The state of the art in machine translation (MT) is governed by neural approaches, which typically provide superior translation accuracy over statistical approaches. However, on the closely related task of word alignment, traditional statistical word alignment models often remain the go-to solution. In this paper, we present an approach to train a Transformer model to produce both accurate translations and alignments. We extract discrete alignments from the attention probabilities learnt during regular neural machine translation model training and leverage them in a multi-task framework to optimize towards translation and alignment objectives. We demonstrate that our approach produces competitive results compared to GIZA++ trained IBM alignment models without sacrificing translation accuracy and outperforms previous attempts on Transformer model based word alignment. Finally, by incorporating IBM model alignments into our multi-task training, we report significantly better alignment accuracies compared to GIZA++ on three publicly available data sets. |
Tasks | Machine Translation, Word Alignment |
Published | 2019-09-04 |
URL | https://arxiv.org/abs/1909.02074v1 |
https://arxiv.org/pdf/1909.02074v1.pdf | |
PWC | https://paperswithcode.com/paper/jointly-learning-to-align-and-translate-with |
Repo | |
Framework | |
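The "extract discrete alignments from attention probabilities" step above can be sketched as an argmax over a target-by-source attention matrix; the matrix here is invented, whereas the paper obtains it from a supervised Transformer attention head.

```python
import numpy as np

# Toy attention matrix: rows = 3 target tokens, cols = 4 source tokens
attn = np.array([
    [0.7, 0.1, 0.1, 0.1],
    [0.1, 0.2, 0.6, 0.1],
    [0.1, 0.1, 0.2, 0.6],
])

def extract_alignments(attn):
    """Discrete alignment: link each target token to the source token
    receiving the highest attention mass, as (src, tgt) index pairs."""
    return [(int(j), t) for t, j in enumerate(attn.argmax(axis=1))]

print(extract_alignments(attn))  # [(0, 0), (2, 1), (3, 2)]
```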