Paper Group ANR 191
Token-Level Ensemble Distillation for Grapheme-to-Phoneme Conversion
Title | Token-Level Ensemble Distillation for Grapheme-to-Phoneme Conversion |
Authors | Hao Sun, Xu Tan, Jun-Wei Gan, Hongzhi Liu, Sheng Zhao, Tao Qin, Tie-Yan Liu |
Abstract | Grapheme-to-phoneme (G2P) conversion is an important task in automatic speech recognition and text-to-speech systems. Recently, G2P conversion has been viewed as a sequence-to-sequence task and modeled with RNN- or CNN-based encoder-decoder frameworks. However, previous works do not consider the practical issues of deploying a G2P model in a production system, such as how to leverage additional unlabeled data to boost accuracy and how to reduce model size for online deployment. In this work, we propose token-level ensemble distillation for G2P conversion, which can (1) boost accuracy by distilling knowledge from additional unlabeled data, and (2) reduce model size while maintaining high accuracy, both of which are practical and helpful in an online production system. We use token-level knowledge distillation, which results in better accuracy than its sequence-level counterpart. Moreover, we adopt the Transformer instead of RNN- or CNN-based models to further boost the accuracy of G2P conversion. Experiments on the publicly available CMUDict dataset and an internal English dataset demonstrate the effectiveness of our proposed method. In particular, our method achieves 19.88% WER on the CMUDict dataset, outperforming previous works by more than 4.22% WER and setting a new state of the art. |
Tasks | Speech Recognition, Text-To-Speech Synthesis |
Published | 2019-04-06 |
URL | https://arxiv.org/abs/1904.03446v3 |
https://arxiv.org/pdf/1904.03446v3.pdf | |
PWC | https://paperswithcode.com/paper/token-level-ensemble-distillation-for |
Repo | |
Framework | |
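The token-level ensemble distillation named in the abstract above can be illustrated with a small sketch. This is not the authors' implementation: the function names, the simple averaging of teachers, and the plain-list data layout are assumptions made for illustration. The idea shown is that the student is trained against the averaged per-token output distribution of several teacher models (soft targets, which can also be produced for unlabeled words), rather than against the teachers' decoded output sequences as in sequence-level distillation.

```python
import math


def average_teachers(teacher_probs):
    """Average per-token distributions over an ensemble of teachers.

    teacher_probs[k][t][v] is the probability that teacher k assigns to
    vocabulary item v at output position t (hypothetical layout).
    """
    n_models = len(teacher_probs)
    n_tokens = len(teacher_probs[0])
    vocab = len(teacher_probs[0][0])
    return [
        [sum(teacher_probs[k][t][v] for k in range(n_models)) / n_models
         for v in range(vocab)]
        for t in range(n_tokens)
    ]


def token_level_kd_loss(soft_targets, student_probs, eps=1e-12):
    """Mean cross-entropy between teacher soft targets and student outputs.

    Each output position contributes -sum_v p_teacher(v) * log p_student(v).
    """
    total = 0.0
    for target, pred in zip(soft_targets, student_probs):
        total -= sum(p * math.log(q + eps) for p, q in zip(target, pred))
    return total / len(soft_targets)


# Two toy teachers that disagree on a single output token.
teachers = [[[1.0, 0.0]], [[0.0, 1.0]]]
soft = average_teachers(teachers)          # [[0.5, 0.5]]
loss = token_level_kd_loss(soft, [[0.5, 0.5]])
```

In a real system the distributions would come from trained sequence-to-sequence models and the loss would be minimized by gradient descent; the sketch only shows how the training target is formed.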
Preference-based Multiobjective Virtual Machine Placement: A Ceteris Paribus Approach
Title | Preference-based Multiobjective Virtual Machine Placement: A Ceteris Paribus Approach |
Authors | Abdulaziz Alashaikh, Eisa Alanazi |
Abstract | This work adopts the notion of Ceteris Paribus (CP) as an interpretation of the Decision Maker's (DM) preferences and incorporates it into a constrained multiobjective problem known as virtual machine placement (VMP). VMP is an essential multiobjective problem in the design and operation of cloud data centers, concerned with placing each virtual machine on a physical machine (a server) in the data center. We analyze the effectiveness of the CP interpretation on VMP problems and propose an NSGA-II variant with which preferred solutions are returned at almost no extra time cost. |
Tasks | |
Published | 2019-04-20 |
URL | http://arxiv.org/abs/1904.09477v1 |
http://arxiv.org/pdf/1904.09477v1.pdf | |
PWC | https://paperswithcode.com/paper/preference-based-multiobjective-virtual |
Repo | |
Framework | |
Efficient Motion Planning for Automated Lane Change based on Imitation Learning and Mixed-Integer Optimization
Title | Efficient Motion Planning for Automated Lane Change based on Imitation Learning and Mixed-Integer Optimization |
Authors | Chenyang Xi, Tianyu Shi |
Abstract | Intelligent motion planning is one of the core components of automated vehicles and has received extensive interest. Traditional motion planning methods suffer from several drawbacks in terms of optimality, efficiency and generalization capability. Sampling-based methods cannot guarantee the optimality of the generated trajectories, whereas optimization-based methods are not able to perform motion planning in real time and are limited by their simplified formalization. In this work, we propose a learning-based approach to handle those shortcomings. Mixed Integer Quadratic Problem based optimization (MIQP) is used to generate the optimal lane-change trajectories, which serve as the training dataset for learning-based action generation algorithms. A hierarchical supervised learning model is devised to make the fast lane-change decision. Numerous experiments have been conducted to evaluate the optimality, efficiency, and generalization capability of the proposed approach. The experimental results indicate that the proposed model outperforms several commonly used motion planning baselines. |
Tasks | Autonomous Driving, Imitation Learning, Motion Planning |
Published | 2019-04-18 |
URL | https://arxiv.org/abs/1904.08784v3 |
https://arxiv.org/pdf/1904.08784v3.pdf | |
PWC | https://paperswithcode.com/paper/a-data-driven-approach-for-motion-planning-of |
Repo | |
Framework | |
Named Entity Recognition for Electronic Health Records: A Comparison of Rule-based and Machine Learning Approaches
Title | Named Entity Recognition for Electronic Health Records: A Comparison of Rule-based and Machine Learning Approaches |
Authors | Philip John Gorinski, Honghan Wu, Claire Grover, Richard Tobin, Conn Talbot, Heather Whalley, Cathie Sudlow, William Whiteley, Beatrice Alex |
Abstract | This work investigates multiple approaches to Named Entity Recognition (NER) for text in Electronic Health Record (EHR) data. In particular, we look into the application of (i) rule-based, (ii) deep learning and (iii) transfer learning systems for the task of NER on brain imaging reports with a focus on records from patients with stroke. We explore the strengths and weaknesses of each approach, develop rules and train on a common dataset, and evaluate each system’s performance on common test sets of Scottish radiology reports from two sources (brain imaging reports in ESS – Edinburgh Stroke Study data collected by NHS Lothian as well as radiology reports created in NHS Tayside). Our comparison shows that a hand-crafted system is the most accurate way to automatically label EHR, but machine learning approaches can provide a feasible alternative where resources for a manual system are not readily available. |
Tasks | Named Entity Recognition, Transfer Learning |
Published | 2019-03-10 |
URL | https://arxiv.org/abs/1903.03985v2 |
https://arxiv.org/pdf/1903.03985v2.pdf | |
PWC | https://paperswithcode.com/paper/named-entity-recognition-for-electronic |
Repo | |
Framework | |
A General Framework for Edited Video and Raw Video Summarization
Title | A General Framework for Edited Video and Raw Video Summarization |
Authors | Xuelong Li, Bin Zhao, Xiaoqiang Lu |
Abstract | In this paper, we build a general summarization framework for both edited video and raw video summarization. Overall, our work can be divided into three parts: 1) Four models are designed to capture the properties of video summaries, i.e., containing important people and objects (importance), being representative of the video content (representativeness), containing no similar key-shots (diversity) and having a smooth storyline (storyness). Specifically, these models are applicable to both edited videos and raw videos. 2) A comprehensive score function is built as the weighted combination of the aforementioned four models. Note that the weights of the four models in the score function, denoted as property-weights, are learned in a supervised manner. Moreover, the property-weights are learned separately for edited videos and raw videos. 3) The training set is constructed with both edited videos and raw videos in order to make up for the lack of training data. In particular, each training video is equipped with a pair of mixing-coefficients which can reduce the structure mess in the training set caused by the rough mixture. We test our framework on three datasets, including edited videos, short raw videos and long raw videos. Experimental results have verified the effectiveness of the proposed framework. |
Tasks | Video Summarization |
Published | 2019-04-24 |
URL | http://arxiv.org/abs/1904.10669v1 |
http://arxiv.org/pdf/1904.10669v1.pdf | |
PWC | https://paperswithcode.com/paper/a-general-framework-for-edited-video-and-raw |
Repo | |
Framework | |
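The score function described in the abstract above, a weighted combination of four property models with separate property-weights for edited and raw videos, can be sketched as follows. All numbers, the dictionary layout and the function names are hypothetical. In the paper the weights are learned in a supervised manner, whereas here they are fixed by hand for illustration.

```python
# The four summary properties named in the abstract.
PROPERTIES = ("importance", "representativeness", "diversity", "storyness")

# Hypothetical property-weights, one set per video type (not learned here).
PROPERTY_WEIGHTS = {
    "edited": {"importance": 0.4, "representativeness": 0.3,
               "diversity": 0.2, "storyness": 0.1},
    "raw": {"importance": 0.25, "representativeness": 0.25,
            "diversity": 0.25, "storyness": 0.25},
}


def summary_score(property_scores, property_weights):
    """Weighted combination of the four property scores of one candidate."""
    return sum(property_weights[name] * property_scores[name]
               for name in PROPERTIES)


def best_summary(candidates, video_type):
    """Pick the candidate summary with the highest combined score."""
    weights = PROPERTY_WEIGHTS[video_type]
    return max(candidates, key=lambda c: summary_score(c["scores"], weights))


candidates = [
    {"name": "A", "scores": {"importance": 1.0, "representativeness": 0.0,
                             "diversity": 1.0, "storyness": 0.0}},
    {"name": "B", "scores": {"importance": 0.0, "representativeness": 1.0,
                             "diversity": 0.0, "storyness": 1.0}},
]
chosen = best_summary(candidates, "edited")
```

With the hand-picked weights above, the same pair of candidates can rank differently for edited and raw videos, which is the motivation for keeping two weight sets.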
Video Object Segmentation and Tracking: A Survey
Title | Video Object Segmentation and Tracking: A Survey |
Authors | Rui Yao, Guosheng Lin, Shixiong Xia, Jiaqi Zhao, Yong Zhou |
Abstract | Object segmentation and object tracking are fundamental research areas in the computer vision community. Both topics face common challenges, such as occlusion, deformation, motion blur, and scale variation. The former also faces heterogeneous objects, interacting objects, edge ambiguity, and shape complexity, while the latter suffers from difficulties in handling fast motion, out-of-view targets, and real-time processing. Combining the two problems of video object segmentation and tracking (VOST) can overcome their respective difficulties and improve their performance. VOST can be applied to many practical applications such as video summarization, high definition video compression, human computer interaction, and autonomous vehicles. This article aims to provide a comprehensive review of the state-of-the-art tracking methods, classify these methods into different categories, and identify new trends. First, we provide a hierarchical categorization of existing approaches, including unsupervised VOS, semi-supervised VOS, interactive VOS, weakly supervised VOS, and segmentation-based tracking methods. Second, we provide a detailed discussion and overview of the technical characteristics of the different methods. Third, we summarize the characteristics of the related video datasets and provide a variety of evaluation metrics. Finally, we point out a set of interesting future works and draw our own conclusions. |
Tasks | Autonomous Vehicles, Object Tracking, Semantic Segmentation, Video Compression, Video Object Segmentation, Video Semantic Segmentation, Video Summarization |
Published | 2019-04-19 |
URL | http://arxiv.org/abs/1904.09172v3 |
http://arxiv.org/pdf/1904.09172v3.pdf | |
PWC | https://paperswithcode.com/paper/video-object-segmentation-and-tracking-a |
Repo | |
Framework | |
D-GAN: Deep Generative Adversarial Nets for Spatio-Temporal Prediction
Title | D-GAN: Deep Generative Adversarial Nets for Spatio-Temporal Prediction |
Authors | Divya Saxena, Jiannong Cao |
Abstract | Spatio-temporal (ST) data for urban applications, such as taxi demand, traffic flow, and regional rainfall, are inherently stochastic and unpredictable. Recently, deep learning based ST prediction models have been proposed to learn the ST characteristics of data. However, it is still very challenging (1) to adequately learn the complex and non-linear ST relationships; (2) to model the high variations in the ST data volumes, as they are inherently dynamic, changing over time (i.e., irregular) and highly influenced by many external factors, such as adverse weather, accidents, traffic control, PoI, etc.; and (3) to account for the many complicated external factors that can affect the accuracy and that are impossible to list explicitly. To handle the aforementioned issues, in this paper, we propose a novel deep generative adversarial network based model (named D-GAN) for more accurate ST prediction by implicitly learning ST feature representations in an unsupervised manner. D-GAN adopts a GAN-based structure and jointly learns generation and variational inference of data. More specifically, D-GAN consists of two major parts: (1) a deep ST feature learning network to model the ST correlations and semantic variations, and underlying factors of variations and irregularity in the data through implicit distribution modelling; (2) a fusion module to incorporate external factors for reaching a better inference. To the best of our knowledge, no prior work studies the ST prediction problem via a deep implicit generative model and in an unsupervised manner. Extensive experiments performed on two real-world datasets show that D-GAN achieves more accurate results than traditional as well as deep learning based ST prediction methods. |
Tasks | |
Published | 2019-07-19 |
URL | https://arxiv.org/abs/1907.08556v2 |
https://arxiv.org/pdf/1907.08556v2.pdf | |
PWC | https://paperswithcode.com/paper/d-gan-deep-generative-adversarial-nets-for |
Repo | |
Framework | |
RRAM based neuromorphic algorithms
Title | RRAM based neuromorphic algorithms |
Authors | Roshan Gopalakrishnan |
Abstract | This submission is a report on RRAM based neuromorphic algorithms. This report basically gives an overview of the algorithms implemented on neuromorphic hardware with crossbar array of RRAM synapses. This report mainly talks about the work on deep neural network to spiking neural network conversion and its significance. |
Tasks | |
Published | 2019-01-12 |
URL | http://arxiv.org/abs/1903.02519v1 |
http://arxiv.org/pdf/1903.02519v1.pdf | |
PWC | https://paperswithcode.com/paper/rram-based-neuromorphic-algorithms |
Repo | |
Framework | |
Uncovering Relations for Marketing Knowledge Representation
Title | Uncovering Relations for Marketing Knowledge Representation |
Authors | Somak Aditya, Atanu Sinha |
Abstract | Online behaviors of consumers and marketers generate massive marketing data, which ever more sophisticated models attempt to turn into insights that aid decisions by marketers. Yet, in making decisions human managers bring to bear marketing knowledge which resides outside of data and models. Thus, it behooves creation of an automated marketing knowledge base that can interact with data and models. Currently, marketing knowledge is dispersed in large corpora, but no definitive knowledge base for marketing exists. Out of the two broad aspects of marketing knowledge - representation and reasoning - this treatise focuses on the former. Specifically, we focus on the creation of a marketing knowledge graph from corpora, which requires identification of entities and relations. The relation identification task is particularly challenging in marketing because of the non-factoid nature of much marketing knowledge and the difficulty of forming rules that govern relations. We define a set of relations to capture marketing knowledge, propose a pipeline for creating the knowledge graph from text, and propose a rule-guided semi-supervised relation prediction algorithm to extract relations between marketing entities from sentences. |
Tasks | |
Published | 2019-12-18 |
URL | https://arxiv.org/abs/1912.08374v3 |
https://arxiv.org/pdf/1912.08374v3.pdf | |
PWC | https://paperswithcode.com/paper/uncovering-relations-for-marketing-knowledge |
Repo | |
Framework | |
Box-level Segmentation Supervised Deep Neural Networks for Accurate and Real-time Multispectral Pedestrian Detection
Title | Box-level Segmentation Supervised Deep Neural Networks for Accurate and Real-time Multispectral Pedestrian Detection |
Authors | Yanpeng Cao, Dayan Guan, Yulun Wu, Jiangxin Yang, Yanlong Cao, Michael Ying Yang |
Abstract | Effective fusion of complementary information captured by multi-modal sensors (visible and infrared cameras) enables robust pedestrian detection under various surveillance situations (e.g. daytime and nighttime). In this paper, we present a novel box-level segmentation supervised learning framework for accurate and real-time multispectral pedestrian detection by incorporating features extracted in visible and infrared channels. Specifically, our method takes pairs of aligned visible and infrared images with easily obtained bounding box annotations as input and estimates accurate prediction maps to highlight the existence of pedestrians. It offers two major advantages over the existing anchor box based multispectral detection methods. Firstly, it overcomes the hyperparameter setting problem that occurs during the training phase of anchor box based detectors and can obtain more accurate detection results, especially for small and occluded pedestrian instances. Secondly, it is capable of generating accurate detection results using small-size input images, leading to improved computational efficiency for real-time autonomous driving applications. Experimental results on the KAIST multispectral dataset show that our proposed method outperforms state-of-the-art approaches in terms of both accuracy and speed. |
Tasks | Autonomous Driving, Pedestrian Detection |
Published | 2019-02-14 |
URL | http://arxiv.org/abs/1902.05291v1 |
http://arxiv.org/pdf/1902.05291v1.pdf | |
PWC | https://paperswithcode.com/paper/box-level-segmentation-supervised-deep-neural |
Repo | |
Framework | |
Fundamental aspects of noise in analog-hardware neural networks
Title | Fundamental aspects of noise in analog-hardware neural networks |
Authors | Nadezhda Semenova, Xavier Porte, Louis Andreoli, Maxime Jacquot, Laurent Larger, Daniel Brunner |
Abstract | We study and analyze the fundamental aspects of noise propagation in recurrent as well as deep, multi-layer networks. The main focus of our study is neural networks in analogue hardware, yet the methodology provides insight for networks in general. The system under study consists of noisy linear nodes, and we investigate the signal-to-noise ratio at the network’s outputs, which is the upper limit to such a system’s computing accuracy. We consider additive and multiplicative noise which can be purely local as well as correlated across populations of neurons. This covers the chief internal perturbations of hardware networks. Noise amplitudes were obtained from a physically implemented recurrent neural network and therefore correspond to a real-world system. Analytic solutions agree exceptionally well with numerical data, enabling clear identification of the most critical components and aspects for noise management. Focusing on linear nodes isolates the impact of network connections and allows us to derive strategies for mitigating noise. Our work is the starting point in addressing this aspect of analogue neural networks, and our results identify notoriously sensitive points while simultaneously highlighting the robustness of such computational systems. |
Tasks | |
Published | 2019-07-21 |
URL | https://arxiv.org/abs/1907.09002v1 |
https://arxiv.org/pdf/1907.09002v1.pdf | |
PWC | https://paperswithcode.com/paper/fundamental-aspects-of-noise-in-analog |
Repo | |
Framework | |
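The distinction the abstract above draws between purely local noise and noise correlated across a population of neurons can be illustrated with a small Monte Carlo sketch. It is not taken from the paper: the function name, the noise amplitude, and the choice of a single averaging output node with additive Gaussian noise only (multiplicative noise is omitted) are all assumptions. It relies on the standard fact that averaging independent noise over N nodes reduces its standard deviation by a factor of sqrt(N), while a perturbation shared by all nodes is not reduced by averaging.

```python
import random
import statistics


def output_snr(n_nodes, signal=1.0, sigma=0.1, correlated=False,
               trials=2000, seed=0):
    """Estimate the signal-to-noise ratio at an output that averages
    n_nodes noisy linear nodes, each carrying the same signal.

    correlated=False: every node draws its own additive noise (local noise).
    correlated=True:  all nodes share one noise sample per trial.
    """
    rng = random.Random(seed)
    outputs = []
    for _ in range(trials):
        if correlated:
            shared = rng.gauss(0.0, sigma)
            noise = [shared] * n_nodes
        else:
            noise = [rng.gauss(0.0, sigma) for _ in range(n_nodes)]
        outputs.append(sum(signal + e for e in noise) / n_nodes)
    return statistics.mean(outputs) / statistics.stdev(outputs)


snr_single = output_snr(1)
snr_local = output_snr(100)                        # local noise averages out
snr_correlated = output_snr(100, correlated=True)  # shared noise does not
```

Under these assumptions the SNR with local noise grows roughly tenfold when going from 1 to 100 nodes, whereas with fully correlated noise it stays at the single-node level.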
A Robust Comparison of the KDDCup99 and NSL-KDD IoT Network Intrusion Detection Datasets Through Various Machine Learning Algorithms
Title | A Robust Comparison of the KDDCup99 and NSL-KDD IoT Network Intrusion Detection Datasets Through Various Machine Learning Algorithms |
Authors | Suchet Sapre, Pouyan Ahmadi, Khondkar Islam |
Abstract | In recent years, as intrusion attacks on IoT networks have grown exponentially, there is an immediate need for sophisticated intrusion detection systems (IDSs). A vast majority of current IDSs are data-driven, which means that one of the most important aspects of this area of research is the quality of the data acquired from IoT network traffic. Two of the most cited intrusion detection datasets are the KDDCup99 and the NSL-KDD. The main goal of our project was to conduct a robust comparison of both datasets by evaluating the performance of various Machine Learning (ML) classifiers trained on them with a larger set of classification metrics than previous researchers. From our research, we were able to conclude that the NSL-KDD dataset is of a higher quality than the KDDCup99 dataset as the classifiers trained on it were on average 20.18% less accurate. This is because the classifiers trained on the KDDCup99 dataset exhibited a bias towards the redundancies within it, allowing them to achieve higher accuracies. |
Tasks | Intrusion Detection, Network Intrusion Detection |
Published | 2019-12-31 |
URL | https://arxiv.org/abs/1912.13204v1 |
https://arxiv.org/pdf/1912.13204v1.pdf | |
PWC | https://paperswithcode.com/paper/a-robust-comparison-of-the-kddcup99-and-nsl |
Repo | |
Framework | |
Attribute Aware Pooling for Pedestrian Attribute Recognition
Title | Attribute Aware Pooling for Pedestrian Attribute Recognition |
Authors | Kai Han, Yunhe Wang, Han Shu, Chuanjian Liu, Chunjing Xu, Chang Xu |
Abstract | This paper extends the strength of deep convolutional neural networks (CNNs) to the pedestrian attribute recognition problem by devising a novel attribute aware pooling algorithm. Existing vanilla CNNs cannot be straightforwardly applied to handle multi-attribute data because of the larger label space as well as the attribute entanglement and correlations. We tackle these challenges that hamper the development of CNNs for multi-attribute classification by fully exploiting the correlation between different attributes. A multi-branch architecture is adopted for focusing on attributes at different regions. Besides the prediction based on each branch itself, context information of each branch is employed for the decision as well. The attribute aware pooling is developed to integrate both kinds of information. Therefore, attributes which are indistinct or tangled with others can be accurately recognized by exploiting the context information. Experiments on benchmark datasets demonstrate that the proposed pooling method appropriately explores and exploits the correlations between attributes for pedestrian attribute recognition. |
Tasks | Pedestrian Attribute Recognition |
Published | 2019-07-27 |
URL | https://arxiv.org/abs/1907.11837v1 |
https://arxiv.org/pdf/1907.11837v1.pdf | |
PWC | https://paperswithcode.com/paper/attribute-aware-pooling-for-pedestrian |
Repo | |
Framework | |
Unified Acceleration of High-Order Algorithms under Hölder Continuity and Uniform Convexity
Title | Unified Acceleration of High-Order Algorithms under Hölder Continuity and Uniform Convexity |
Authors | Chaobing Song, Yong Jiang, Yi Ma |
Abstract | In this paper, through a very intuitive vanilla proximal method perspective, we derive accelerated high-order optimization algorithms for minimizing a convex function that has Hölder continuous derivatives. In this general convex setting, we propose a concise unified acceleration framework (UAF), which reconciles the two different high-order acceleration approaches, one by Nesterov and Baes [29, 3, 33] and one by Monteiro and Svaiter [25]. As a result, the UAF unifies the high-order acceleration instances [29, 3, 33, 15, 16, 25, 19, 6, 14] of the two approaches by only two problem-related parameters and two additional parameters for framework design. Furthermore, the UAF (and its analysis) is the first approach to make high-order methods applicable for high-order smoothness conditions with respect to non-Euclidean norms. If the function is further uniformly convex, we propose a general restart scheme for the UAF. The iteration complexities of instances of both the UAF and the restarted UAF match existing lower bounds in most important cases [2, 16]. For practical implementation, we introduce a new and effective heuristic that significantly simplifies the binary search procedure required by the framework. We use experiments to verify the effectiveness of the heuristic and demonstrate clear and consistent advantages of high-order acceleration methods over first-order ones, in terms of run-time complexity. Finally, the UAF is proposed directly in the general composite convex setting, thus showing that the existing high-order algorithms [29, 3, 33, 16, 6, 14] can be naturally extended to the general composite convex setting. |
Tasks | |
Published | 2019-06-03 |
URL | https://arxiv.org/abs/1906.00582v2 |
https://arxiv.org/pdf/1906.00582v2.pdf | |
PWC | https://paperswithcode.com/paper/190600582 |
Repo | |
Framework | |
Network Intrusion Detection based on LSTM and Feature Embedding
Title | Network Intrusion Detection based on LSTM and Feature Embedding |
Authors | Hyeokmin Gwon, Chungjun Lee, Rakun Keum, Heeyoul Choi |
Abstract | The growing number of network devices and services has led to increasing demand for protective measures as hackers launch attacks to paralyze or steal information from victim systems. An Intrusion Detection System (IDS) is one of the essential elements of network perimeter security; it detects attacks by inspecting network traffic packets or operating system logs. While existing works have demonstrated the effectiveness of various machine learning techniques, only a few of them have utilized the time-series information of network traffic data. Also, categorical information has not been included in neural network based approaches. In this paper, we propose network intrusion detection models based on sequential information, using a long short-term memory (LSTM) network, and categorical information, using the embedding technique. We have evaluated the models on UNSW-NB15, which is a comprehensive network traffic dataset. The experimental results confirm that the proposed method improves performance, achieving a binary classification accuracy of 99.72%. |
Tasks | Intrusion Detection, Network Intrusion Detection, Time Series |
Published | 2019-11-26 |
URL | https://arxiv.org/abs/1911.11552v1 |
https://arxiv.org/pdf/1911.11552v1.pdf | |
PWC | https://paperswithcode.com/paper/network-intrusion-detection-based-on-lstm-and |
Repo | |
Framework | |
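The feature-embedding idea in the abstract above, turning categorical traffic fields into dense vectors so they can be fed to a sequence model together with numeric fields, can be sketched in plain Python. This is only an illustration of the input encoding, not the authors' model: the field names (`proto`, `dur`), the embedding size and the random initialization are hypothetical, and the LSTM that would consume the encoded sequence is left out. In a trained model the embedding vectors would be learned jointly with the network.

```python
import random


def build_embedding(vocabulary, dim, seed=0):
    """Create a lookup table mapping each categorical value to a vector.

    Vectors are randomly initialized here; a real model would learn them.
    """
    rng = random.Random(seed)
    return {token: [rng.uniform(-0.1, 0.1) for _ in range(dim)]
            for token in vocabulary}


def encode_record(record, embeddings, numeric_fields):
    """Concatenate numeric features with the embeddings of categorical ones."""
    vector = [float(record[field]) for field in numeric_fields]
    for field, table in embeddings.items():
        vector.extend(table[record[field]])
    return vector


def encode_sequence(records, embeddings, numeric_fields):
    """Encode consecutive traffic records as one input sequence,
    one vector per time step."""
    return [encode_record(r, embeddings, numeric_fields) for r in records]


# Hypothetical records with one numeric and one categorical field.
embeddings = {"proto": build_embedding(["tcp", "udp"], dim=3)}
records = [
    {"dur": 0.5, "proto": "tcp"},
    {"dur": 1.0, "proto": "udp"},
    {"dur": 2, "proto": "tcp"},
]
sequence = encode_sequence(records, embeddings, numeric_fields=["dur"])
```

Each time step here becomes a 4-dimensional vector (one numeric value plus a 3-dimensional embedding), which is the shape of input a recurrent layer would then process in order.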