January 25, 2020


Paper Group ANR 1646

Supervised Initialization of LSTM Networks for Fundamental Frequency Detection in Noisy Speech Signals. Application of Fuzzy Clustering for Text Data Dimensionality Reduction. Towards Fair and Decentralized Privacy-Preserving Deep Learning. StartNet: Online Detection of Action Start in Untrimmed Videos. Saliency Prediction on Omnidirectional Images …

Supervised Initialization of LSTM Networks for Fundamental Frequency Detection in Noisy Speech Signals

Title Supervised Initialization of LSTM Networks for Fundamental Frequency Detection in Noisy Speech Signals
Authors Marvin Coto-Jimenez
Abstract Fundamental frequency is one of the most important parameters of human speech, of importance for the classification of accent, gender, speaking style, speaker identity, and age, among others. The proper detection of this parameter remains an important challenge for severely degraded signals. In previous references for detecting fundamental frequency in noisy speech using deep learning, networks such as Long Short-Term Memory (LSTM) have been initialized with random weights and then trained with a back-propagation through time algorithm. In this work, a proposal for a more efficient initialization, based on supervised training using an auto-associative network, is presented. This initialization is a better starting point for the detection of fundamental frequency in noisy speech. The advantages of this initialization are noticeable using objective measures for the accuracy of the detection and for the training of the networks, in the presence of additive white noise at different signal-to-noise levels.
Tasks Speaker Identification
Published 2019-11-11
URL https://arxiv.org/abs/1911.04580v1
PDF https://arxiv.org/pdf/1911.04580v1.pdf
PWC https://paperswithcode.com/paper/supervised-initialization-of-lstm-networks
Repo
Framework

Application of Fuzzy Clustering for Text Data Dimensionality Reduction

Title Application of Fuzzy Clustering for Text Data Dimensionality Reduction
Authors Amir Karami
Abstract Large textual corpora are often represented by the document-term frequency matrix, whose elements are the frequencies of terms; however, this matrix has two problems: sparsity and high dimensionality. Four dimension reduction strategies are used to address these problems. Of the four strategies, unsupervised feature transformation (UFT) is a popular and efficient strategy to map the terms to a new basis in the document-term frequency matrix. Although several UFT-based methods have been developed, fuzzy clustering has not been considered for dimensionality reduction. This research explores fuzzy clustering as a new UFT-based approach to create a lower-dimensional representation of documents. Performance of fuzzy clustering with and without using global term weighting methods is shown to exceed principal component analysis and singular value decomposition. This study also explores the effect of applying different fuzzifier values on fuzzy clustering for dimensionality reduction purposes.
Tasks Dimensionality Reduction
Published 2019-09-21
URL https://arxiv.org/abs/1909.10881v1
PDF https://arxiv.org/pdf/1909.10881v1.pdf
PWC https://paperswithcode.com/paper/application-of-fuzzy-clustering-for-text-data
Repo
Framework
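
The idea of using fuzzy cluster memberships as a lower-dimensional document representation can be illustrated with a small sketch. The code below is a plain fuzzy c-means loop written for this page, not the author's implementation; the function name, the toy document-term matrix, and the choice of evenly spaced documents as initial centers are all assumptions made for illustration.

```python
def fuzzy_c_means(docs, n_clusters, fuzzifier=2.0, n_iter=30):
    """Return (memberships, centers) for rows of a document-term matrix.

    memberships[i][j] is the degree to which document i belongs to cluster j;
    each row sums to 1 and can serve as an n_clusters-dimensional representation.
    """
    n, d = len(docs), len(docs[0])
    # Assumption: deterministic start from evenly spaced documents.
    centers = [list(docs[(j * n) // n_clusters]) for j in range(n_clusters)]
    power = 2.0 / (fuzzifier - 1.0)
    memberships = []
    for _ in range(n_iter):
        # Membership update: closer centers receive higher membership.
        memberships = []
        for x in docs:
            dist = [
                max(1e-12, sum((x[k] - c[k]) ** 2 for k in range(d)) ** 0.5)
                for c in centers
            ]
            memberships.append([
                1.0 / sum((dist[j] / dist[l]) ** power for l in range(n_clusters))
                for j in range(n_clusters)
            ])
        # Center update: mean of documents weighted by membership ** fuzzifier.
        centers = []
        for j in range(n_clusters):
            w = [row[j] ** fuzzifier for row in memberships]
            total = sum(w)
            centers.append(
                [sum(w[i] * docs[i][k] for i in range(n)) / total for k in range(d)]
            )
    return memberships, centers


# Toy document-term frequency matrix: 4 documents, 4 terms (hypothetical data).
docs = [[3, 2, 0, 0], [3, 3, 0, 0], [0, 0, 3, 2], [0, 0, 3, 3]]
reduced, centers = fuzzy_c_means(docs, n_clusters=2)
for row in reduced:
    print([round(v, 3) for v in row])
```

Each 4-term document is mapped to 2 membership values. Raising `fuzzifier` makes memberships softer, which is the parameter the abstract refers to when it mentions different fuzzifier values.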

Towards Fair and Decentralized Privacy-Preserving Deep Learning

Title Towards Fair and Decentralized Privacy-Preserving Deep Learning
Authors Lingjuan Lyu, Jiangshan Yu, Karthik Nandakumar, Yitong Li, Xingjun Ma, Jiong Jin
Abstract In current deep learning paradigms, the standalone framework tends to result in overfitting and low utility. This problem can be addressed by either a centralized framework that deploys a central server to train a global model on the joint data from all parties, or a distributed framework that leverages a parameter server to aggregate local model updates. Server-based frameworks unfortunately suffer from the single-point-of-failure problem, and the decentralized framework is born to be resistant to it. However, all the existing collaborative learning frameworks (distributed or decentralized) have overlooked an important aspect of participation: fairness. In particular, all parties can get similar models, even the ones merely making marginal contribution with low-quality data. To address this issue, we propose a decentralized privacy-preserving deep learning framework called DPPDL. It makes the first-ever investigation on the collaborative fairness in deep learning, and proposes two novel strategies to guarantee both fairness and privacy. We experimentally demonstrate that, on benchmark image datasets, fairness, privacy and accuracy in collaborative deep learning can now be effectively achieved at the same time by our proposed DPPDL. Moreover, it provides a viable solution to detect and reduce the impact of low-quality parties in the collaborative learning system.
Tasks Privacy Preserving Deep Learning
Published 2019-06-04
URL https://arxiv.org/abs/1906.01167v2
PDF https://arxiv.org/pdf/1906.01167v2.pdf
PWC https://paperswithcode.com/paper/towards-fair-and-decentralized-privacy
Repo
Framework

StartNet: Online Detection of Action Start in Untrimmed Videos

Title StartNet: Online Detection of Action Start in Untrimmed Videos
Authors Mingfei Gao, Mingze Xu, Larry S. Davis, Richard Socher, Caiming Xiong
Abstract We propose StartNet to address Online Detection of Action Start (ODAS) where action starts and their associated categories are detected in untrimmed, streaming videos. Previous methods aim to localize action starts by learning feature representations that can directly separate the start point from its preceding background. It is challenging due to the subtle appearance difference near the action starts and the lack of training data. Instead, StartNet decomposes ODAS into two stages: action classification (using ClsNet) and start point localization (using LocNet). ClsNet focuses on per-frame labeling and predicts action score distributions online. Based on the predicted action scores of the past and current frames, LocNet conducts class-agnostic start detection by optimizing long-term localization rewards using policy gradient methods. The proposed framework is validated on two large-scale datasets, THUMOS’14 and ActivityNet. The experimental results show that StartNet significantly outperforms the state-of-the-art by 15%-30% p-mAP under the offset tolerance of 1-10 seconds on THUMOS’14, and achieves comparable performance on ActivityNet with 10 times smaller time offset.
Tasks Action Classification, Policy Gradient Methods
Published 2019-03-23
URL http://arxiv.org/abs/1903.09868v1
PDF http://arxiv.org/pdf/1903.09868v1.pdf
PWC https://paperswithcode.com/paper/startnet-online-detection-of-action-start-in
Repo
Framework

Saliency Prediction on Omnidirectional Images with Generative Adversarial Imitation Learning

Title Saliency Prediction on Omnidirectional Images with Generative Adversarial Imitation Learning
Authors Mai Xu, Li Yang, Xiaoming Tao, Yiping Duan, Zulin Wang
Abstract When watching omnidirectional images (ODIs), subjects can access different viewports by moving their heads. Therefore, it is necessary to predict subjects’ head fixations on ODIs. Inspired by generative adversarial imitation learning (GAIL), this paper proposes a novel approach to predict saliency of head fixations on ODIs, named SalGAIL. First, we establish a dataset for attention on ODIs (AOI). In contrast to traditional datasets, our AOI dataset is large-scale and contains the head fixations of 30 subjects viewing 600 ODIs. Next, we mine our AOI dataset and determine three findings: (1) head fixations are consistent among subjects, and the consistency grows as the number of subjects increases; (2) head fixations exhibit a front center bias (FCB); and (3) the magnitude of head movement is similar across subjects. According to these findings, our SalGAIL approach applies deep reinforcement learning (DRL) to predict the head fixations of one subject, in which GAIL learns the reward of DRL, rather than the traditional human-designed reward. Then, multi-stream DRL is developed to yield the head fixations of different subjects, and the saliency map of an ODI is generated by convolving the predicted head fixations. Finally, experiments validate the effectiveness of our approach in predicting saliency maps of ODIs, significantly better than 10 state-of-the-art approaches.
Tasks Imitation Learning, Saliency Prediction
Published 2019-04-15
URL http://arxiv.org/abs/1904.07080v1
PDF http://arxiv.org/pdf/1904.07080v1.pdf
PWC https://paperswithcode.com/paper/saliency-prediction-on-omnidirectional-images
Repo
Framework

A Novel Deep Learning Based Approach for Left Ventricle Segmentation in Echocardiography: MFP-Unet

Title A Novel Deep Learning Based Approach for Left Ventricle Segmentation in Echocardiography: MFP-Unet
Authors Shakiba Moradi, Mostafa Ghelich-Oghli, Azin Alizadehasl, Isaac Shiri, Niki Oveisi, Mehrdad Oveisi, Majid Maleki, Jan Dhooge
Abstract Segmentation of the left ventricle (LV) is a crucial step for quantitative measurements such as area, volume, and ejection fraction. However, automatic LV segmentation in 2D echocardiographic images is a challenging task due to ill-defined borders and operator dependence issues (insufficient reproducibility). U-net, which is a well-known architecture in medical image segmentation, addressed this problem through an encoder-decoder path. Despite outstanding overall performance, U-net ignores the contribution of all semantic strengths in the segmentation procedure. In the present study, we have proposed a novel architecture to tackle this drawback. Feature maps in all levels of the decoder path of U-net are concatenated, their depths are equalized, and they are up-sampled to a fixed dimension. This stack of feature maps would be the input of the semantic segmentation layer. The proposed network yielded state-of-the-art results when compared with results from U-net, dilated U-net, and DeepLabv3, using the same dataset. An average Dice Metric (DM) of 0.945, Hausdorff Distance (HD) of 1.62, Jaccard Coefficient (JC) of 0.97, and Mean Absolute Distance (MAD) of 1.32 are achieved. The correlation graph, Bland-Altman analysis, and box plot showed a great agreement between automatic and manually calculated volume, area, and length.
Tasks Medical Image Segmentation, Semantic Segmentation
Published 2019-06-25
URL https://arxiv.org/abs/1906.10486v2
PDF https://arxiv.org/pdf/1906.10486v2.pdf
PWC https://paperswithcode.com/paper/mfp-unet-a-novel-deep-learning-based-approach
Repo
Framework
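
For readers unfamiliar with the overlap metrics named in the abstract, the following sketch shows how the Dice Metric and Jaccard Coefficient are commonly computed for a pair of binary masks. It is a generic illustration with made-up masks, not the evaluation code of the paper; the function name and the convention for two empty masks are assumptions.

```python
def dice_and_jaccard(pred, truth):
    """Return (Dice, Jaccard) for two binary masks given as nested lists of 0/1."""
    p = [v for row in pred for v in row]
    t = [v for row in truth for v in row]
    if len(p) != len(t):
        raise ValueError("masks must have the same number of pixels")
    inter = sum(1 for a, b in zip(p, t) if a and b)  # pixels set in both masks
    p_sum = sum(1 for a in p if a)
    t_sum = sum(1 for b in t if b)
    union = p_sum + t_sum - inter
    if union == 0:
        # Assumption: two empty masks count as perfect agreement.
        return 1.0, 1.0
    dice = 2.0 * inter / (p_sum + t_sum)
    jaccard = inter / union
    return dice, jaccard


# Hypothetical 3x3 masks: automatic segmentation vs. manual annotation.
pred = [[0, 1, 1], [0, 1, 1], [0, 0, 0]]
truth = [[0, 1, 1], [0, 1, 0], [0, 0, 0]]
dice, jaccard = dice_and_jaccard(pred, truth)
print(round(dice, 3), round(jaccard, 3))
```

Here the masks share 3 pixels, with 4 pixels predicted and 3 annotated, so Dice is 6/7 and Jaccard is 3/4.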

Interpretable Discriminative Dimensionality Reduction and Feature Selection on the Manifold

Title Interpretable Discriminative Dimensionality Reduction and Feature Selection on the Manifold
Authors Babak Hosseini, Barbara Hammer
Abstract Dimensionality reduction (DR) on the manifold includes effective methods which project the data from an implicit relational space onto a vectorial space. Regardless of the achievements in this area, these algorithms suffer from the lack of interpretation of the projection dimensions. Therefore, it is often difficult to explain the physical meaning behind the embedding dimensions. In this research, we propose the interpretable kernel DR algorithm (I-KDR) as a new algorithm which maps the data from the feature space to a lower dimensional space where the classes are more condensed with less overlapping. Besides, the algorithm creates the dimensions upon local contributions of the data samples, which makes it easier to interpret them by class labels. Additionally, we efficiently fuse the DR with feature selection task to select the most relevant features of the original space to the discriminative objective. Based on the empirical evidence, I-KDR provides better interpretations for embedding dimensions as well as higher discriminative performance in the embedded space compared to the state-of-the-art and popular DR algorithms.
Tasks Dimensionality Reduction, Feature Selection
Published 2019-09-19
URL https://arxiv.org/abs/1909.09218v1
PDF https://arxiv.org/pdf/1909.09218v1.pdf
PWC https://paperswithcode.com/paper/interpretable-discriminative-dimensionality
Repo
Framework

Deep Convolutional Neural Network-Based Autonomous Drone Navigation

Title Deep Convolutional Neural Network-Based Autonomous Drone Navigation
Authors K. Amer, M. Samy, M. Shaker, M. ElHelw
Abstract This paper presents a novel approach for aerial drone autonomous navigation along predetermined paths using only visual input from an onboard camera and without reliance on a Global Positioning System (GPS). It is based on using a deep Convolutional Neural Network (CNN) combined with a regressor to output the drone steering commands. Furthermore, multiple auxiliary navigation paths that form a navigation envelope are used for data augmentation to make the system adaptable to real-life deployment scenarios. The approach is suitable for automating drone navigation in applications that exhibit regular trips or visits to the same locations, such as environmental and desertification monitoring, parcel/aid delivery and drone-based wireless internet delivery. In this case, the proposed algorithm replaces human operators, enhances accuracy of GPS-based map navigation, alleviates problems related to GPS-spoofing and enables navigation in GPS-denied environments. Our system is tested in two scenarios using the Unreal Engine-based AirSim plugin for drone simulation, with promising results of average cross track distance less than 1.4 meters and mean waypoints minimum distance of less than 1 meter.
Tasks Autonomous Navigation, Data Augmentation, Drone navigation
Published 2019-05-05
URL https://arxiv.org/abs/1905.01657v1
PDF https://arxiv.org/pdf/1905.01657v1.pdf
PWC https://paperswithcode.com/paper/deep-convolutional-neural-network-based
Repo
Framework

Gathering Cyber Threat Intelligence from Twitter Using Novelty Classification

Title Gathering Cyber Threat Intelligence from Twitter Using Novelty Classification
Authors Ba Dung Le, Guanhua Wang, Mehwish Nasim, Ali Babar
Abstract Protecting organizations from Cyber exploits requires timely intelligence about Cyber vulnerabilities and attacks, referred to as threats. Cyber threat intelligence can be extracted from various sources including social media platforms where users publish the threat information in real time. Gathering Cyber threat intelligence from social media sites is a time-consuming task for security analysts that can delay timely response to emerging Cyber threats. We propose a framework for automatically gathering Cyber threat intelligence from Twitter by using a novelty detection model. Our model learns the features of Cyber threat intelligence from the threat descriptions published in public repositories such as Common Vulnerabilities and Exposures (CVE) and classifies a new unseen tweet as either normal or anomalous to Cyber threat intelligence. We evaluate our framework using a purpose-built data set of tweets from 50 influential Cyber security related accounts over twelve months (in 2018). Our classifier achieves an F1-score of 0.643 for classifying Cyber threat tweets and outperforms several baselines including binary classification models. Our analysis of the classification results suggests that Cyber threat relevant tweets on Twitter do not often include the CVE identifier of the related threats. Hence, it would be valuable to collect these tweets and associate them with the related CVE identifier for cyber security applications.
Tasks
Published 2019-07-03
URL https://arxiv.org/abs/1907.01755v2
PDF https://arxiv.org/pdf/1907.01755v2.pdf
PWC https://paperswithcode.com/paper/gathering-cyber-threat-intelligence-from
Repo
Framework

IoU-balanced Loss Functions for Single-stage Object Detection

Title IoU-balanced Loss Functions for Single-stage Object Detection
Authors Shengkai Wu, Xiaoping Li
Abstract Single-stage detectors are efficient. However, we find that the loss functions adopted by single-stage detectors are sub-optimal for accurate localization. The standard cross entropy loss for classification is independent of the localization task and drives all the positive examples to learn as high a classification score as possible regardless of localization accuracy during training. As a result, there will be detections that have a high classification score but low IoU, or a low classification score but high IoU. The detections with low classification score but high IoU will be suppressed by the ones with high classification score but low IoU during NMS, hurting the localization accuracy. For the standard smooth L1 loss, the gradient is dominated by the outliers that have poor localization accuracy, and this is harmful for accurate localization. In this work, we propose IoU-balanced loss functions that consist of IoU-balanced classification loss and IoU-balanced localization loss to solve the above problems. The IoU-balanced classification loss focuses more attention on positive examples with high IoU and can enhance the correlation between the classification and localization tasks. The IoU-balanced localization loss decreases the gradient of the examples with low IoU and increases the gradient of examples with high IoU, which can improve the localization accuracy of models. Sufficient studies on MS COCO demonstrate that both IoU-balanced classification loss and IoU-balanced localization loss can bring substantial improvement for the single-stage detectors. Without bells and whistles, the proposed methods can improve AP by 1.1% for single-stage detectors, and the improvement for AP at higher IoU thresholds is especially large, such as 2.3% for AP90. The source code will be made available.
Tasks Object Detection
Published 2019-08-15
URL https://arxiv.org/abs/1908.05641v1
PDF https://arxiv.org/pdf/1908.05641v1.pdf
PWC https://paperswithcode.com/paper/iou-balanced-loss-functions-for-single-stage
Repo
Framework
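
The suppression problem described in the abstract can be reproduced with a few lines of code. The sketch below implements box IoU and a standard greedy NMS, then shows a loosely localized box with a high score suppressing a tightly localized box with a lower score. It illustrates the motivation only and does not implement the proposed loss functions; the boxes, scores, and the 0.5 threshold are hypothetical values.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, threshold=0.5):
    """Keep boxes in descending score order, dropping any that overlap a kept box."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= threshold for k in keep):
            keep.append(i)
    return keep


ground_truth = (0, 0, 10, 10)
boxes = [
    (2, 0, 12, 10),  # loosely localized detection
    (0, 0, 10, 11),  # tightly localized detection
]
scores = [0.9, 0.6]  # classification scores, uncorrelated with localization

kept = greedy_nms(boxes, scores)
for i, box in enumerate(boxes):
    print(i, round(iou(box, ground_truth), 3), "kept" if i in kept else "suppressed")
```

Box 1 overlaps the ground truth far better than box 0, yet it is removed because NMS ranks by classification score alone. A loss that ties the score to IoU, as the abstract proposes, is meant to make this ranking agree with localization quality.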

Feature Augmentation Improves Anomalous Change Detection for Human Activity Identification in Synthetic Aperture Radar Imagery

Title Feature Augmentation Improves Anomalous Change Detection for Human Activity Identification in Synthetic Aperture Radar Imagery
Authors Hannah J. Murphy, Christopher X. Ren, Matthew T. Calef
Abstract Anomalous change detection (ACD) methods separate common, uninteresting changes from rare, significant changes in co-registered images collected at different points in time. In this paper we evaluate methods to improve the performance of ACD in detecting human activity in SAR imagery using outdoor music festivals as a target. Our results show that the low dimensionality of SAR data leads to poor performance of ACD when compared to simpler methods such as image differencing, but augmenting the dimensionality of our input feature space by incorporating local spatial information leads to enhanced performance.
Tasks
Published 2019-12-07
URL https://arxiv.org/abs/1912.03539v1
PDF https://arxiv.org/pdf/1912.03539v1.pdf
PWC https://paperswithcode.com/paper/feature-augmentation-improves-anomalous
Repo
Framework

Knowledge Discovery In Nanophotonics Using Geometric Deep Learning

Title Knowledge Discovery In Nanophotonics Using Geometric Deep Learning
Authors Yashar Kiarashinejad, Mohammadreza Zandehshahvar, Sajjad Abdollahramezani, Omid Hemmatyar, Reza Pourabolghasem, Ali Adibi
Abstract We present here a new approach for using the intelligence aspects of artificial intelligence for knowledge discovery rather than device optimization in electromagnetic (EM) nanostructures. This approach uses training data obtained through full-wave EM simulations of a series of nanostructures to train geometric deep learning algorithms to assess the range of feasible responses as well as the feasibility of a desired response from a class of EM nanostructures. To facilitate the knowledge discovery and reduce the computational complexity, our approach combines the dimensionality reduction technique (using an autoencoder) with convex-hull and one-class support-vector-machine (SVM) algorithms to find the range of the feasible responses in the latent (or the reduced) response space of the EM nanostructure. We show that by using a small set of training instances (compared to all possible structures), our approach can provide better than 95% accuracy in assessing the feasibility of a given response. More importantly, the one-class SVM algorithm can be trained to provide the degree of feasibility (or unfeasibility) of a response from a given nanostructure. This important information can be used to modify the initial structure to an alternative one that can enable an initially unfeasible response. To show the applicability of our approach, we apply it to two important classes of binary metasurfaces (MSs), formed by an array of plasmonic nanostructures, and periodic MSs formed by an array of dielectric nanopillars. In addition to theoretical results, we show the experimental results obtained by fabricating several MSs of the second class. Our theoretical and experimental results confirm the unique features of this approach for knowledge discovery in EM nanostructures.
Tasks Dimensionality Reduction
Published 2019-09-16
URL https://arxiv.org/abs/1909.07330v1
PDF https://arxiv.org/pdf/1909.07330v1.pdf
PWC https://paperswithcode.com/paper/knowledge-discovery-in-nanophotonics-using
Repo
Framework

AdaBits: Neural Network Quantization with Adaptive Bit-Widths

Title AdaBits: Neural Network Quantization with Adaptive Bit-Widths
Authors Qing Jin, Linjie Yang, Zhenyu Liao
Abstract Deep neural networks with adaptive configurations have gained increasing attention due to the instant and flexible deployment of these models on platforms with different resource budgets. In this paper, we investigate a novel option to achieve this goal by enabling adaptive bit-widths of weights and activations in the model. We first examine the benefits and challenges of training quantized models with adaptive bit-widths, and then experiment with several approaches including direct adaptation, progressive training and joint training. We discover that joint training is able to produce performance on the adaptive model comparable to that of individual models. We further propose a new technique named Switchable Clipping Level (S-CL) to improve quantized models at the lowest bit-width. With our proposed techniques applied to a range of models including MobileNet-V1/V2 and ResNet-50, we demonstrate that bit-width of weights and activations is a new option for adaptively executable deep neural networks, offering a distinct opportunity for improved accuracy-efficiency trade-off as well as instant adaptation according to the platform constraints in real-world applications.
Tasks Quantization
Published 2019-12-20
URL https://arxiv.org/abs/1912.09666v2
PDF https://arxiv.org/pdf/1912.09666v2.pdf
PWC https://paperswithcode.com/paper/adabits-neural-network-quantization-with
Repo
Framework
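
To make the notion of a bit-width concrete, the sketch below quantizes the same set of weights at several bit-widths with a plain uniform quantizer and reports the resulting error. This is a generic textbook scheme chosen for illustration, not the quantizer or the Switchable Clipping Level technique from the paper; the function names, the fixed `clip` value, and the weights are assumptions.

```python
def quantize_uniform(values, bits, clip=1.0):
    """Clamp values to [-clip, clip] and round them onto 2**bits evenly spaced levels."""
    steps = 2 ** bits - 1  # number of intervals between the 2**bits levels
    out = []
    for v in values:
        clamped = max(-clip, min(clip, v))
        unit = (clamped + clip) / (2 * clip)  # rescale to [0, 1]
        level = round(unit * steps) / steps   # snap to the nearest level
        out.append(level * 2 * clip - clip)   # rescale back to [-clip, clip]
    return out


def mean_abs_error(a, b):
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


# Hypothetical weights of a small layer.
weights = [-0.93, -0.41, -0.08, 0.02, 0.27, 0.55, 0.81]
errors = {
    bits: mean_abs_error(weights, quantize_uniform(weights, bits))
    for bits in (2, 4, 8)
}
for bits, err in errors.items():
    print(bits, round(err, 4))
```

Lower bit-widths leave fewer representable levels and therefore a larger error, which is the accuracy-efficiency trade-off that an adaptive bit-width model would navigate at deployment time.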

An Exploration of Data Augmentation and Sampling Techniques for Domain-Agnostic Question Answering

Title An Exploration of Data Augmentation and Sampling Techniques for Domain-Agnostic Question Answering
Authors Shayne Longpre, Yi Lu, Zhucheng Tu, Chris DuBois
Abstract To produce a domain-agnostic question answering model for the Machine Reading Question Answering (MRQA) 2019 Shared Task, we investigate the relative benefits of large pre-trained language models, various data sampling strategies, as well as query and context paraphrases generated by back-translation. We find a simple negative sampling technique to be particularly effective, even though it is typically used for datasets that include unanswerable questions, such as SQuAD 2.0. When applied in conjunction with per-domain sampling, our XLNet (Yang et al., 2019)-based submission achieved the second best Exact Match and F1 in the MRQA leaderboard competition.
Tasks Data Augmentation, Question Answering, Reading Comprehension
Published 2019-12-04
URL https://arxiv.org/abs/1912.02145v1
PDF https://arxiv.org/pdf/1912.02145v1.pdf
PWC https://paperswithcode.com/paper/an-exploration-of-data-augmentation-and-1
Repo
Framework

Incorporating Dynamicity of Transportation Network with Multi-Weight Traffic Graph Convolution for Traffic Forecasting

Title Incorporating Dynamicity of Transportation Network with Multi-Weight Traffic Graph Convolution for Traffic Forecasting
Authors Yu Yol Shin, Yoonjin Yoon
Abstract Graph Convolutional Networks (GCN) have made it possible to model complex spatial and temporal dependencies in traffic data and improve the performance of predictions. In many studies, however, features that can represent the transportation networks, such as speed limit, distance, and flow direction, are overlooked. Learning without these structural features may fail to capture spatial dependencies and lead to low performance, especially on roads with unusual characteristics. To address this challenge, we suggest a novel GCN structure that can incorporate multiple weights at the same time. The proposed model, Multi-Weight Traffic Graph Convolutional Networks (MW-TGC), conducts convolution operations on traffic data with multiple weighted adjacency matrices and combines the features obtained from each operation. The spatially isolated dimension reduction operation is conducted on the combined features to learn the dependencies among the features and reduce the size of the output to a computationally feasible level. The output of multi-weight graph convolution is given to the Long Short-Term Memory (LSTM) to learn temporal dependencies. Experiments on two real-world datasets of 5-minute average speeds in Seoul are conducted to evaluate the performance. The results show that the proposed model outperforms the state-of-the-art models and reduces the inconsistency of prediction among roads with different characteristics.
Tasks Dimensionality Reduction
Published 2019-09-16
URL https://arxiv.org/abs/1909.07105v1
PDF https://arxiv.org/pdf/1909.07105v1.pdf
PWC https://paperswithcode.com/paper/incorporating-dynamicity-of-transportation
Repo
Framework
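
The step of convolving traffic data with several weighted adjacency matrices and combining the results can be sketched in a few lines. The code below is a simplified illustration under stated assumptions, not the MW-TGC architecture: it uses row-normalized adjacency matrices, fixed rather than learned projection weights, a ReLU, and concatenation as the combination rule, and every matrix in the example is hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def row_normalize(adj):
    """Scale each row to sum to 1 so a node averages over its neighbors."""
    out = []
    for row in adj:
        s = sum(row)
        out.append([v / s if s else 0.0 for v in row])
    return out


def multi_weight_graph_conv(features, adjacencies, weights):
    """Run one graph convolution per adjacency and concatenate outputs per node."""
    outputs = [
        matmul(matmul(row_normalize(adj), features), w)
        for adj, w in zip(adjacencies, weights)
    ]
    combined = []
    for node in range(len(features)):
        row = []
        for out in outputs:
            row.extend(max(0.0, v) for v in out[node])  # ReLU activation
        combined.append(row)
    return combined


# Three road segments with one feature each (average speed, hypothetical).
features = [[30.0], [60.0], [90.0]]
adj_connectivity = [[1, 1, 0], [1, 1, 1], [0, 1, 1]]          # with self-loops
adj_speed_limit = [[1.0, 0.2, 0.0], [0.2, 1.0, 0.8], [0.0, 0.8, 1.0]]
weights = [[[1.0, -1.0]], [[0.5, 0.5]]]  # fixed here; learned in a real model

combined = multi_weight_graph_conv(
    features, [adj_connectivity, adj_speed_limit], weights
)
for row in combined:
    print([round(v, 2) for v in row])
```

Each road segment ends up with four values, two from each adjacency matrix. In the model described in the abstract, a dimension reduction step and an LSTM would then consume these combined features.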