April 2, 2020


Paper Group ANR 114


Deep Neural Networks for the Correction of Mie Scattering in Fourier-Transformed Infrared Spectra of Biological Samples


Title	Deep Neural Networks for the Correction of Mie Scattering in Fourier-Transformed Infrared Spectra of Biological Samples
Authors	Arne P. Raulf, Joshua Butke, Lukas Menzen, Claus Küpper, Frederik Großerueschkamp, Klaus Gerwert, Axel Mosig
Abstract	Infrared spectra obtained from cell or tissue specimens have commonly been observed to involve a significant degree of (resonant) Mie scattering, which often overshadows biochemically relevant spectral information by a non-linear, non-additive spectral component in Fourier transformed infrared (FTIR) spectroscopic measurements. Correspondingly, many successful machine learning approaches for FTIR spectra have relied on preprocessing procedures that computationally remove the scattering components from an infrared spectrum. We propose an approach to approximate this complex preprocessing function using deep neural networks. As we demonstrate, the resulting model is not just several orders of magnitude faster, which is important for real-time clinical applications, but also generalizes strongly across different tissue types. Furthermore, our proposed method overcomes the trade-off between computation time and the corrected spectrum being biased towards an artificial reference spectrum.
Tasks
Published	2020-02-18
URL	https://arxiv.org/abs/2002.07681v1
PDF	https://arxiv.org/pdf/2002.07681v1.pdf
PWC	https://paperswithcode.com/paper/deep-neural-networks-for-the-correction-of
Repo
Framework

Overview of the TREC 2019 deep learning track


Title	Overview of the TREC 2019 deep learning track
Authors	Nick Craswell, Bhaskar Mitra, Emine Yilmaz, Daniel Campos, Ellen M. Voorhees
Abstract	The Deep Learning Track is a new track for TREC 2019, with the goal of studying ad hoc ranking in a large data regime. It is the first track with large human-labeled training sets, introducing two sets corresponding to two tasks, each with rigorous TREC-style blind evaluation and reusable test sets. The document retrieval task has a corpus of 3.2 million documents with 367 thousand training queries, for which we generate a reusable test set of 43 queries. The passage retrieval task has a corpus of 8.8 million passages with 503 thousand training queries, for which we generate a reusable test set of 43 queries. This year 15 groups submitted a total of 75 runs, using various combinations of deep learning, transfer learning and traditional IR ranking methods. Deep learning runs significantly outperformed traditional IR runs. Possible explanations for this result are that we introduced large training data and we included deep models trained on such data in our judging pools, whereas some past studies did not have such training data or pooling.
Tasks	Transfer Learning
Published	2020-03-17
URL	https://arxiv.org/abs/2003.07820v2
PDF	https://arxiv.org/pdf/2003.07820v2.pdf
PWC	https://paperswithcode.com/paper/overview-of-the-trec-2019-deep-learning-track
Repo
Framework

Model-Agnostic Structured Sparsification with Learnable Channel Shuffle


Title	Model-Agnostic Structured Sparsification with Learnable Channel Shuffle
Authors	Xin-Yu Zhang, Kai Zhao, Taihong Xiao, Ming-Ming Cheng, Ming-Hsuan Yang
Abstract	Recent advances in convolutional neural networks (CNNs) usually come at the expense of considerable computational overhead and memory footprint. Network compression aims to alleviate this issue by training compact models with comparable performance. However, existing compression techniques either entail dedicated expert design or compromise with a moderate performance drop. To this end, we propose a model-agnostic structured sparsification method for efficient network compression. The proposed method automatically induces structurally sparse representations of the convolutional weights, thereby facilitating the implementation of the compressed model with the highly optimized group convolution. We further address the problem of inter-group communication with a learnable channel shuffle mechanism. The proposed approach is model-agnostic and highly compressible with a negligible performance drop. Extensive experimental results and analysis demonstrate that our approach performs favorably against the state-of-the-art network pruning methods. The code will be publicly available after the review process.
Tasks	Network Pruning
Published	2020-02-19
URL	https://arxiv.org/abs/2002.08127v1
PDF	https://arxiv.org/pdf/2002.08127v1.pdf
PWC	https://paperswithcode.com/paper/model-agnostic-structured-sparsification-with
Repo
Framework
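
The abstract above mentions group convolution combined with a learnable channel shuffle for inter-group communication. As a point of reference only, the fixed (non-learned) channel shuffle used in group-convolution architectures such as ShuffleNet can be sketched as follows. This is not the authors' learnable mechanism; the function name `channel_shuffle` and the tensor layout are assumptions made for illustration.

```python
import numpy as np

def channel_shuffle(x, groups):
    """Fixed channel shuffle on an array shaped (batch, channels, height, width)."""
    n, c, h, w = x.shape
    if c % groups != 0:
        raise ValueError("channels must be divisible by groups")
    # Split channels into (groups, channels_per_group), swap those two axes,
    # then flatten back so each output group mixes channels from every input group.
    x = x.reshape(n, groups, c // groups, h, w)
    x = x.transpose(0, 2, 1, 3, 4)
    return x.reshape(n, c, h, w)

# Six channels labelled 0..5, split into two groups of three.
x = np.arange(6).reshape(1, 6, 1, 1)
shuffled = channel_shuffle(x, groups=2)
order = shuffled[0, :, 0, 0].tolist()
print(order)
```

With two groups, channels from the first and second group end up interleaved, which is what lets the next group convolution see information from both groups.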

Continuous Melody Generation via Disentangled Short-Term Representations and Structural Conditions


Title	Continuous Melody Generation via Disentangled Short-Term Representations and Structural Conditions
Authors	Ke Chen, Gus Xia, Shlomo Dubnov
Abstract	Automatic music generation is an interdisciplinary research topic that combines computational creativity and semantic analysis of music to create automatic machine improvisations. An important property of such a system is allowing the user to specify conditions and desired properties of the generated music. In this paper we designed a model for composing melodies given a user specified symbolic scenario combined with a previous music context. We add manual labeled vectors denoting external music quality in terms of chord function that provides a low dimensional representation of the harmonic tension and resolution. Our model is capable of generating long melodies by regarding 8-beat note sequences as basic units, and shares consistent rhythm pattern structure with another specific song. The model contains two stages and requires separate training where the first stage adopts a Conditional Variational Autoencoder (C-VAE) to build a bijection between note sequences and their latent representations, and the second stage adopts long short-term memory networks (LSTM) with structural conditions to continue writing future melodies. We further exploit the disentanglement technique via C-VAE to allow melody generation based on pitch contour information separately from conditioning on rhythm patterns. Finally, we evaluate the proposed model using quantitative analysis of rhythm and the subjective listening study. Results show that the music generated by our model tends to have salient repetition structures, rich motives, and stable rhythm patterns. The ability to generate longer and more structural phrases from disentangled representations combined with semantic scenario specification conditions shows a broad application of our model.
Tasks	Music Generation
Published	2020-02-05
URL	https://arxiv.org/abs/2002.02393v1
PDF	https://arxiv.org/pdf/2002.02393v1.pdf
PWC	https://paperswithcode.com/paper/continuous-melody-generation-via-disentangled
Repo
Framework

Using Distributional Thesaurus Embedding for Co-hyponymy Detection


Title	Using Distributional Thesaurus Embedding for Co-hyponymy Detection
Authors	Abhik Jana, Nikhil Reddy Varimalla, Pawan Goyal
Abstract	Discriminating lexical relations among distributionally similar words has always been a challenge for the natural language processing (NLP) community. In this paper, we investigate whether the network embedding of a distributional thesaurus can be effectively utilized to detect co-hyponymy relations. Through extensive experiments over three benchmark datasets, we show that the vector representation obtained by applying node2vec on a distributional thesaurus outperforms the state-of-the-art models for binary classification of co-hyponymy vs. hypernymy, as well as co-hyponymy vs. meronymy, by huge margins.
Tasks	Network Embedding
Published	2020-02-24
URL	https://arxiv.org/abs/2002.11506v1
PDF	https://arxiv.org/pdf/2002.11506v1.pdf
PWC	https://paperswithcode.com/paper/using-distributional-thesaurus-embedding-for
Repo
Framework

Learning Class Regularized Features for Action Recognition


Title	Learning Class Regularized Features for Action Recognition
Authors	Alexandros Stergiou, Ronald Poppe, Remco C. Veltkamp
Abstract	Training Deep Convolutional Neural Networks (CNNs) is based on the notion of using multiple kernels and non-linearities in their subsequent activations to extract useful features. The kernels are used as general feature extractors without specific correspondence to the target class. As a result, the extracted features do not correspond to specific classes. Subtle differences between similar classes are modeled in the same way as large differences between dissimilar classes. To overcome the class-agnostic use of kernels in CNNs, we introduce a novel method named Class Regularization that performs class-based regularization of layer activations. We demonstrate that this not only improves feature search during training, but also allows an explicit assignment of features per class during each stage of the feature extraction process. We show that using Class Regularization blocks in state-of-the-art CNN architectures for action recognition leads to systematic improvement gains of 1.8%, 1.2% and 1.4% on the Kinetics, UCF-101 and HMDB-51 datasets, respectively.
Tasks
Published	2020-02-07
URL	https://arxiv.org/abs/2002.02651v1
PDF	https://arxiv.org/pdf/2002.02651v1.pdf
PWC	https://paperswithcode.com/paper/learning-class-regularized-features-for
Repo
Framework

A novel Deep Structure U-Net for Sea-Land Segmentation in Remote Sensing Images


Title	A novel Deep Structure U-Net for Sea-Land Segmentation in Remote Sensing Images
Authors	Pourya Shamsolmoali, Masoumeh Zareapoor, Ruili Wang, Huiyu Zhou, Jie Yang
Abstract	Sea-land segmentation is an important process for many key applications in remote sensing. Proper operative sea-land segmentation for remote sensing images remains a challenging issue due to complex and diverse transitions between sea and land. Although several Convolutional Neural Networks (CNNs) have been developed for sea-land segmentation, the performance of these CNNs is far from the expected target. This paper presents a novel deep neural network structure for pixel-wise sea-land segmentation, a Residual Dense U-Net (RDU-Net), in complex and high-density remote sensing images. RDU-Net is a combination of both down-sampling and up-sampling paths to achieve satisfactory results. In each down- and up-sampling path, in addition to the convolution layers, several densely connected residual network blocks are proposed to systematically aggregate multi-scale contextual information. Each dense network block contains multilevel convolution layers, short-range connections and an identity mapping connection which facilitates feature re-use in the network and makes full use of the hierarchical features from the original images. These proposed blocks have a certain number of connections that are designed with shorter distance backpropagation between the layers and can significantly improve segmentation results whilst minimizing computational costs. We have performed extensive experiments on two real datasets, Google Earth and ISPRS, and compare the proposed RDU-Net against several variations of Dense Networks. The experimental results show that RDU-Net outperforms the other state-of-the-art approaches on the sea-land segmentation tasks.
Tasks
Published	2020-03-17
URL	https://arxiv.org/abs/2003.07784v1
PDF	https://arxiv.org/pdf/2003.07784v1.pdf
PWC	https://paperswithcode.com/paper/a-novel-deep-structure-u-net-for-sea-land
Repo
Framework

Graph Metric Learning via Gershgorin Disc Alignment


Title	Graph Metric Learning via Gershgorin Disc Alignment
Authors	Cheng Yang, Gene Cheung, Wei Hu
Abstract	We propose a fast general projection-free metric learning framework, where the minimization objective $\min_{\textbf{M} \in \mathcal{S}} Q(\textbf{M})$ is a convex differentiable function of the metric matrix $\textbf{M}$, and $\textbf{M}$ resides in the set $\mathcal{S}$ of generalized graph Laplacian matrices for connected graphs with positive edge weights and node degrees. Unlike low-rank metric matrices common in the literature, $\mathcal{S}$ includes the important positive-diagonal-only matrices as a special case in the limit. The key idea for fast optimization is to rewrite the positive definite cone constraint in $\mathcal{S}$ as signal-adaptive linear constraints via Gershgorin disc alignment, so that the alternating optimization of the diagonal and off-diagonal terms in $\textbf{M}$ can be solved efficiently as linear programs via Frank-Wolfe iterations. We prove that the Gershgorin discs can be aligned perfectly using the first eigenvector $\textbf{v}$ of $\textbf{M}$, which we update iteratively using Locally Optimal Block Preconditioned Conjugate Gradient (LOBPCG) with warm start as diagonal / off-diagonal terms are optimized. Experiments show that our efficiently computed graph metric matrices outperform metrics learned using competing methods in terms of classification tasks.
Tasks	Metric Learning
Published	2020-01-28
URL	https://arxiv.org/abs/2001.10485v4
PDF	https://arxiv.org/pdf/2001.10485v4.pdf
PWC	https://paperswithcode.com/paper/fast-graph-metric-learning-via-gershgorin
Repo
Framework
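
The disc alignment claim in the abstract above can be checked numerically on a small example. The sketch below builds a hypothetical generalized graph Laplacian for a complete graph with random positive edge weights (the graph, sizes, and variable names are assumptions, not taken from the paper), then applies the similarity transform $\textbf{S}\textbf{M}\textbf{S}^{-1}$ with $\textbf{S} = \mathrm{diag}(1/v_1, \ldots, 1/v_n)$ built from the first eigenvector $\textbf{v}$. It uses a dense eigensolver rather than LOBPCG and does not implement the Frank-Wolfe optimization.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5

# Hypothetical example: complete graph with positive edge weights.
W = rng.uniform(0.1, 1.0, size=(n, n))
W = np.triu(W, 1)
W = W + W.T

# Generalized graph Laplacian: combinatorial Laplacian plus positive self-loop terms.
M = np.diag(W.sum(axis=1)) - W + np.diag(rng.uniform(0.1, 1.0, size=n))

def disc_left_ends(A):
    """Left end of each Gershgorin disc: centre minus radius, row by row."""
    radii = np.abs(A).sum(axis=1) - np.abs(np.diag(A))
    return np.diag(A) - radii

eigvals, eigvecs = np.linalg.eigh(M)
lam_min = eigvals[0]
v = eigvecs[:, 0]
v = v if v.sum() > 0 else -v  # fix the sign so every entry is positive

# Similarity transform B = S M S^{-1} with S = diag(1 / v) keeps the eigenvalues
# but rescales the off-diagonal entries to M[i, j] * v[j] / v[i].
B = M * v[np.newaxis, :] / v[:, np.newaxis]

left_before = disc_left_ends(M)
left_after = disc_left_ends(B)
print("smallest eigenvalue:", lam_min)
print("left ends before:   ", left_before)
print("left ends after:    ", left_after)
```

Before the transform, the disc left ends are scattered and only bound the smallest eigenvalue from below; after it, every left end coincides with the smallest eigenvalue, which is what makes the linear constraints tight.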

Performance Analysis and Optimization in Privacy-Preserving Federated Learning


Title	Performance Analysis and Optimization in Privacy-Preserving Federated Learning
Authors	Kang Wei, Jun Li, Ming Ding, Chuan Ma, Hang Su, Bo Zhang, H. Vincent Poor
Abstract	As a means of decentralized machine learning, federated learning (FL) has recently drawn considerable attention. One of the prominent advantages of FL is its capability of preventing clients' data from being directly exposed to external adversaries. Nevertheless, from the viewpoint of information theory, it is still possible for an attacker to steal private information by eavesdropping on the shared models uploaded by FL clients. In order to address this problem, we develop a novel privacy-preserving FL framework based on the concept of differential privacy (DP). To be specific, we first borrow the concept of local DP and introduce a client-level DP (CDP) by adding artificial noise to the shared models before uploading them to servers. Then, we prove that our proposed CDP algorithm can satisfy the DP guarantee with adjustable privacy protection levels by varying the variance of the artificial noise. More importantly, we derive a theoretical convergence upper bound of the CDP algorithm. Our derived upper bound reveals that there exists an optimal number of communication rounds to achieve the best convergence performance in terms of loss function values for a given privacy protection level. Furthermore, to obtain this optimal number of communication rounds, which cannot be derived in a closed-form expression, we propose a communication rounds discounting (CRD) method. Compared with the heuristic searching method, our proposed CRD can achieve a much better trade-off between the computational complexity of searching for the optimal number and the convergence performance. Extensive experiments indicate that our CDP algorithm with an optimization on the number of communication rounds using the proposed CRD can effectively improve both the FL training efficiency and FL model quality for a given privacy protection level.
Tasks
Published	2020-02-29
URL	https://arxiv.org/abs/2003.00229v1
PDF	https://arxiv.org/pdf/2003.00229v1.pdf
PWC	https://paperswithcode.com/paper/performance-analysis-and-optimization-in
Repo
Framework
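
The abstract above describes adding artificial noise to a client's model before it is uploaded. A minimal sketch of one such step is shown below, assuming a clip-then-add-Gaussian-noise scheme. The clipping bound `clip_norm`, the noise scale `sigma`, and the function name `privatize_update` are hypothetical, and the paper's calibration of the noise variance to a given privacy level is not reproduced here.

```python
import numpy as np

def privatize_update(weights, clip_norm, sigma, rng):
    """Clip a flat weight vector to L2 norm clip_norm, then add Gaussian noise."""
    norm = np.linalg.norm(weights)
    # Scale down only when the vector exceeds the clipping bound.
    clipped = weights * min(1.0, clip_norm / max(norm, 1e-12))
    noise = rng.normal(loc=0.0, scale=sigma, size=weights.shape)
    return clipped, clipped + noise

rng = np.random.default_rng(42)
local_weights = np.array([3.0, 4.0, 0.0])  # L2 norm is 5
clipped, noisy = privatize_update(local_weights, clip_norm=1.0, sigma=0.1, rng=rng)
print("clipped norm:", np.linalg.norm(clipped))
print("uploaded vector:", noisy)
```

A larger `sigma` gives stronger protection at the cost of a noisier aggregate, which is the trade-off the convergence bound in the abstract is about.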

DCDLearn: Multi-order Deep Cross-distance Learning for Vehicle Re-Identification


Title	DCDLearn: Multi-order Deep Cross-distance Learning for Vehicle Re-Identification
Authors	Rixing Zhu, Jianwu Fang, Hongke Xu, Hongkai Yu, Jianru Xue
Abstract	Vehicle re-identification (Re-ID) has become a popular research topic owing to its practicability in intelligent transportation systems. Vehicle Re-ID suffers from numerous challenges caused by drastic variation in illumination, occlusions, background, resolutions, viewing angles, and so on. To address them, this paper formulates a multi-order deep cross-distance learning (DCDLearn) model for vehicle re-identification, where an efficient one-view CycleGAN model is developed to alleviate the exhaustive and enumerative cross-camera matching problem in previous works and smooth the domain discrepancy across cameras. Specifically, we treat the transferred images and the reconstructed images generated by one-view CycleGAN as multi-order augmented data for deep cross-distance learning, where the cross distances of the multi-order image set with distinct identities are learned by optimizing an objective function with multi-order augmented triplet loss and center loss to achieve camera invariance and identity consistency. Extensive experiments on three vehicle Re-ID datasets demonstrate that the proposed method achieves significant improvement over the state of the art, especially for the small-scale dataset.
Tasks	Vehicle Re-Identification
Published	2020-03-25
URL	https://arxiv.org/abs/2003.11315v2
PDF	https://arxiv.org/pdf/2003.11315v2.pdf
PWC	https://paperswithcode.com/paper/dcdlearn-multi-order-deep-cross-distance
Repo
Framework
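
The objective in the abstract above combines a triplet loss with a center loss. The sketch below shows the plain, textbook forms of both terms on toy embeddings; it is not the multi-order augmented variant from the paper, and the margin value, function names, and example vectors are assumptions.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.3):
    """Hinge on the gap between anchor-positive and anchor-negative distances."""
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(0.0, d_pos - d_neg + margin)

def center_loss(features, labels, centers):
    """Half the mean squared distance from each feature to its class centre."""
    diffs = features - centers[labels]
    return 0.5 * np.mean(np.sum(diffs ** 2, axis=1))

anchor = np.array([0.0, 0.0])
positive = np.array([0.0, 1.0])        # same identity as the anchor
easy_negative = np.array([3.0, 4.0])   # different identity, far away
hard_negative = np.array([0.0, 0.5])   # different identity, closer than the positive

easy = triplet_loss(anchor, positive, easy_negative)
hard = triplet_loss(anchor, positive, hard_negative)

features = np.array([[1.0, 0.0], [0.0, 1.0]])
labels = np.array([0, 1])
centers = np.zeros((2, 2))
c_loss = center_loss(features, labels, centers)
print(easy, hard, c_loss)
```

The triplet term only penalizes triplets where the negative is not at least `margin` farther than the positive, while the center term pulls every embedding toward the centre of its own identity.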

Physical-Virtual Collaboration Graph Network for Station-Level Metro Ridership Prediction


Title	Physical-Virtual Collaboration Graph Network for Station-Level Metro Ridership Prediction
Authors	Jingwen Chen, Lingbo Liu, Hefeng Wu, Jiajie Zhen, Guanbin Li, Liang Lin
Abstract	Due to the widespread applications in real-world scenarios, metro ridership prediction is a crucial but challenging task in intelligent transportation systems. However, conventional methods that either ignored the topological information of metro systems or directly learned on physical topology cannot fully explore the ridership evolution patterns. To address this problem, we model a metro system as graphs with various topologies and propose a unified Physical-Virtual Collaboration Graph Network (PVCGN), which can effectively learn the complex ridership patterns from the tailor-designed graphs. Specifically, a physical graph is directly built based on the realistic topology of the studied metro system, while a similarity graph and a correlation graph are built with virtual topologies under the guidance of the inter-station passenger flow similarity and correlation. These complementary graphs are incorporated into a Graph Convolution Gated Recurrent Unit (GC-GRU) for spatial-temporal representation learning. Further, a Fully-Connected Gated Recurrent Unit (FC-GRU) is also applied to capture the global evolution tendency. Finally, we develop a seq2seq model with GC-GRU and FC-GRU to forecast the future metro ridership sequentially. Extensive experiments on two large-scale benchmarks (Shanghai Metro and Hangzhou Metro) demonstrate the superiority of the proposed PVCGN for station-level metro ridership prediction.
Tasks	Representation Learning
Published	2020-01-14
URL	https://arxiv.org/abs/2001.04889v1
PDF	https://arxiv.org/pdf/2001.04889v1.pdf
PWC	https://paperswithcode.com/paper/physical-virtual-collaboration-graph-network
Repo
Framework

Similarità per la ricerca del dominio di una frase


Title	Similarità per la ricerca del dominio di una frase
Authors	Massimiliano Morrelli, Giacomo Pansini, Massimiliano Polito, Arturo Vitale
Abstract	This document aims to study the best algorithms for verifying whether a specific document belongs to a related domain by comparing different methods for calculating the distance between two vectors. This study was made possible with the help of the structures made available by the Apache Spark framework. Starting from the study illustrated in the publication “New frontier of textual classification: Big data and distributed calculus” by Massimiliano Morrelli et al., we wanted to carry out a study on the possible implementation of a solution capable of calculating the similarity of a sentence using the distributed environment.
Tasks
Published	2020-01-31
URL	https://arxiv.org/abs/2002.00757v1
PDF	https://arxiv.org/pdf/2002.00757v1.pdf
PWC	https://paperswithcode.com/paper/similarita-per-la-ricerca-del-dominio-di-una
Repo
Framework
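
The abstract above compares different ways of measuring the distance between two vectors. It does not say which measures were compared, so the sketch below simply contrasts three common ones (cosine similarity, Euclidean distance, Manhattan distance) on hypothetical sentence and domain vectors. It runs in plain Python and does not use Apache Spark.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between u and v; ignores vector length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def euclidean_distance(u, v):
    """Straight-line distance between u and v."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def manhattan_distance(u, v):
    """Sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(u, v))

# Hypothetical term-weight vectors: the domain vector is the sentence vector doubled.
sentence_vec = [1.0, 2.0, 0.0]
domain_vec = [2.0, 4.0, 0.0]

cos = cosine_similarity(sentence_vec, domain_vec)
euc = euclidean_distance(sentence_vec, domain_vec)
man = manhattan_distance(sentence_vec, domain_vec)
print(cos, euc, man)
```

The two vectors point in the same direction, so cosine similarity treats them as identical, while the Euclidean and Manhattan distances are non-zero because the magnitudes differ. Which behaviour is preferable depends on whether document length should influence the domain match.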

Looking GLAMORous: Vehicle Re-Id in Heterogeneous Cameras Networks with Global and Local Attention


Title	Looking GLAMORous: Vehicle Re-Id in Heterogeneous Cameras Networks with Global and Local Attention
Authors	Abhijit Suprem, Calton Pu
Abstract	Vehicle re-identification (re-id) is a fundamental problem for modern surveillance camera networks. Existing approaches for vehicle re-id utilize global features and local features for re-id by combining multiple subnetworks and losses. In this paper, we propose GLAMOR, or Global and Local Attention MOdules for Re-id. GLAMOR performs global and local feature extraction simultaneously in a unified model to achieve state-of-the-art performance in vehicle re-id across a variety of adversarial conditions and datasets (mAPs 80.34, 76.48, 77.15 on VeRi-776, VRIC, and VeRi-Wild, respectively). GLAMOR introduces several contributions: a better backbone construction method that outperforms recent approaches, group and layer normalization to address conflicting loss targets for re-id, a novel global attention module for global feature extraction, and a novel local attention module for self-guided part-based local feature extraction that does not require supervision. Additionally, GLAMOR is a compact and fast model that is 10x smaller while delivering 25% better performance.
Tasks	Vehicle Re-Identification
Published	2020-02-06
URL	https://arxiv.org/abs/2002.02256v1
PDF	https://arxiv.org/pdf/2002.02256v1.pdf
PWC	https://paperswithcode.com/paper/looking-glamorous-vehicle-re-id-in
Repo
Framework

MajorityNets: BNNs Utilising Approximate Popcount for Improved Efficiency


Title	MajorityNets: BNNs Utilising Approximate Popcount for Improved Efficiency
Authors	Seyedramin Rasoulinezhad, Sean Fox, Hao Zhou, Lingli Wang, David Boland, Philip H. W. Leong
Abstract	Binarized neural networks (BNNs) have shown exciting potential for utilising neural networks in embedded implementations where area, energy and latency constraints are paramount. With BNNs, multiply-accumulate (MAC) operations can be simplified to XnorPopcount operations, leading to massive reductions in both memory and computation resources. Furthermore, multiple efficient implementations of BNNs have been reported on field-programmable gate array (FPGA) implementations. This paper proposes a smaller, faster, more energy-efficient approximate replacement for the XnorPopcount operation, called XNorMaj, inspired by state-of-the-art FPGA look-up table schemes which benefit FPGA implementations. We show that XNorMaj is up to 2x more resource-efficient than the XnorPopcount operation. While the XNorMaj operation has a minor detrimental impact on accuracy, the resource savings enable us to use larger networks to recover the loss.
Tasks
Published	2020-02-27
URL	https://arxiv.org/abs/2002.12900v1
PDF	https://arxiv.org/pdf/2002.12900v1.pdf
PWC	https://paperswithcode.com/paper/majoritynets-bnns-utilising-approximate
Repo
Framework
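
The abstract above relies on the fact that a multiply-accumulate over binarized values reduces to an XNOR followed by a popcount. The sketch below demonstrates that exact identity in plain Python for vectors with entries in {-1, +1}. The bit encoding and the helper names `pack` and `xnor_popcount_dot` are assumptions, and the approximate XNorMaj operation proposed in the paper is not implemented here.

```python
def xnor_popcount_dot(a_bits, b_bits, n):
    """Dot product of two length-n {-1, +1} vectors packed as n-bit integers.

    Bit value 1 encodes +1 and bit value 0 encodes -1.
    """
    mask = (1 << n) - 1
    xnor = ~(a_bits ^ b_bits) & mask   # 1 wherever the two bits agree
    matches = bin(xnor).count("1")     # popcount
    return 2 * matches - n             # agreements minus disagreements

def pack(values):
    """Pack a list of -1/+1 values into an integer, first element in the lowest bit."""
    return sum(1 << i for i, v in enumerate(values) if v == 1)

a = [1, -1, 1, 1, -1, -1, 1, -1]
b = [1, 1, -1, 1, -1, 1, 1, -1]

reference = sum(x * y for x, y in zip(a, b))            # ordinary multiply-accumulate
fast = xnor_popcount_dot(pack(a), pack(b), len(a))      # XNOR + popcount
print(reference, fast)
```

Each agreeing bit pair contributes +1 and each disagreeing pair contributes -1, so the dot product equals twice the number of agreements minus the vector length.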

The TrojAI Software Framework: An OpenSource tool for Embedding Trojans into Deep Learning Models


Title	The TrojAI Software Framework: An OpenSource tool for Embedding Trojans into Deep Learning Models
Authors	Kiran Karra, Chace Ashcraft, Neil Fendley
Abstract	In this paper, we introduce the TrojAI software framework, an open source set of Python tools capable of generating triggered (poisoned) datasets and associated deep learning (DL) models with trojans at scale. We utilize the developed framework to generate a large set of trojaned MNIST classifiers, as well as demonstrate the capability to produce a trojaned reinforcement-learning model using vector observations. Results on MNIST show that the nature of the trigger, training batch size, and dataset poisoning percentage all affect successful embedding of trojans. We test Neural Cleanse against the trojaned MNIST models and successfully detect anomalies in the trained models approximately $18\%$ of the time. Our experiments and workflow indicate that the TrojAI software framework will enable researchers to easily understand the effects of various configurations of the dataset and training hyperparameters on the generated trojaned deep learning model, and can be used to rapidly and comprehensively test new trojan detection methods.
Tasks
Published	2020-03-13
URL	https://arxiv.org/abs/2003.07233v1
PDF	https://arxiv.org/pdf/2003.07233v1.pdf
PWC	https://paperswithcode.com/paper/the-trojai-software-framework-an-opensource
Repo
Framework