January 31, 2020

Paper Group ANR 97

Shared Predictive Cross-Modal Deep Quantization


Title	Shared Predictive Cross-Modal Deep Quantization
Authors	Erkun Yang, Cheng Deng, Chao Li, Wei Liu, Jie Li, Dacheng Tao
Abstract	With explosive growth of data volume and ever-increasing diversity of data modalities, cross-modal similarity search, which conducts nearest neighbor search across different modalities, has been attracting increasing interest. This paper presents a deep compact code learning solution for efficient cross-modal similarity search. Many recent studies have proven that quantization-based approaches perform generally better than hashing-based approaches on single-modal similarity search. In this paper, we propose a deep quantization approach, which is among the early attempts of leveraging deep neural networks into quantization-based cross-modal similarity search. Our approach, dubbed shared predictive deep quantization (SPDQ), explicitly formulates a shared subspace across different modalities and two private subspaces for individual modalities, and representations in the shared subspace and the private subspaces are learned simultaneously by embedding them to a reproducing kernel Hilbert space, where the mean embedding of different modality distributions can be explicitly compared. In addition, in the shared subspace, a quantizer is learned to produce the semantics preserving compact codes with the help of label alignment. Thanks to this novel network architecture in cooperation with supervised quantization training, SPDQ can preserve intramodal and intermodal similarities as much as possible and greatly reduce quantization error. Experiments on two popular benchmarks corroborate that our approach outperforms state-of-the-art methods.
Tasks	Quantization
Published	2019-04-16
URL	http://arxiv.org/abs/1904.07488v1
PDF	http://arxiv.org/pdf/1904.07488v1.pdf
PWC	https://paperswithcode.com/paper/shared-predictive-cross-modal-deep
Repo
Framework
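
The abstract above compares the mean embeddings of two modality distributions in a reproducing kernel Hilbert space. A common way to make that comparison concrete is the (biased) squared maximum mean discrepancy. The following is a minimal sketch under assumptions that are not taken from the paper: a Gaussian kernel, a bandwidth of 1.0, and small hand-made feature lists (`image_feats`, `text_feats_close`, `text_feats_far` are hypothetical names and values). It is not the SPDQ training objective.

```python
import math


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mean_kernel(xs, ys, bandwidth=1.0):
    """Average kernel value over all pairs drawn from xs and ys."""
    total = sum(gaussian_kernel(x, y, bandwidth) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased estimate of the squared distance between the two mean embeddings."""
    return (
        mean_kernel(xs, xs, bandwidth)
        + mean_kernel(ys, ys, bandwidth)
        - 2.0 * mean_kernel(xs, ys, bandwidth)
    )


# Hypothetical toy features standing in for two modalities.
image_feats = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.1)]
text_feats_close = [(0.1, 0.0), (0.0, 0.2), (0.1, 0.1)]
text_feats_far = [(3.0, 3.1), (3.2, 3.0), (3.1, 3.1)]

close_gap = mmd_squared(image_feats, text_feats_close)
far_gap = mmd_squared(image_feats, text_feats_far)
```

In this toy setup `far_gap` is much larger than `close_gap`, which is the sense in which two modality distributions can be "explicitly compared".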

Can We Derive Explicit and Implicit Bias from Corpus?


Title	Can We Derive Explicit and Implicit Bias from Corpus?
Authors	Bo Wang, Baixiang Xue, Anthony G. Greenwald
Abstract	Language is a popular resource for mining speakers’ attitude bias, on the assumption that speakers’ statements reflect their bias toward concepts. However, psychology studies show that people’s explicit bias in statements can differ from their implicit bias in mind. Although both explicit and implicit bias are useful for different applications, current automatic techniques do not distinguish between them. Inspired by psychological measurements of explicit and implicit bias, we develop an automatic language-based technique to reproduce psychological measurements on large populations. By connecting each psychological measurement with the statements containing a certain combination of special words, we derive explicit and implicit bias by analyzing the sentiment of the corresponding category of statements. Extensive experiments on English and Chinese serious media (Wikipedia) and non-serious media (social media) show that our method successfully reproduces small-scale psychological observations on large populations and yields new findings.
Tasks
Published	2019-05-31
URL	https://arxiv.org/abs/1905.13364v1
PDF	https://arxiv.org/pdf/1905.13364v1.pdf
PWC	https://paperswithcode.com/paper/can-we-derive-explicit-and-implicit-bias-from
Repo
Framework

A Deterministic Approach to Avoid Saddle Points


Title	A Deterministic Approach to Avoid Saddle Points
Authors	Lisa Maria Kreusser, Stanley J. Osher, Bao Wang
Abstract	Loss functions with a large number of saddle points are one of the main obstacles to training many modern machine learning models. Gradient descent (GD) is a fundamental algorithm for machine learning and converges to a saddle point for certain initial data. We call the region formed by these initial values the “attraction region.” For quadratic functions, GD converges to a saddle point if the initial data is in a subspace of up to n-1 dimensions. In this paper, we prove that a small modification of the recently proposed Laplacian smoothing gradient descent (LSGD) [Osher, et al., arXiv:1806.06317] contributes to avoiding saddle points without sacrificing the convergence rate of GD. In particular, we show that the dimension of the LSGD’s attraction region is at most floor((n-1)/2) for a class of quadratic functions which is significantly smaller than GD’s (n-1)-dimensional attraction region.
Tasks
Published	2019-01-21
URL	http://arxiv.org/abs/1901.06827v1
PDF	http://arxiv.org/pdf/1901.06827v1.pdf
PWC	https://paperswithcode.com/paper/a-deterministic-approach-to-avoid-saddle
Repo
Framework
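
The claim that plain gradient descent converges to a saddle point for initial data in a lower-dimensional subspace can be seen on the simplest quadratic. The sketch below is only an illustration of that statement for n = 2. The function f(x, y) = 0.5 * (x^2 - y^2), the step size, and the iteration count are assumptions chosen for the example, and the code does not implement LSGD or the modification proposed in the paper.

```python
def grad(point):
    """Gradient of f(x, y) = 0.5 * (x**2 - y**2), which has a saddle at (0, 0)."""
    x, y = point
    return (x, -y)


def gradient_descent(start, step=0.1, iters=200):
    """Plain gradient descent with a fixed step size."""
    x, y = start
    for _ in range(iters):
        gx, gy = grad((x, y))
        x, y = x - step * gx, y - step * gy
    return x, y


# Start exactly on the x-axis: the 1-dimensional attraction region of the saddle.
on_axis = gradient_descent((1.0, 0.0))

# Start a tiny distance off the axis: the iterate escapes along y.
off_axis = gradient_descent((1.0, 1e-6))
```

Here `on_axis` ends up numerically at the saddle, while `off_axis` moves away from it along the y direction, so the attraction region is the (n-1)-dimensional x-axis.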

Real-time processing of high-resolution video and 3D model-based tracking for remote towers


Title	Real-time processing of high-resolution video and 3D model-based tracking for remote towers
Authors	Oliver J. D. Barrowclough, Sverre Briseid, Georg Muntingh, Torbjørn Viksand
Abstract	High quality video data is a core component in emerging remote tower operations as it inherently contains a huge amount of information on which an air traffic controller can base decisions. Various digital technologies also have the potential to exploit this data to bring enhancements, including tracking ground movements by relating events in the video view to their positions in 3D space. The total resolution of remote tower setups with multiple cameras often exceeds 25 million RGB pixels and is captured at 30 frames per second or more. It is thus a challenge to efficiently process all the data in such a way as to provide relevant real-time enhancements to the controller. In this paper we discuss how a number of improvements can be implemented efficiently on a single workstation by decoupling processes and utilizing hardware for parallel computing. We also highlight how decoupling the processes in this way increases resilience of the software solution in the sense that failure of a single component does not impair the function of the other components.
Tasks
Published	2019-10-08
URL	https://arxiv.org/abs/1910.03517v2
PDF	https://arxiv.org/pdf/1910.03517v2.pdf
PWC	https://paperswithcode.com/paper/real-time-processing-of-high-resolution-video
Repo
Framework

Abusive Language Detection with Graph Convolutional Networks


Title	Abusive Language Detection with Graph Convolutional Networks
Authors	Pushkar Mishra, Marco Del Tredici, Helen Yannakoudakis, Ekaterina Shutova
Abstract	Abuse on the Internet represents a significant societal problem of our time. Previous research on automated abusive language detection in Twitter has shown that community-based profiling of users is a promising technique for this task. However, existing approaches only capture shallow properties of online communities by modeling follower-following relationships. In contrast, working with graph convolutional networks (GCNs), we present the first approach that captures not only the structure of online communities but also the linguistic behavior of the users within them. We show that such a heterogeneous graph-structured modeling of communities significantly advances the current state of the art in abusive language detection.
Tasks
Published	2019-04-05
URL	http://arxiv.org/abs/1904.04073v1
PDF	http://arxiv.org/pdf/1904.04073v1.pdf
PWC	https://paperswithcode.com/paper/abusive-language-detection-with-graph
Repo
Framework
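
For readers unfamiliar with graph convolutional networks, one propagation step can be sketched as follows. This is the widely used normalized-adjacency rule (add self-loops, scale by node degree, mix features, apply ReLU), written in plain Python. It is an assumption that this generic layer resembles what the paper uses; the heterogeneous author and tweet graph described in the abstract is not modeled, and `gcn_layer` and the toy graph are hypothetical.

```python
import math


def gcn_layer(adj, feats, weight):
    """One GCN step: ReLU(D^-1/2 (A + I) D^-1/2 X W) using nested lists."""
    n = len(adj)
    # Add self-loops so each node keeps part of its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    # Symmetric degree normalization.
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]
    in_dim, out_dim = len(weight), len(weight[0])
    # Aggregate neighbor features.
    agg = [[sum(norm[i][k] * feats[k][f] for k in range(n)) for f in range(in_dim)] for i in range(n)]
    # Linear map followed by ReLU.
    return [
        [max(0.0, sum(agg[i][f] * weight[f][o] for f in range(in_dim))) for o in range(out_dim)]
        for i in range(n)
    ]


# Two users connected by one edge, each with a single feature.
adjacency = [[0.0, 1.0], [1.0, 0.0]]
features = [[1.0], [1.0]]
hidden = gcn_layer(adjacency, features, [[1.0]])
```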

A Mean-Field Theory for Kernel Alignment with Random Features in Generative and Discriminative Models


Title	A Mean-Field Theory for Kernel Alignment with Random Features in Generative and Discriminative Models
Authors	Masoud Badiei Khuzani, Liyue Shen, Shahin Shahrampour, Lei Xing
Abstract	We propose a novel supervised learning method to optimize the kernel in the maximum mean discrepancy generative adversarial networks (MMD GANs), and the kernel support vector machines (SVMs). Specifically, we characterize a distributionally robust optimization problem to compute a good distribution for the random feature model of Rahimi and Recht. Due to the fact that the distributional optimization is infinite dimensional, we consider a Monte-Carlo sample average approximation (SAA) to obtain a more tractable finite dimensional optimization problem. We subsequently leverage a particle stochastic gradient descent (SGD) method to solve the derived finite dimensional optimization problem. Based on a mean-field analysis, we then prove that the empirical distribution of the interactive particles system at each iteration of the SGD follows the path of the gradient descent flow on the Wasserstein manifold. We also establish the non-asymptotic consistency of the finite sample estimator. We evaluate our kernel learning method for the hypothesis testing problem by evaluating the kernel MMD statistics, and show that our learning method indeed attains better power of the test for larger threshold values compared to an untrained kernel. Moreover, our empirical evaluation on benchmark data-sets shows the advantage of our kernel learning approach compared to alternative kernel learning methods.
Tasks
Published	2019-09-25
URL	https://arxiv.org/abs/1909.11820v3
PDF	https://arxiv.org/pdf/1909.11820v3.pdf
PWC	https://paperswithcode.com/paper/a-mean-field-theory-for-kernel-alignment-with
Repo
Framework
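
The random feature model of Rahimi and Recht mentioned in the abstract approximates a shift-invariant kernel by an inner product of randomized cosine features. The sketch below shows only the fixed, untrained baseline for a Gaussian kernel, where frequencies are drawn from a Gaussian distribution. The paper's contribution, learning a better distribution for these features, is not implemented here, and the function names, seed, bandwidth, and feature count are assumptions made for the example.

```python
import math
import random


def sample_features(dim, num_features, bandwidth, rng):
    """Draw frequencies w ~ N(0, 1/bandwidth^2) and phases b ~ U[0, 2*pi]."""
    freqs = [[rng.gauss(0.0, 1.0 / bandwidth) for _ in range(dim)] for _ in range(num_features)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    return freqs, phases


def feature_map(x, freqs, phases):
    """Random Fourier feature vector z(x) = sqrt(2/D) * cos(w . x + b)."""
    scale = math.sqrt(2.0 / len(freqs))
    return [
        scale * math.cos(sum(w * v for w, v in zip(freq, x)) + b)
        for freq, b in zip(freqs, phases)
    ]


def approx_kernel(x, y, freqs, phases):
    """Inner product of feature vectors, approximating the Gaussian kernel."""
    zx = feature_map(x, freqs, phases)
    zy = feature_map(y, freqs, phases)
    return sum(a * b for a, b in zip(zx, zy))


def exact_kernel(x, y, bandwidth):
    """Gaussian kernel exp(-||x - y||^2 / (2 * bandwidth^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


rng = random.Random(0)
freqs, phases = sample_features(dim=2, num_features=2000, bandwidth=1.0, rng=rng)
x, y = (0.3, -0.2), (1.0, 0.5)
estimate = approx_kernel(x, y, freqs, phases)
target = exact_kernel(x, y, 1.0)
```

With a few thousand features, `estimate` is typically close to `target`; the approximation error shrinks roughly with the square root of the number of features.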

Transfer Learning for Ultrasound Tongue Contour Extraction with Different Domains


Title	Transfer Learning for Ultrasound Tongue Contour Extraction with Different Domains
Authors	M. Hamed Mozaffari, Won-Sook Lee
Abstract	Medical ultrasound technology is widely used in routine clinical applications such as disease diagnosis and treatment, as well as in other applications such as real-time monitoring of human tongue shapes and motions as visual feedback in second language training. Because ultrasound images are low-contrast and noisy, non-expert users may find it difficult to recognize tongue gestures. Manual tongue segmentation is a cumbersome, subjective, and error-prone task. Furthermore, it is not a feasible solution for real-time applications. In the last few years, deep learning methods have been used for delineating and tracking the tongue dorsum. Deep convolutional neural networks (DCNNs), which have been shown to be successful in medical image analysis tasks, are typically weak at the same task on different domains. In many cases, DCNNs trained on data acquired with one ultrasound device do not perform well on data from a different ultrasound device or acquisition protocol. Domain adaptation addresses this difficulty by transferring the weights from a model trained on a large annotated legacy dataset to a new model, which is then adapted to a different dataset by fine-tuning. In this study, after conducting extensive experiments, we addressed the problem of domain adaptation on small ultrasound datasets for tongue contour extraction. We trained a U-net network comprising an encoder-decoder path from scratch, and then, in several surrogate scenarios, fine-tuned some parts of the trained network on another dataset to obtain domain-adapted networks. We repeated the scenarios from target to source domains to find a balance point for knowledge transfer from source to target and vice versa. The performance of the new fine-tuned networks was evaluated on the same task with images from different domains.
Tasks	Domain Adaptation, Transfer Learning
Published	2019-06-10
URL	https://arxiv.org/abs/1906.04301v1
PDF	https://arxiv.org/pdf/1906.04301v1.pdf
PWC	https://paperswithcode.com/paper/transfer-learning-for-ultrasound-tongue
Repo
Framework

Communication-Efficient Distributed Blockwise Momentum SGD with Error-Feedback


Title	Communication-Efficient Distributed Blockwise Momentum SGD with Error-Feedback
Authors	Shuai Zheng, Ziyue Huang, James T. Kwok
Abstract	Communication overhead is a major bottleneck hampering the scalability of distributed machine learning systems. Recently, there has been a surge of interest in using gradient compression to improve the communication efficiency of distributed neural network training. Using 1-bit quantization, signSGD with majority vote achieves a 32x reduction in communication cost. However, its convergence is based on unrealistic assumptions and it can diverge in practice. In this paper, we propose a general distributed compressed SGD with Nesterov’s momentum. We consider two-way compression, which compresses the gradients both to and from workers. Convergence analysis on nonconvex problems for general gradient compressors is provided. By partitioning the gradient into blocks, a blockwise compressor is introduced such that each gradient block is compressed and transmitted in 1-bit format with a scaling factor, leading to a nearly 32x reduction in communication. Experimental results show that the proposed method converges as fast as full-precision distributed momentum SGD and achieves the same testing accuracy. In particular, on distributed ResNet training with 7 workers on ImageNet, the proposed algorithm achieves the same testing accuracy as momentum SGD using full-precision gradients, but with 46% less wall clock time.
Tasks	Quantization
Published	2019-05-27
URL	https://arxiv.org/abs/1905.10936v2
PDF	https://arxiv.org/pdf/1905.10936v2.pdf
PWC	https://paperswithcode.com/paper/communication-efficient-distributed-blockwise
Repo
Framework
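
The blockwise 1-bit compressor with a scaling factor, combined with the error feedback named in the title, can be sketched for a single worker as below. Two things here are assumptions rather than details stated in the abstract: the scaling factor is taken to be the mean absolute value of each block, and the residual is simply the difference between the corrected gradient and its compressed version. Nesterov's momentum, two-way compression, and the multi-worker aggregation are all omitted, and the function names are hypothetical.

```python
def compress_block(block):
    """Keep only the sign of each entry plus one shared scale for the block."""
    scale = sum(abs(v) for v in block) / len(block)  # assumed scaling factor
    return [scale if v >= 0 else -scale for v in block]


def compress_with_error_feedback(grad, residual, block_size):
    """Compress (grad + residual) blockwise and return the new residual."""
    # Add back what earlier compression steps lost.
    corrected = [g + r for g, r in zip(grad, residual)]
    compressed = []
    for start in range(0, len(corrected), block_size):
        compressed.extend(compress_block(corrected[start:start + block_size]))
    # Whatever the compressor dropped is carried over to the next step.
    new_residual = [c - q for c, q in zip(corrected, compressed)]
    return compressed, new_residual


gradient = [0.5, -1.5, 2.0, 4.0]
sent, carried = compress_with_error_feedback(gradient, [0.0] * 4, block_size=2)
# sent    -> [1.0, -1.0, 3.0, 3.0]
# carried -> [-0.5, -0.5, -1.0, 1.0]
```

Each entry of `sent` needs one sign bit, and each block needs one floating-point scale, which is where the "nearly 32x" reduction relative to 32-bit floats comes from.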

Generative Adversarial Networks and Conditional Random Fields for Hyperspectral Image Classification


Title	Generative Adversarial Networks and Conditional Random Fields for Hyperspectral Image Classification
Authors	Zilong Zhong, Jonathan Li, David A. Clausi, Alexander Wong
Abstract	In this paper, we address the hyperspectral image (HSI) classification task with a generative adversarial network and conditional random field (GAN-CRF)-based framework, which integrates a semi-supervised deep learning and a probabilistic graphical model, and make three contributions. First, we design four types of convolutional and transposed convolutional layers that consider the characteristics of HSIs to help with extracting discriminative features from limited numbers of labeled HSI samples. Second, we construct semi-supervised GANs to alleviate the shortage of training samples by adding labels to them and implicitly reconstructing real HSI data distribution through adversarial training. Third, we build dense conditional random fields (CRFs) on top of the random variables that are initialized to the softmax predictions of the trained GANs and are conditioned on HSIs to refine classification maps. This semi-supervised framework leverages the merits of discriminative and generative models through a game-theoretical approach. Moreover, even though we used very small numbers of labeled training HSI samples from the two most challenging and extensively studied datasets, the experimental results demonstrated that spectral-spatial GAN-CRF (SS-GAN-CRF) models achieved top-ranking accuracy for semi-supervised HSI classification.
Tasks	Hyperspectral Image Classification, Image Classification
Published	2019-05-12
URL	https://arxiv.org/abs/1905.04621v1
PDF	https://arxiv.org/pdf/1905.04621v1.pdf
PWC	https://paperswithcode.com/paper/generative-adversarial-networks-and-3
Repo
Framework

Locality and Structure Regularized Low Rank Representation for Hyperspectral Image Classification


Title	Locality and Structure Regularized Low Rank Representation for Hyperspectral Image Classification
Authors	Qi Wang, Xiange He, Xuelong Li
Abstract	Hyperspectral image (HSI) classification, which aims to assign an accurate label to each hyperspectral pixel, has drawn great interest in recent years. Although low rank representation (LRR) has been used to classify HSI, its ability to segment each class from the whole HSI data has not yet been fully exploited. LRR has a good capacity to capture the underlying low-dimensional subspaces embedded in the original data. However, LRR still has two drawbacks. First, LRR does not consider the local geometric structure within the data, so the local correlation among neighboring data is easily ignored. Second, the representation obtained by solving LRR is not discriminative enough to separate different data. In this paper, a novel locality and structure regularized low rank representation (LSLRR) model is proposed for HSI classification. To overcome the above limitations, we present a locality constraint criterion (LCC) and a structure preserving strategy (SPS) to improve the classical LRR. Specifically, we introduce a new distance metric, which combines both spatial and spectral features, to explore the local similarity of pixels. Thus, the global and local structures of HSI data can be exploited sufficiently. In addition, we propose a structure constraint that gives the representation a near block-diagonal structure. This helps to determine the final classification labels directly. Extensive experiments on three popular HSI datasets demonstrate that the proposed LSLRR outperforms other state-of-the-art methods.
Tasks	Hyperspectral Image Classification, Image Classification
Published	2019-05-07
URL	https://arxiv.org/abs/1905.02488v1
PDF	https://arxiv.org/pdf/1905.02488v1.pdf
PWC	https://paperswithcode.com/paper/locality-and-structure-regularized-low-rank
Repo
Framework

Generative Flow via Invertible nxn Convolution


Title	Generative Flow via Invertible nxn Convolution
Authors	Thanh-Dat Truong, Khoa Luu, Chi Nhan Duong, Ngan Le, Minh-Triet Tran
Abstract	Flow-based generative models have recently become one of the most efficient approaches to modeling data generation. They are constructed from a sequence of invertible and tractable transformations. Glow first introduced a simple type of generative flow using an invertible 1x1 convolution. However, the 1x1 convolution suffers from limited flexibility compared to standard convolutions. In this paper, we propose a novel invertible nxn convolution approach that overcomes the limitations of the invertible 1x1 convolution. In addition, our proposed network is not only tractable and invertible but also uses fewer parameters than standard convolutions. Experiments on the CIFAR-10, ImageNet, and Celeb-HQ datasets have shown that our invertible nxn convolution helps to improve the performance of generative models significantly.
Tasks
Published	2019-05-24
URL	https://arxiv.org/abs/1905.10170v1
PDF	https://arxiv.org/pdf/1905.10170v1.pdf
PWC	https://paperswithcode.com/paper/generative-flow-via-invertible-nxn
Repo
Framework
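
As background for the abstract above, the invertible 1x1 convolution from Glow amounts to multiplying the channel vector at every pixel by the same square matrix, so it can be undone with the inverse matrix and contributes height * width * log|det W| to the log-likelihood. The sketch below illustrates only that 1x1 baseline for two channels. The nxn construction proposed in the paper is not reproduced, and the helper names and toy values are hypothetical.

```python
import math


def conv1x1(image, weight):
    """Apply a 2x2 channel-mixing matrix at every pixel of an H x W x 2 image."""
    (a, b), (c, d) = weight
    return [[(a * p0 + b * p1, c * p0 + d * p1) for (p0, p1) in row] for row in image]


def inverse_weight(weight):
    """Closed-form inverse of a 2x2 matrix; fails if the matrix is singular."""
    (a, b), (c, d) = weight
    det = a * d - b * c
    if det == 0:
        raise ValueError("weight matrix is singular, so the layer is not invertible")
    return ((d / det, -b / det), (-c / det, a / det))


def log_det_term(weight, height, width):
    """Log-determinant contribution of the layer: H * W * log|det W|."""
    (a, b), (c, d) = weight
    return height * width * math.log(abs(a * d - b * c))


weight = ((2.0, 1.0), (1.0, 1.0))  # determinant is 1
image = [[(1.0, 2.0), (3.0, 4.0)], [(0.0, -1.0), (5.0, 0.5)]]

mixed = conv1x1(image, weight)
restored = conv1x1(mixed, inverse_weight(weight))
```

Applying the layer and then the layer built from the inverse matrix returns the original image, which is the invertibility property the flow relies on.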

Linked Dynamic Graph CNN: Learning on Point Cloud via Linking Hierarchical Features


Title	Linked Dynamic Graph CNN: Learning on Point Cloud via Linking Hierarchical Features
Authors	Kuangen Zhang, Ming Hao, Jing Wang, Clarence W. de Silva, Chenglong Fu
Abstract	Learning on point clouds is in high demand because the point cloud is a common type of geometric data and can help robots understand environments robustly. However, the point cloud is sparse, unstructured, and unordered, so it cannot be recognized accurately by either a traditional convolutional neural network (CNN) or a recurrent neural network (RNN). Fortunately, a graph convolutional neural network (Graph CNN) can process sparse and unordered data. Hence, in this paper we propose a linked dynamic graph CNN (LDGCNN) to classify and segment point clouds directly. We remove the transformation network, link hierarchical features from dynamic graphs, freeze the feature extractor, and retrain the classifier to increase the performance of LDGCNN. We explain our network using theoretical analysis and visualization. Through experiments, we show that the proposed LDGCNN achieves state-of-the-art performance on two standard datasets: ModelNet40 and ShapeNet.
Tasks
Published	2019-04-22
URL	https://arxiv.org/abs/1904.10014v2
PDF	https://arxiv.org/pdf/1904.10014v2.pdf
PWC	https://paperswithcode.com/paper/linked-dynamic-graph-cnn-learning-on-point
Repo
Framework
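
Graph-based point cloud networks of this family usually start by connecting each point to its nearest neighbors and describing every edge by the center point together with the offset to the neighbor. The sketch below shows that single step in plain Python. It is an assumption that LDGCNN builds its graphs this way (the abstract only says "dynamic graphs"), and the learned layers, feature linking, and pooling are not included. `knn_edge_features` and the toy cloud are hypothetical.

```python
def knn_edge_features(points, k):
    """For each point, return k edge features (center, neighbor - center)."""
    all_edges = []
    for i, center in enumerate(points):
        # Squared distances to every other point, paired with that point's index.
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(center, other)), j)
            for j, other in enumerate(points)
            if j != i
        )
        neighbors = [j for _, j in dists[:k]]
        edges = [
            tuple(center) + tuple(q - p for p, q in zip(center, points[j]))
            for j in neighbors
        ]
        all_edges.append(edges)
    return all_edges


cloud = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 3.0, 0.0), (5.0, 5.0, 5.0)]
edges = knn_edge_features(cloud, k=1)
```

In a dynamic graph network the same neighbor search would be repeated on the learned features of each layer rather than on the raw coordinates, so the graph changes from layer to layer.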

Neural Network Generalization: The impact of camera parameters


Title	Neural Network Generalization: The impact of camera parameters
Authors	Zhenyi Liu, Trisha Lian, Joyce Farrell, Brian Wandell
Abstract	We quantify the generalization of a convolutional neural network (CNN) trained to identify cars. First, we perform a series of experiments to train the network using one image dataset - either synthetic or from a camera - and then test on a different image dataset. We show that generalization between images obtained with different cameras is roughly the same as generalization between images from a camera and ray-traced multispectral synthetic images. Second, we use ISETAuto, a soft prototyping tool that creates ray-traced multispectral simulations of camera images, to simulate sensor images with a range of pixel sizes, color filters, acquisition and post-acquisition processing. These experiments reveal how variations in specific camera parameters and image processing operations impact CNN generalization. We find that (a) pixel size impacts generalization, (b) demosaicking substantially impacts performance and generalization for shallow (8-bit) bit-depths but not deeper ones (10-bit), and (c) the network performs well using raw (not demosaicked) sensor data for 10-bit pixels.
Tasks	Demosaicking
Published	2019-12-08
URL	https://arxiv.org/abs/1912.03604v1
PDF	https://arxiv.org/pdf/1912.03604v1.pdf
PWC	https://paperswithcode.com/paper/neural-network-generalization-the-impact-of
Repo
Framework

3D Reconstruction of Deformable Revolving Object under Heavy Hand Interaction


Title	3D Reconstruction of Deformable Revolving Object under Heavy Hand Interaction
Authors	Raoul de Charette, Sotiris Manitsaris
Abstract	We reconstruct a 3D deformable object through time, in the context of a live pottery-making process in which the crafter molds the object. Because the object undergoes heavy hand interaction and is being deformed, classical techniques cannot be applied. We use particle energy optimization to estimate the object profile and exploit the object's radial symmetry to make the reconstruction more robust to both occlusion and noise. Our method works with an unconstrained, scalable setup with one or more depth sensors. We evaluate on our database (released upon publication) on a per-frame and temporal basis and show that our method significantly outperforms the state of the art, achieving an average object reconstruction error of 7.60 mm. Further ablation studies demonstrate the effectiveness of our method.
Tasks	3D Reconstruction, Object Reconstruction
Published	2019-08-05
URL	https://arxiv.org/abs/1908.01523v1
PDF	https://arxiv.org/pdf/1908.01523v1.pdf
PWC	https://paperswithcode.com/paper/3d-reconstruction-of-deformable-revolving
Repo
Framework

Deep Neural Network Based Hyperspectral Pixel Classification With Factorized Spectral-Spatial Feature Representation


Title	Deep Neural Network Based Hyperspectral Pixel Classification With Factorized Spectral-Spatial Feature Representation
Authors	Jingzhou Chen, Siyu Chen, Peilin Zhou, Yuntao Qian
Abstract	Deep learning has been widely used for hyperspectral pixel classification because of its ability to generate deep feature representations. However, how to construct an efficient and powerful network suitable for hyperspectral data is still under exploration. In this paper, a novel neural network model is designed to take full advantage of the spectral-spatial structure of hyperspectral data. First, we extract pixel-based intrinsic features from rich yet redundant spectral bands using a subnetwork with a supervised pre-training scheme. Second, to utilize the local spatial correlation among pixels, we share this subnetwork as a spectral feature extractor for each pixel in an image patch, after which the spectral features of all pixels in the patch are combined and fed into the subsequent classification subnetwork. Finally, the whole network is fine-tuned to improve its classification performance. In particular, the spectral-spatial factorization scheme applied in our model architecture makes the network size and the number of parameters much smaller than those of existing spectral-spatial deep networks for hyperspectral image classification. Experiments on hyperspectral data sets show that, compared with some state-of-the-art deep learning methods, our method achieves better classification results with a smaller network and fewer parameters.
Tasks	Hyperspectral Image Classification, Image Classification
Published	2019-04-16
URL	http://arxiv.org/abs/1904.07461v1
PDF	http://arxiv.org/pdf/1904.07461v1.pdf
PWC	https://paperswithcode.com/paper/deep-neural-network-based-hyperspectral-pixel
Repo
Framework