February 1, 2020

Paper Group AWR 123

Cardinality Estimation in a Virtualized Network Device Using Online Machine Learning

Title Cardinality Estimation in a Virtualized Network Device Using Online Machine Learning
Authors Reuven Cohen, Yuval Nezri
Abstract Cardinality estimation algorithms receive a stream of elements, with possible repetitions, and return the number of distinct elements in the stream. Such algorithms seek to minimize the required memory and CPU resource consumption at the price of inaccuracy in their output. In computer networks, cardinality estimation algorithms are mainly used for counting the number of distinct flows, and they are divided into two categories: sketching algorithms and sampling algorithms. Sketching algorithms require the processing of all packets, and they are therefore usually implemented by dedicated hardware. Sampling algorithms do not require processing of all packets, but they are known for their inaccuracy. In this work we identify one of the major drawbacks of sampling-based cardinality estimation algorithms: their inability to adapt to changes in flow size distribution. To address this problem, we propose a new sampling-based adaptive cardinality estimation framework, which uses online machine learning. We evaluate our framework using real traffic traces, and show significantly better accuracy compared to the best known sampling-based algorithms, for the same fraction of processed packets.
Tasks
Published 2019-03-13
URL http://arxiv.org/abs/1903.05728v1
PDF http://arxiv.org/pdf/1903.05728v1.pdf
PWC https://paperswithcode.com/paper/cardinality-estimation-in-a-virtualized
Repo https://github.com/yuvalnezri/CardEst
Framework none
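
The abstract leaves the estimator unspecified, so here is a small illustrative sketch, not the paper's method: a classical sampling-based estimator (GEE, from prior work by Charikar et al.) wrapped with an online-learned multiplicative correction that is updated whenever the true cardinality of a past measurement interval becomes available. The class name and the normalised-LMS update rule are assumptions for illustration.

```python
import random
from collections import Counter
from math import sqrt

def gee_estimate(sample, stream_len):
    """GEE sampling estimator (Charikar et al.): sqrt(N/n)*f1 + sum_{j>=2} f_j,
    where f_j counts sampled elements seen exactly j times."""
    f = Counter(Counter(sample).values())
    return sqrt(stream_len / len(sample)) * f.get(1, 0) + sum(
        c for j, c in f.items() if j >= 2)

class OnlineCorrectedEstimator:
    """Hypothetical wrapper: learn a multiplicative correction online,
    updated whenever ground truth for a past interval becomes available."""
    def __init__(self, lr=0.05):
        self.w, self.lr = 1.0, lr

    def predict(self, sample, stream_len):
        return self.w * gee_estimate(sample, stream_len)

    def update(self, sample, stream_len, true_count):
        base = gee_estimate(sample, stream_len)
        # normalised-LMS step on the squared estimation error
        self.w -= self.lr * (self.w * base - true_count) * base / (base * base + 1e-9)

# toy usage with ~1% packet sampling
stream = [random.randint(0, 5000) for _ in range(100_000)]
sample = [x for x in stream if random.random() < 0.01]
est = OnlineCorrectedEstimator()
print(round(est.predict(sample, len(stream))), len(set(stream)))
est.update(sample, len(stream), true_count=len(set(stream)))  # ground truth arrives
```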

Stochastic Gradient Methods with Layer-wise Adaptive Moments for Training of Deep Networks

Title Stochastic Gradient Methods with Layer-wise Adaptive Moments for Training of Deep Networks
Authors Boris Ginsburg, Patrice Castonguay, Oleksii Hrinchuk, Oleksii Kuchaiev, Vitaly Lavrukhin, Ryan Leary, Jason Li, Huyen Nguyen, Yang Zhang, Jonathan M. Cohen
Abstract We propose NovoGrad, an adaptive stochastic gradient descent method with layer-wise gradient normalization and decoupled weight decay. In our experiments on neural networks for image classification, speech recognition, machine translation, and language modeling, it performs on par with or better than well-tuned SGD with momentum, Adam, and AdamW. Additionally, NovoGrad (1) is robust to the choice of learning rate and weight initialization, (2) works well in a large-batch setting, and (3) has a memory footprint half that of Adam.
Tasks Stochastic Optimization
Published 2019-05-27
URL https://arxiv.org/abs/1905.11286v3
PDF https://arxiv.org/pdf/1905.11286v3.pdf
PWC https://paperswithcode.com/paper/stochastic-gradient-methods-with-layer-wise
Repo https://github.com/convergence-lab/novograd
Framework pytorch
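
Since the update rule is well defined, a minimal NumPy sketch of NovoGrad as described in the abstract: a scalar second moment per layer rather than per weight (which is where the roughly 2x memory saving over Adam comes from), layer-wise gradient normalisation, and weight decay added after normalisation. The hyper-parameter defaults below are common choices, not necessarily the paper's recommended values.

```python
import numpy as np

class NovoGrad:
    """Sketch of the NovoGrad update: per-layer scalar second moments,
    layer-wise gradient normalisation, decoupled weight decay."""
    def __init__(self, params, lr=0.01, beta1=0.95, beta2=0.98,
                 weight_decay=0.0, eps=1e-8):
        self.params = params                      # list of numpy arrays, one per layer
        self.lr, self.b1, self.b2 = lr, beta1, beta2
        self.wd, self.eps = weight_decay, eps
        self.m = [np.zeros_like(p) for p in params]
        self.v = [None] * len(params)             # one scalar per layer, not per weight

    def step(self, grads):
        for i, (p, g) in enumerate(zip(self.params, grads)):
            g_norm_sq = float(np.sum(g * g))      # layer-wise squared gradient norm
            if self.v[i] is None:
                self.v[i] = g_norm_sq             # initialise with the first gradient
            else:
                self.v[i] = self.b2 * self.v[i] + (1 - self.b2) * g_norm_sq
            # normalise by the layer norm, then add decoupled weight decay
            update = g / (np.sqrt(self.v[i]) + self.eps) + self.wd * p
            self.m[i] = self.b1 * self.m[i] + update
            p -= self.lr * self.m[i]              # in-place parameter update
```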

Effect Inference from Two-Group Data with Sampling Bias

Title Effect Inference from Two-Group Data with Sampling Bias
Authors Dave Zachariah, Petre Stoica
Abstract In many applications, different populations are compared using data that are sampled in a biased manner. Under sampling biases, standard methods that estimate the difference between the population means yield unreliable inferences. Here we develop an inference method that is resilient to sampling biases and, in contrast to the standard approach, is able to control false positive errors under moderate bias levels. We demonstrate the method using synthetic and real biomarker data.
Tasks
Published 2019-02-26
URL https://arxiv.org/abs/1902.09923v2
PDF https://arxiv.org/pdf/1902.09923v2.pdf
PWC https://paperswithcode.com/paper/effect-inference-from-two-group-data-with
Repo https://github.com/dzachariah/two-groups-data
Framework none

Convolution Based Spectral Partitioning Architecture for Hyperspectral Image Classification

Title Convolution Based Spectral Partitioning Architecture for Hyperspectral Image Classification
Authors Ringo S. W. Chu, Ho-Cheung Ng, Xiwei Wang, Wayne Luk
Abstract Hyperspectral images (HSIs) can distinguish materials through their high number of spectral bands, which is widely exploited in remote sensing applications and enables high-accuracy land cover classification. However, HSI processing is hampered by high dimensionality and a limited amount of labelled data. To address these challenges, this paper proposes a deep learning architecture that uses three-dimensional convolutional neural networks with spectral partitioning to perform effective feature extraction. We conduct experiments on the Indian Pines and Salinas scenes acquired by the NASA Airborne Visible/Infra-Red Imaging Spectrometer. In comparison to prior results, our architecture shows classification performance competitive with current methods.
Tasks Hyperspectral Image Classification, Image Classification
Published 2019-06-27
URL https://arxiv.org/abs/1906.11981v1
PDF https://arxiv.org/pdf/1906.11981v1.pdf
PWC https://paperswithcode.com/paper/convolution-based-spectral-partitioning
Repo https://github.com/custom-computing-ic/SpecPatConv3D-Network
Framework tf
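
The abstract names the core idea, partitioning the spectral axis and extracting features with 3D convolutions, without giving layer-level details; the PyTorch sketch below is one plausible rendering of that idea, not the paper's exact architecture, with sizes chosen to match an Indian Pines-style input.

```python
import torch
import torch.nn as nn

class SpectralPartition3DNet(nn.Module):
    """Illustrative sketch: split the spectral axis into `parts` sub-bands,
    run a small 3D conv stack on each, then fuse for classification."""
    def __init__(self, bands=200, parts=4, n_classes=16):
        super().__init__()
        self.parts = parts
        self.branch = nn.ModuleList([
            nn.Sequential(
                nn.Conv3d(1, 8, kernel_size=(7, 3, 3), padding=(3, 1, 1)),
                nn.ReLU(),
                nn.AdaptiveAvgPool3d(1),   # collapse each partition to a feature vector
            ) for _ in range(parts)
        ])
        self.head = nn.Linear(8 * parts, n_classes)

    def forward(self, x):                               # x: (B, 1, bands, H, W)
        chunks = torch.chunk(x, self.parts, dim=2)      # partition the spectral axis
        feats = [b(c).flatten(1) for b, c in zip(self.branch, chunks)]
        return self.head(torch.cat(feats, dim=1))

# toy usage on an Indian Pines-sized patch (200 bands, 9x9 spatial window)
logits = SpectralPartition3DNet()(torch.randn(2, 1, 200, 9, 9))
```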

Regression and Classification for Direction-of-Arrival Estimation with Convolutional Recurrent Neural Networks

Title Regression and Classification for Direction-of-Arrival Estimation with Convolutional Recurrent Neural Networks
Authors Zhenyu Tang, John D. Kanu, Kevin Hogan, Dinesh Manocha
Abstract We present a novel learning-based approach to estimate the direction-of-arrival (DOA) of a sound source using a convolutional recurrent neural network (CRNN) trained via regression on synthetic data and Cartesian labels. We also describe an improved method to generate synthetic data to train the neural network using state-of-the-art sound propagation algorithms that model specular as well as diffuse reflections of sound. We compare our model against three other CRNNs trained using different formulations of the same problem: classification on categorical labels, and regression on spherical coordinate labels. In practice, our model achieves up to 43% decrease in angular error over prior methods. The use of diffuse reflection results in 34% and 41% reduction in angular prediction errors on LOCATA and SOFA datasets, respectively, over prior methods based on image-source methods. Our method results in an additional 3% error reduction over prior schemes that use classification based networks, and we use 36% fewer network parameters.
Tasks Direction of Arrival Estimation
Published 2019-04-17
URL https://arxiv.org/abs/1904.08452v3
PDF https://arxiv.org/pdf/1904.08452v3.pdf
PWC https://paperswithcode.com/paper/regression-and-classification-for-direction
Repo https://github.com/RoyJames/doa-release
Framework tf
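
The comparison between Cartesian and spherical label formulations is scored by angular error between predicted and ground-truth directions. A short sketch of the standard conversion and metric (textbook definitions, not code from the paper's repository):

```python
import numpy as np

def sph_to_cart(azimuth, elevation):
    """Unit direction vector from spherical angles (radians)."""
    return np.array([np.cos(elevation) * np.cos(azimuth),
                     np.cos(elevation) * np.sin(azimuth),
                     np.sin(elevation)])

def angular_error_deg(pred, true):
    """Angle between predicted and ground-truth direction vectors, in degrees.
    With Cartesian regression targets, the network output is normalised first."""
    p = pred / np.linalg.norm(pred)
    t = true / np.linalg.norm(true)
    return np.degrees(np.arccos(np.clip(np.dot(p, t), -1.0, 1.0)))

# e.g. a prediction 5 degrees off in azimuth
print(angular_error_deg(sph_to_cart(np.radians(35), 0.0),
                        sph_to_cart(np.radians(30), 0.0)))   # ~5.0
```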

Transductive Zero-Shot Learning for 3D Point Cloud Classification

Title Transductive Zero-Shot Learning for 3D Point Cloud Classification
Authors Ali Cheraghian, Shafin Rahman, Dylan Campbell, Lars Petersson
Abstract Zero-shot learning, the task of learning to recognize new classes not seen during training, has received considerable attention in the case of 2D image classification. However, despite the increasing ubiquity of 3D sensors, the corresponding 3D point cloud classification problem has not been meaningfully explored and introduces new challenges. This paper extends, for the first time, transductive Zero-Shot Learning (ZSL) and Generalized Zero-Shot Learning (GZSL) approaches to the domain of 3D point cloud classification. To this end, a novel triplet loss is developed that takes advantage of unlabeled test data. While designed for the task of 3D point cloud classification, the method is also shown to be applicable to the more common use case of 2D image classification. An extensive set of experiments is carried out, establishing state-of-the-art results for ZSL and GZSL in the 3D point cloud domain, as well as demonstrating the applicability of the approach to the image domain.
Tasks Image Classification, Zero-Shot Learning
Published 2019-12-16
URL https://arxiv.org/abs/1912.07161v2
PDF https://arxiv.org/pdf/1912.07161v2.pdf
PWC https://paperswithcode.com/paper/transductive-zero-shot-learning-for-3d-point
Repo https://github.com/ali-chr/Transductive_ZSL_3D_Point_Cloud
Framework none
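
The abstract does not spell out the triplet loss, so the sketch below shows one generic way a transductive triplet objective over unlabeled test data can look: pseudo-positives and negatives taken from the nearest class embeddings. Treat the formulation as an illustrative assumption, not the paper's exact loss.

```python
import torch
import torch.nn.functional as F

def transductive_triplet_loss(test_feats, class_embeds, margin=0.5):
    """Generic sketch: pseudo-label each unlabeled test feature with its
    nearest class embedding (positive), take the second-nearest as negative,
    and apply a margin triplet loss to widen the gap between them."""
    d = torch.cdist(test_feats, class_embeds)     # (N_test, N_classes) distances
    nearest = d.topk(2, largest=False).values     # two smallest distances per row
    d_pos, d_neg = nearest[:, 0], nearest[:, 1]
    return F.relu(d_pos - d_neg + margin).mean()

# toy usage: 8 unlabeled point-cloud features vs 5 unseen-class word vectors
loss = transductive_triplet_loss(torch.randn(8, 300), torch.randn(5, 300))
```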

NAS evaluation is frustratingly hard

Title NAS evaluation is frustratingly hard
Authors Antoine Yang, Pedro M. Esperança, Fabio M. Carlucci
Abstract Neural Architecture Search (NAS) is an exciting new field which promises to be as much of a game-changer as Convolutional Neural Networks were in 2012. Despite many great works leading to substantial improvements on a variety of tasks, comparison between different methods is still very much an open issue. While most algorithms are tested on the same datasets, there is no shared experimental protocol followed by all. As such, and due to the under-use of ablation studies, there is a lack of clarity regarding why certain methods are more effective than others. Our first contribution is a benchmark of $8$ NAS methods on $5$ datasets. To overcome the hurdle of comparing methods with different search spaces, we propose using a method's relative improvement over the randomly sampled average architecture, which effectively removes advantages arising from expertly engineered search spaces or training protocols. Surprisingly, we find that many NAS techniques struggle to significantly beat the average architecture baseline. We perform further experiments with the commonly used DARTS search space in order to understand the contribution of each component in the NAS pipeline. These experiments highlight that: (i) the use of tricks in the evaluation protocol has a predominant impact on the reported performance of architectures; (ii) the cell-based search space has a very narrow accuracy range, such that the seed has a considerable impact on architecture rankings; (iii) the hand-designed macro-structure (cells) is more important than the searched micro-structure (operations); and (iv) the depth-gap is a real phenomenon, evidenced by the change in rankings between $8$ and $20$ cell architectures. To conclude, we suggest best practices that we hope will prove useful for the community and help mitigate current NAS pitfalls. The code used is available at https://github.com/antoyang/NAS-Benchmark.
Tasks Neural Architecture Search
Published 2019-12-28
URL https://arxiv.org/abs/1912.12522v3
PDF https://arxiv.org/pdf/1912.12522v3.pdf
PWC https://paperswithcode.com/paper/nas-evaluation-is-frustratingly-hard-1
Repo https://github.com/antoyang/NAS-Benchmark
Framework pytorch
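
The proposed evaluation metric is simple enough to state in code. Because the random-architecture baseline shares the search space and training protocol with the method under test, gains that come only from an expertly engineered space cancel out:

```python
def relative_improvement(method_acc, random_avg_acc):
    """Relative improvement (%) of a searched architecture over the average
    randomly sampled architecture from the same search space."""
    return 100.0 * (method_acc - random_avg_acc) / random_avg_acc

# e.g. a NAS method at 97.1% vs a random-architecture average of 96.8%
print(relative_improvement(97.1, 96.8))   # ~0.31 -- barely above the baseline
```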

NAS-FCOS: Fast Neural Architecture Search for Object Detection

Title NAS-FCOS: Fast Neural Architecture Search for Object Detection
Authors Ning Wang, Yang Gao, Hao Chen, Peng Wang, Zhi Tian, Chunhua Shen, Yanning Zhang
Abstract The success of deep neural networks relies on significant architecture engineering. Recently, neural architecture search (NAS) has emerged as a promising way to greatly reduce the manual effort in network design by automatically searching for optimal architectures, although such algorithms typically need an excessive amount of computational resources, e.g., a few thousand GPU-days. To date, on challenging vision tasks such as object detection, NAS, especially its fast versions, is less studied. Here we propose to search for the decoder structure of object detectors with search efficiency taken into consideration. To be more specific, we aim to efficiently search for the feature pyramid network (FPN) as well as the prediction head of a simple anchor-free object detector, namely FCOS, using a tailored reinforcement learning paradigm. With a carefully designed search space, search algorithms, and strategies for evaluating network quality, we are able to find a top-performing detection architecture within 4 days using 8 V100 GPUs. The discovered architecture surpasses state-of-the-art object detection models (such as Faster R-CNN, RetinaNet and FCOS) by 1.5 to 3.5 points in AP on the COCO dataset, with comparable computational complexity and memory footprint, demonstrating the efficacy of the proposed NAS for object detection.
Tasks Neural Architecture Search, Object Detection
Published 2019-06-11
URL https://arxiv.org/abs/1906.04423v4
PDF https://arxiv.org/pdf/1906.04423v4.pdf
PWC https://paperswithcode.com/paper/nas-fcos-fast-neural-architecture-search-for
Repo https://github.com/Lausannen/NAS-FCOS
Framework pytorch

On Minimum Discrepancy Estimation for Deep Domain Adaptation

Title On Minimum Discrepancy Estimation for Deep Domain Adaptation
Authors Mohammad Mahfujur Rahman, Clinton Fookes, Mahsa Baktashmotlagh, Sridha Sridharan
Abstract In the presence of large sets of labeled data, Deep Learning (DL) has accomplished extraordinary triumphs in computer vision, particularly in object classification and recognition tasks. However, DL cannot always perform well when the training and testing images come from different distributions, i.e., in the presence of domain shift between training and testing images. DL methods also suffer in the absence of labeled input data. Domain adaptation (DA) methods have been proposed to compensate for the poor performance due to domain shift. In this paper, we present a new unsupervised deep domain adaptation method based on the alignment of second-order statistics (covariances) as well as the maximum mean discrepancy of the source and target data, using a two-stream Convolutional Neural Network (CNN). We demonstrate the ability of the proposed approach to achieve state-of-the-art performance for image classification on three benchmark domain adaptation datasets: Office-31 [27], Office-Home [37] and Office-Caltech [8].
Tasks Domain Adaptation, Image Classification, Object Classification
Published 2019-01-02
URL http://arxiv.org/abs/1901.00282v1
PDF http://arxiv.org/pdf/1901.00282v1.pdf
PWC https://paperswithcode.com/paper/on-minimum-discrepancy-estimation-for-deep
Repo https://github.com/mahfujur1/MDE-DDA
Framework none
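
Both discrepancy terms combined in the abstract are standard and easy to sketch: CORAL-style covariance alignment plus a linear-kernel MMD between source- and target-stream features. The equal weighting of the two terms below is an assumption, not the paper's setting.

```python
import torch

def coral_loss(src, tgt):
    """Second-order statistics alignment: squared Frobenius distance between
    source and target feature covariances (CORAL-style)."""
    def cov(x):
        xm = x - x.mean(dim=0, keepdim=True)
        return xm.t() @ xm / (x.size(0) - 1)
    d = src.size(1)
    return ((cov(src) - cov(tgt)) ** 2).sum() / (4.0 * d * d)

def linear_mmd(src, tgt):
    """Maximum mean discrepancy with a linear kernel: distance between means."""
    delta = src.mean(dim=0) - tgt.mean(dim=0)
    return (delta * delta).sum()

# sketch of the combined objective on features from a two-stream CNN;
# the 0.5/0.5 weighting is an assumption
src_f, tgt_f = torch.randn(32, 256), torch.randn(32, 256)
da_loss = 0.5 * coral_loss(src_f, tgt_f) + 0.5 * linear_mmd(src_f, tgt_f)
```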

Dual-Stream Pyramid Registration Network

Title Dual-Stream Pyramid Registration Network
Authors Xiaojun Hu, Miao Kang, Weilin Huang, Matthew R. Scott, Roland Wiest, Mauricio Reyes
Abstract We propose a Dual-Stream Pyramid Registration Network (referred to as Dual-PRNet) for unsupervised 3D medical image registration. Unlike recent CNN-based registration approaches such as VoxelMorph, which use a single-stream encoder-decoder network to compute a registration field from a pair of 3D volumes, we design a two-stream architecture able to compute multi-scale registration fields from convolutional feature pyramids. Our contributions are two-fold: (i) we design a two-stream 3D encoder-decoder network which computes two convolutional feature pyramids separately for a pair of input volumes, resulting in strong deep representations that are meaningful for deformation estimation; (ii) we propose a pyramid registration module able to predict multi-scale registration fields directly from the decoding feature pyramids. This allows the model to refine the registration fields gradually in a coarse-to-fine manner via sequential warping, and equips it to handle significant deformations between two volumes, such as large displacements in the spatial domain or slice space. The proposed Dual-PRNet is evaluated on two standard benchmarks for brain MRI registration, where it outperforms state-of-the-art approaches by a large margin, e.g., improving on the recent VoxelMorph [2] from 0.683 to 0.778 on LPBA40 and from 0.511 to 0.631 on Mindboggle101, in terms of average Dice score.
Tasks Image Registration, Medical Image Registration
Published 2019-09-26
URL https://arxiv.org/abs/1909.11966v1
PDF https://arxiv.org/pdf/1909.11966v1.pdf
PWC https://paperswithcode.com/paper/dual-stream-pyramid-registration-network
Repo https://github.com/Duoduo-Qian/Medical-image-registration-Resources
Framework pytorch
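
A compact PyTorch sketch of the sequential coarse-to-fine warping idea, written in 2D for brevity even though the paper operates on 3D volumes; the additive field refinement and the helper names are simplifying assumptions.

```python
import torch
import torch.nn.functional as F

def warp(img, flow):
    """Warp an image by a dense displacement field (2D for brevity).
    flow is in pixels, shape (B, 2, H, W) with channel order (x, y)."""
    B, _, H, W = img.shape
    ys, xs = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
    base = torch.stack((xs, ys), dim=0).float().to(img)      # (2, H, W)
    coords = base.unsqueeze(0) + flow                        # sample locations
    # normalise to [-1, 1] for grid_sample
    coords[:, 0] = 2 * coords[:, 0] / (W - 1) - 1
    coords[:, 1] = 2 * coords[:, 1] / (H - 1) - 1
    return F.grid_sample(img, coords.permute(0, 2, 3, 1), align_corners=True)

def coarse_to_fine(flows):
    """Sequentially compose multi-scale fields, coarsest first: upsample the
    running field (scaling displacements with resolution), then add the next
    level's refinement -- a sketch of sequential pyramid refinement."""
    total = flows[0]
    for f in flows[1:]:
        total = 2.0 * F.interpolate(total, scale_factor=2, mode="bilinear",
                                    align_corners=True) + f
    return total
```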

SMILES-X: autonomous molecular compounds characterization for small datasets without descriptors

Title SMILES-X: autonomous molecular compounds characterization for small datasets without descriptors
Authors Guillaume Lambard, Ekaterina Gracheva
Abstract There is more and more evidence that machine learning can be successfully applied in materials science and related fields. However, datasets in these fields are often quite small ($\ll 1000$ samples). This leaves the most advanced machine learning techniques neglected, as they are considered to be applicable to big data only. Moreover, materials informatics methods often rely on human-engineered descriptors that must be carefully chosen, or even created, to fit the physicochemical property that one intends to predict. In this article, we propose a new method that tackles both the issue of small datasets and the difficulty of developing task-specific descriptors. The SMILES-X is an autonomous pipeline for molecular compound characterisation based on an Embed-Encode-Attend-Predict neural architecture with data-specific Bayesian hyper-parameter optimisation. The only input to the architecture, the SMILES strings, are de-canonicalised in order to efficiently augment the data. One of the key features of the architecture is the attention mechanism, which enables the interpretation of output predictions without extra computational cost. The SMILES-X achieves new state-of-the-art results in the inference of aqueous solubility ($\overline{RMSE}_{test} \simeq 0.57 \pm 0.07$ mols/L), hydration free energy ($\overline{RMSE}_{test} \simeq 0.81 \pm 0.22$ kcal/mol, which is $\sim 24.5\%$ better than molecular dynamics simulations), and octanol/water distribution coefficient ($\overline{RMSE}_{test} \simeq 0.59 \pm 0.02$ for LogD at pH 7.4) of molecular compounds. The SMILES-X is intended to become an important asset in the toolkit of materials scientists and chemists. The source code for the SMILES-X is available at \href{https://github.com/GLambard/SMILES-X}{github.com/GLambard/SMILES-X}.
Tasks
Published 2019-06-20
URL https://arxiv.org/abs/1906.09938v2
PDF https://arxiv.org/pdf/1906.09938v2.pdf
PWC https://paperswithcode.com/paper/smiles-x-autonomous-molecular-compounds
Repo https://github.com/GLambard/SMILES-X
Framework tf
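
The de-canonicalisation augmentation can be reproduced with RDKit's randomised SMILES writer; a minimal sketch (using RDKit here is an assumption — the paper's pipeline may generate the variants differently):

```python
from rdkit import Chem

def augment_smiles(smiles, n=10):
    """Generate randomised (de-canonicalised) SMILES strings for the same
    molecule -- the data augmentation strategy described in the abstract."""
    mol = Chem.MolFromSmiles(smiles)
    variants = set()
    for _ in range(n * 5):                        # oversample, then deduplicate
        variants.add(Chem.MolToSmiles(mol, canonical=False, doRandom=True))
        if len(variants) >= n:
            break
    return sorted(variants)

# toy usage: several equivalent strings for aspirin
print(augment_smiles("CC(=O)Oc1ccccc1C(=O)O", n=5))
```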

Closing the Gap between Deep and Conventional Image Registration using Probabilistic Dense Displacement Networks

Title Closing the Gap between Deep and Conventional Image Registration using Probabilistic Dense Displacement Networks
Authors Mattias P. Heinrich
Abstract Nonlinear image registration continues to be a fundamentally important tool in medical image analysis. Diagnostic tasks, image-guided surgery and radiotherapy as well as motion analysis all rely heavily on accurate intra-patient alignment. Furthermore, inter-patient registration enables atlas-based segmentation or landmark localisation and shape analysis. When labelled scans are scarce and anatomical differences large, conventional registration has often remained superior to deep learning methods, which have so far mainly dealt with relatively small or low-complexity deformations. We address this shortcoming by leveraging ideas from probabilistic dense displacement optimisation, which has excelled in many registration tasks with large deformations. We propose to design a network with approximate min-convolutions and mean field inference for differentiable displacement regularisation within a discrete weakly-supervised registration setting. By employing these meaningful and theoretically proven constraints, our learnable registration algorithm contains very few trainable weights (primarily for feature extraction) and is easier to train with few labelled scans. It is very fast in training and inference and achieves state-of-the-art accuracy for the challenging inter-patient registration of abdominal CT, outperforming previous deep learning approaches by 15% in Dice overlap.
Tasks Image Registration
Published 2019-07-25
URL https://arxiv.org/abs/1907.10931v1
PDF https://arxiv.org/pdf/1907.10931v1.pdf
PWC https://paperswithcode.com/paper/closing-the-gap-between-deep-and-conventional
Repo https://github.com/multimodallearning/pdd_net
Framework pytorch
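
An approximate min-convolution over a discrete displacement space can be realised with standard pooling primitives, which keeps the regulariser differentiable. The sketch below illustrates that idea only; it is not the paper's exact operator.

```python
import torch
import torch.nn.functional as F

def approx_min_convolution(cost, k=3):
    """Approximate min-convolution over the displacement dimensions of a cost
    tensor (N, C, D, D, D): a min-filter realised as negated max-pooling,
    followed by average-pool smoothing -- a sketch of the regularisation idea."""
    neg = -cost
    pooled = F.max_pool3d(neg, kernel_size=k, stride=1, padding=k // 2)
    smoothed = F.avg_pool3d(-pooled, kernel_size=k, stride=1, padding=k // 2)
    return smoothed

# toy usage: costs over a 5^3 displacement grid at 10 spatial locations
out = approx_min_convolution(torch.randn(10, 1, 5, 5, 5))
```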

Convex-Concave Backtracking for Inertial Bregman Proximal Gradient Algorithms in Non-Convex Optimization

Title Convex-Concave Backtracking for Inertial Bregman Proximal Gradient Algorithms in Non-Convex Optimization
Authors Mahesh Chandra Mukkamala, Peter Ochs, Thomas Pock, Shoham Sabach
Abstract Backtracking line search is an old yet powerful strategy for finding better step sizes to be used in proximal gradient algorithms. The main principle is to locally find a simple convex upper bound of the objective function, which in turn controls the step size that is used. In the case of inertial proximal gradient algorithms, the situation becomes much more difficult and usually leads to very restrictive rules on the extrapolation parameter. In this paper, we show that the extrapolation parameter can be controlled by also locally finding a simple concave lower bound of the objective function. This gives rise to a double convex-concave backtracking procedure which allows for an adaptive choice of both the step size and the extrapolation parameters. We apply this procedure to the class of inertial Bregman proximal gradient methods, and prove that any sequence generated by these algorithms converges globally to a critical point of the function at hand. Numerical experiments on a number of challenging non-convex problems in image processing and machine learning show the power of combining the inertial step with the double backtracking strategy in achieving improved performance.
Tasks
Published 2019-04-06
URL https://arxiv.org/abs/1904.03537v2
PDF https://arxiv.org/pdf/1904.03537v2.pdf
PWC https://paperswithcode.com/paper/convex-concave-backtracking-for-inertial
Repo https://github.com/mmahesh/cocain-bpg-phase-retrieval
Framework none
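
For reference, the classical upper-bound half of such a backtracking procedure in code; the paper's contribution is pairing this with a concave lower bound to also adapt the extrapolation parameter, which this sketch omits.

```python
import numpy as np

def backtrack_step(f, grad, x, L0=1.0, eta=2.0, max_iter=50):
    """Descent-lemma backtracking (the 'convex upper bound' half): increase
    the local Lipschitz estimate L until
        f(x+) <= f(x) + <grad f(x), x+ - x> + (L/2) ||x+ - x||^2,
    then take the gradient step with step size 1/L."""
    g = grad(x)
    L = L0
    for _ in range(max_iter):
        x_new = x - g / L                        # candidate step with size 1/L
        ub = f(x) + g @ (x_new - x) + 0.5 * L * np.sum((x_new - x) ** 2)
        if f(x_new) <= ub:
            return x_new, L                      # upper bound holds: accept
        L *= eta                                 # bound violated: shrink the step
    return x_new, L

# toy usage on a smooth non-convex function
f = lambda x: np.sum(x ** 2) + 0.1 * np.sum(np.sin(5 * x))
grad = lambda x: 2 * x + 0.5 * np.cos(5 * x)
x1, L = backtrack_step(f, grad, np.ones(3))
```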

Next-Active-Object prediction from Egocentric Videos

Title Next-Active-Object prediction from Egocentric Videos
Authors Antonino Furnari, Sebastiano Battiato, Kristen Grauman, Giovanni Maria Farinella
Abstract Although First Person Vision systems can sense the environment from the user's perspective, they are generally unable to predict the user's intentions and goals. Since human activities can be decomposed in terms of atomic actions and interactions with objects, intelligent wearable systems would benefit from the ability to anticipate user-object interactions. Even though this task is not trivial, the First Person Vision paradigm can provide important cues to address this challenge. We propose to exploit the dynamics of the scene to recognize next-active-objects before an object interaction begins. We train a classifier to discriminate trajectories leading to an object activation from all others, and forecast next-active-objects by analyzing fixed-length trajectory segments within a temporal sliding window. The proposed method compares favorably against several baselines on the Activity of Daily Living (ADL) egocentric dataset, which comprises 10 hours of video acquired by 20 subjects while performing unconstrained interactions with several objects.
Tasks
Published 2019-04-10
URL http://arxiv.org/abs/1904.05250v1
PDF http://arxiv.org/pdf/1904.05250v1.pdf
PWC https://paperswithcode.com/paper/next-active-object-prediction-from-egocentric
Repo https://github.com/antoninofurnari/rulstm
Framework pytorch
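
A toy sketch of the sliding-window pipeline the abstract outlines: fixed-length trajectory segments, simple displacement features, and a binary classifier separating trajectories that lead to an object activation from all others. The features, placeholder labels, and choice of logistic regression are all assumptions, not the paper's setup.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def segment_features(segment):
    """Simple motion descriptor: per-step displacements, flattened.
    The paper's actual trajectory features differ."""
    return np.diff(segment, axis=0).ravel()

def sliding_segments(track, win=8):
    """Fixed-length segments from a trajectory via a temporal sliding window."""
    return [track[i:i + win] for i in range(len(track) - win + 1)]

# toy training data: 2D object tracks; every segment inherits the track's
# placeholder activation label (a simplification for illustration)
rng = np.random.default_rng(0)
tracks = [rng.normal(size=(30, 2)).cumsum(axis=0) for _ in range(40)]
labels = rng.integers(0, 2, size=40)
X = np.array([segment_features(s) for t in tracks for s in sliding_segments(t)])
y = np.repeat(labels, len(sliding_segments(tracks[0])))
clf = LogisticRegression(max_iter=1000).fit(X, y)
```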

One Shot Learning for Deformable Medical Image Registration and Periodic Motion Tracking

Title One Shot Learning for Deformable Medical Image Registration and Periodic Motion Tracking
Authors Tobias Fechter, Dimos Baltas
Abstract Deformable image registration is a very important field of research in medical imaging. Recently, multiple deep learning approaches were published in this area showing promising results. However, drawbacks of deep learning methods are the need for large amounts of training data and their inability to register unseen images different from the training datasets. One-shot learning removes the need for large training datasets and has already been proven applicable to 3D data. In this work we present a one-shot registration approach for periodic motion tracking in 3D and 4D datasets. When applied to a 3D dataset, the algorithm simultaneously calculates the inverse of the registration vector field. For registration we employed a U-Net combined with a coarse-to-fine approach and a differentiable spatial transformer module. The algorithm was thoroughly tested on multiple publicly available 4D and 3D datasets. The results show that the presented approach is able to track periodic motion and yields competitive registration accuracy. Possible applications include use as a stand-alone algorithm for 3D and 4D motion tracking, or at the beginning of studies until enough datasets for a separate training phase are available.
Tasks Deformable Medical Image Registration, Image Registration, Medical Image Registration, One-Shot Learning
Published 2019-07-10
URL https://arxiv.org/abs/1907.04641v3
PDF https://arxiv.org/pdf/1907.04641v3.pdf
PWC https://paperswithcode.com/paper/one-shot-learning-for-deformable-medical
Repo https://github.com/ToFec/OneShotImageRegistration
Framework pytorch