January 30, 2020

Paper Group ANR 422

Superpixel Soup: Monocular Dense 3D Reconstruction of a Complex Dynamic Scene

Title Superpixel Soup: Monocular Dense 3D Reconstruction of a Complex Dynamic Scene
Authors Suryansh Kumar, Yuchao Dai, Hongdong Li
Abstract This work addresses the task of dense 3D reconstruction of a complex dynamic scene from images. The prevailing idea to solve this task is composed of a sequence of steps and is dependent on the success of several pipelines in its execution. To overcome such limitations with the existing algorithm, we propose a unified approach to solve this problem. We assume that a dynamic scene can be approximated by numerous piecewise planar surfaces, where each planar surface enjoys its own rigid motion, and the global change in the scene between two frames is as-rigid-as-possible (ARAP). Consequently, our model of a dynamic scene reduces to a soup of planar structures and rigid motion of these local planar structures. Using planar over-segmentation of the scene, we reduce this task to solving a “3D jigsaw puzzle” problem. Hence, the task boils down to correctly assembling each rigid piece to construct a 3D shape that complies with the geometry of the scene under the ARAP assumption. Further, we show that our approach provides an effective solution to the inherent scale-ambiguity in structure-from-motion under perspective projection. We provide extensive experimental results and evaluation on several benchmark datasets. Quantitative comparison with competing approaches shows state-of-the-art performance.
Tasks 3D Reconstruction
Published 2019-11-19
URL https://arxiv.org/abs/1911.09092v1
PDF https://arxiv.org/pdf/1911.09092v1.pdf
PWC https://paperswithcode.com/paper/superpixel-soup-monocular-dense-3d
Repo
Framework

Supervised Transfer Learning for Product Information Question Answering

Title Supervised Transfer Learning for Product Information Question Answering
Authors Tuan Manh Lai, Trung Bui, Nedim Lipka, Sheng Li
Abstract Popular e-commerce websites such as Amazon offer community question answering systems for users to pose product related questions and experienced customers may provide answers voluntarily. In this paper, we show that the large volume of existing community question answering data can be beneficial when building a system for answering questions related to product facts and specifications. Our experimental results demonstrate that the performance of a model for answering questions related to products listed in the Home Depot website can be improved by a large margin via a simple transfer learning technique from an existing large-scale Amazon community question answering dataset. Transfer learning can result in an increase of about 10% in accuracy in the experimental setting where we restrict the size of the data of the target task used for training. As an application of this work, we integrate the best performing model trained in this work into a mobile-based shopping assistant and show its usefulness.
Tasks Community Question Answering, Question Answering, Transfer Learning
Published 2019-01-08
URL http://arxiv.org/abs/1901.02539v1
PDF http://arxiv.org/pdf/1901.02539v1.pdf
PWC https://paperswithcode.com/paper/supervised-transfer-learning-for-product
Repo
Framework

Deep Residual Autoencoder for quality independent JPEG restoration

Title Deep Residual Autoencoder for quality independent JPEG restoration
Authors Simone Zini, Simone Bianco, Raimondo Schettini
Abstract In this paper we propose a deep residual autoencoder exploiting Residual-in-Residual Dense Blocks (RRDB) to remove artifacts in JPEG compressed images that is independent from the Quality Factor (QF) used. The proposed approach leverages both the learning capacity of deep residual networks and prior knowledge of the JPEG compression pipeline. The proposed model operates in the YCbCr color space and performs JPEG artifact restoration in two phases using two different autoencoders: the first one restores the luma channel exploiting 2D convolutions; the second one, using the restored luma channel as a guide, restores the chroma channels exploiting 3D convolutions. Extensive experimental results on three widely used benchmark datasets (i.e. LIVE1, BDS500, and CLASSIC-5) show that our model is able to outperform the state of the art with respect to all the evaluation metrics considered (i.e. PSNR, PSNR-B, and SSIM). This result is remarkable since the approaches in the state of the art use a different set of weights for each compression quality, while the proposed model uses the same weights for all of them, making it applicable to images in the wild where the QF used for compression is unknown. Furthermore, the proposed model shows a greater robustness than state-of-the-art methods when applied to compression qualities not seen during training.
Tasks
Published 2019-03-14
URL http://arxiv.org/abs/1903.06117v1
PDF http://arxiv.org/pdf/1903.06117v1.pdf
PWC https://paperswithcode.com/paper/deep-residual-autoencoder-for-quality
Repo
Framework
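
The JPEG restoration abstract above treats luma and chroma separately in the YCbCr color space. As a minimal sketch of that color split, the snippet below converts one RGB pixel using the full-range BT.601 coefficients from the JFIF convention. Which exact conversion the authors use is not stated in the abstract, so the coefficients and the function name `rgb_to_ycbcr` are assumptions made for illustration.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to YCbCr (JFIF full-range BT.601).

    Assumption: this is the common JPEG/JFIF convention, not necessarily
    the exact conversion used in the paper. Values are not clamped.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b                  # luma
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b     # blue-difference chroma
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b     # red-difference chroma
    return y, cb, cr


if __name__ == "__main__":
    # A neutral gray pixel carries no chroma: Cb and Cr sit at the 128 midpoint.
    print(rgb_to_ycbcr(128, 128, 128))
```

Because a neutral pixel maps to the chroma midpoint, the Y channel alone carries its brightness, which is one reason a pipeline can restore luma first and use it as a guide for the chroma channels.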

Normal Approximation for Stochastic Gradient Descent via Non-Asymptotic Rates of Martingale CLT

Title Normal Approximation for Stochastic Gradient Descent via Non-Asymptotic Rates of Martingale CLT
Authors Andreas Anastasiou, Krishnakumar Balasubramanian, Murat A. Erdogdu
Abstract We provide non-asymptotic convergence rates of the Polyak-Ruppert averaged stochastic gradient descent (SGD) to a normal random vector for a class of twice-differentiable test functions. A crucial intermediate step is proving a non-asymptotic martingale central limit theorem (CLT), i.e., establishing the rates of convergence of a multivariate martingale difference sequence to a normal random vector, which might be of independent interest. We obtain the explicit rates for the multivariate martingale CLT using a combination of Stein’s method and Lindeberg’s argument, which is then used in conjunction with a non-asymptotic analysis of averaged SGD proposed in [PJ92]. Our results have potentially interesting consequences for computing confidence intervals for parameter estimation with SGD and constructing hypothesis tests with SGD that are valid in a non-asymptotic sense.
Tasks
Published 2019-04-03
URL http://arxiv.org/abs/1904.02130v1
PDF http://arxiv.org/pdf/1904.02130v1.pdf
PWC https://paperswithcode.com/paper/normal-approximation-for-stochastic-gradient
Repo
Framework
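
The object studied in the SGD abstract above is the Polyak-Ruppert average, i.e. the running mean of the SGD iterates rather than the last iterate. The sketch below only illustrates that averaging step on a toy problem. The one-dimensional quadratic objective, the Gaussian gradient noise, the step-size schedule `eta0 * t**(-alpha)` and all parameter values are assumptions chosen for illustration; none of them come from the paper.

```python
import random


def averaged_sgd(theta_star=3.0, n_steps=20000, eta0=0.5, alpha=0.6, seed=0):
    """Polyak-Ruppert averaged SGD on f(x) = 0.5 * (x - theta_star)**2.

    All settings here are hypothetical toy values. Returns the last
    iterate and the running average of all iterates.
    """
    rng = random.Random(seed)
    x = 0.0       # current SGD iterate
    x_bar = 0.0   # running average of the iterates x_1, ..., x_t
    for t in range(1, n_steps + 1):
        # Unbiased noisy gradient: true gradient plus standard normal noise.
        noisy_grad = (x - theta_star) + rng.gauss(0.0, 1.0)
        x -= eta0 * t ** (-alpha) * noisy_grad
        x_bar += (x - x_bar) / t   # incremental mean update
    return x, x_bar


if __name__ == "__main__":
    last, avg = averaged_sgd()
    print("last iterate:", last, "averaged iterate:", avg)
```

Repeating the run over many seeds and plotting a histogram of the averaged iterate would give an empirical view of the normal approximation that the paper quantifies non-asymptotically.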

Cluster Analysis of High-Dimensional scRNA Sequencing Data

Title Cluster Analysis of High-Dimensional scRNA Sequencing Data
Authors Jiawei Long, Yu Xia
Abstract With ongoing developments and innovations in single-cell RNA sequencing methods, advancements in sequencing performance could empower significant discoveries as well as new emerging possibilities to address biological and medical investigations. In the study, we will be using the dataset collected by the authors of Systematic comparative analysis of single cell RNA-sequencing methods. The dataset consists of single-cell and single nucleus profiling from three types of samples - cell lines, peripheral blood mononuclear cells, and brain tissue, which offers 36 libraries in six separate experiments in a single center. Our quantitative comparison aims to identify unique characteristics associated with different single-cell sequencing methods, especially among low-throughput sequencing methods and high-throughput sequencing methods. Our procedures also incorporate evaluations of every method’s capacity for recovering known biological information in the samples through clustering analysis.
Tasks
Published 2019-12-18
URL https://arxiv.org/abs/1912.08400v1
PDF https://arxiv.org/pdf/1912.08400v1.pdf
PWC https://paperswithcode.com/paper/cluster-analysis-of-high-dimensional-scrna
Repo
Framework

Responsive Planning and Recognition for Closed-Loop Interaction

Title Responsive Planning and Recognition for Closed-Loop Interaction
Authors Richard G. Freedman, Yi Ren Fung, Roman Ganchin, Shlomo Zilberstein
Abstract Many intelligent systems currently interact with others using at least one of fixed communication inputs or preset responses, resulting in rigid interaction experiences and extensive efforts developing a variety of scenarios for the system. Fixed inputs limit the natural behavior of the user in order to effectively communicate, and preset responses prevent the system from adapting to the current situation unless it was specifically implemented. Closed-loop interaction instead focuses on dynamic responses that account for what the user is currently doing based on interpretations of their perceived activity. Agents employing closed-loop interaction can also monitor their interactions to ensure that the user responds as expected. We introduce a closed-loop interactive agent framework that integrates planning and recognition to predict what the user is trying to accomplish and autonomously decide on actions to take in response to these predictions. Based on a recent demonstration of such an assistive interactive agent in a turn-based simulated game, we also discuss new research challenges that are not present in the areas of artificial intelligence planning or recognition alone.
Tasks
Published 2019-09-13
URL https://arxiv.org/abs/1909.06427v1
PDF https://arxiv.org/pdf/1909.06427v1.pdf
PWC https://paperswithcode.com/paper/responsive-planning-and-recognition-for
Repo
Framework

Derivative-Free Global Optimization Algorithms: Bayesian Method and Lipschitzian Approaches

Title Derivative-Free Global Optimization Algorithms: Bayesian Method and Lipschitzian Approaches
Authors Jiawei Zhang
Abstract In this paper, we will provide an introduction to the derivative-free optimization algorithms which can be potentially applied to train deep learning models. Existing deep learning model training is mostly based on the back propagation algorithm, which updates the model variables layer by layer with the gradient descent algorithm or its variants. However, the objective functions of deep learning models to be optimized are usually non-convex and the gradient descent algorithms based on the first-order derivative can easily get stuck in local optima. To resolve such a problem, various local or global optimization algorithms have been proposed, which can help improve the training of deep learning models greatly. The representative examples include the Bayesian methods, Shubert-Piyavskii algorithm, Direct, LIPO, MCS, GA, SCE, DE, PSO, ES, CMA-ES, hill climbing and simulated annealing, etc. One part of these algorithms will be introduced in this paper (including the Bayesian method and Lipschitzian approaches, e.g., Shubert-Piyavskii algorithm, Direct, LIPO and MCS), and the remaining algorithms (including the population based optimization algorithms, e.g., GA, SCE, DE, PSO, ES and CMA-ES, and random search algorithms, e.g., hill climbing and simulated annealing) will be introduced in the follow-up paper [18] in detail.
Tasks
Published 2019-04-19
URL http://arxiv.org/abs/1904.09365v1
PDF http://arxiv.org/pdf/1904.09365v1.pdf
PWC https://paperswithcode.com/paper/190409365
Repo
Framework
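
Of the Lipschitzian approaches named in the abstract above, the Shubert-Piyavskii algorithm is the simplest to sketch. Given a known Lipschitz constant, it builds a sawtooth lower bound from the sampled points and always evaluates next where that bound is lowest. The version below is a minimal one-dimensional sketch; the function name, the tolerance, the iteration cap and the test objective are assumptions for illustration and are not taken from the paper.

```python
import bisect


def shubert_piyavskii(f, a, b, lipschitz, tol=1e-3, max_iter=2000):
    """Minimize f on [a, b] given a valid Lipschitz constant.

    Returns (x_best, f_best). Stops once the best sampled value is within
    `tol` of the sawtooth lower bound, or after `max_iter` evaluations.
    """
    pts = [(a, f(a)), (b, f(b))]   # sampled points, kept sorted by x
    for _ in range(max_iter):
        best_f = min(p[1] for p in pts)
        cand_x, cand_lb = None, float("inf")
        # Each pair of neighbours defines one tooth of the lower bound.
        for (x0, f0), (x1, f1) in zip(pts, pts[1:]):
            lb = 0.5 * (f0 + f1) - 0.5 * lipschitz * (x1 - x0)
            if lb < cand_lb:
                cand_lb = lb
                cand_x = 0.5 * (x0 + x1) + (f0 - f1) / (2.0 * lipschitz)
        if best_f - cand_lb <= tol:
            break
        bisect.insort(pts, (cand_x, f(cand_x)))
    return min(pts, key=lambda p: p[1])


if __name__ == "__main__":
    # |f'(x)| = 2|x - 1| is at most 6 on [-2, 3], so 6 is a valid constant.
    print(shubert_piyavskii(lambda x: (x - 1.0) ** 2, -2.0, 3.0, 6.0))
```

No derivative of `f` is ever evaluated; the only prior knowledge used is the Lipschitz constant, which is what makes the method derivative-free but also limits it when that constant is unknown or very large.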

SynSig2Vec: Learning Representations from Synthetic Dynamic Signatures for Real-world Verification

Title SynSig2Vec: Learning Representations from Synthetic Dynamic Signatures for Real-world Verification
Authors Songxuan Lai, Lianwen Jin, Luojun Lin, Yecheng Zhu, Huiyun Mao
Abstract Skilled forgery attacks remain an open research problem in automatic signature verification. However, skilled forgeries are very difficult to acquire for representation learning. To tackle this issue, this paper proposes to learn dynamic signature representations through ranking synthesized signatures. First, a neuromotor inspired signature synthesis method is proposed to synthesize signatures with different distortion levels for any template signature. Then, given the templates, we construct a lightweight one-dimensional convolutional network to learn to rank the synthesized samples, and directly optimize the average precision of the ranking to exploit relative and fine-grained signature similarities. Finally, after training, fixed-length representations can be extracted from dynamic signatures of variable lengths for verification. One highlight of our method is that it requires neither skilled nor random forgeries for training, yet it surpasses the state-of-the-art by a large margin on two public benchmarks.
Tasks Representation Learning
Published 2019-11-13
URL https://arxiv.org/abs/1911.05358v2
PDF https://arxiv.org/pdf/1911.05358v2.pdf
PWC https://paperswithcode.com/paper/synsig2vec-learning-representations-from
Repo
Framework
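
The SynSig2Vec abstract above trains by optimizing the average precision (AP) of a ranking. As a reminder of what that quantity measures, the sketch below computes AP for a ranked list of binary relevance labels. It shows only the metric: how the authors make it trainable is not described in the abstract, and the function name is an assumption.

```python
def average_precision(relevance):
    """Average precision of a ranked list of 0/1 relevance labels.

    `relevance[0]` is the top-ranked item. AP is the mean of precision@k
    taken at every rank k that holds a relevant item.
    """
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


if __name__ == "__main__":
    # Relevant items at ranks 1 and 3: AP = (1/1 + 2/3) / 2.
    print(average_precision([1, 0, 1]))
```

AP depends on the whole ordering rather than on pairwise margins alone, which fits the abstract's emphasis on relative and fine-grained similarities.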

HR-CAM: Precise Localization of Pathology Using Multi-level Learning in CNNs

Title HR-CAM: Precise Localization of Pathology Using Multi-level Learning in CNNs
Authors Sumeet Shinde, Tanay Chougule, Jitender Saini, Madhura Ingalhalikar
Abstract We propose a CNN-based technique that aggregates feature maps from multiple layers, localizing abnormalities in greater detail while also predicting the pathology under consideration. Existing class activation mapping (CAM) techniques extract feature maps from either the final layer or a single intermediate layer to create the discriminative maps and then interpolate to upsample to the original image resolution. In this case, the subject specific localization is coarse and is unable to capture subtle abnormalities. To mitigate this, our method builds a novel CNN based discriminative localization model that we call high resolution CAM (HR-CAM), which accounts for layers from each resolution, therefore facilitating a comprehensive map that can delineate the pathology for each subject by combining low-level, intermediate as well as high-level features from the CNN. Moreover, our model directly provides the discriminative map in the resolution of the original image facilitating finer delineation of abnormalities. We demonstrate the working of our model on simulated abnormality data, where we illustrate how the model captures finer details in the final discriminative maps as compared to current techniques. We then apply this technique: (1) to classify ependymomas from grade IV glioblastoma on T1-weighted contrast enhanced (T1-CE) MRI and (2) to predict Parkinson’s disease from neuromelanin sensitive MRI. In all these cases we demonstrate that our model not only predicts pathologies with high accuracies, but also creates clinically interpretable subject specific high resolution discriminative localizations. Overall, the technique can be generalized to any CNN and carries high relevance in a clinical setting.
Tasks
Published 2019-09-23
URL https://arxiv.org/abs/1909.12919v1
PDF https://arxiv.org/pdf/1909.12919v1.pdf
PWC https://paperswithcode.com/paper/hr-cam-precise-localization-of-pathology
Repo
Framework
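
The HR-CAM abstract above contrasts its method with standard class activation mapping, where a coarse map is computed from one layer and then upsampled. The sketch below shows that baseline only: a weighted sum of feature maps followed by nearest-neighbour upsampling. It is not the multi-level HR-CAM model, and the function names, the choice of nearest-neighbour interpolation and the toy inputs are assumptions for illustration.

```python
def class_activation_map(feature_maps, class_weights):
    """Standard CAM: per-pixel weighted sum of K feature maps.

    feature_maps: list of K maps, each a list of rows (H x W).
    class_weights: K weights of the target class in the final linear layer.
    """
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    return [
        [
            sum(w * fm[i][j] for w, fm in zip(class_weights, feature_maps))
            for j in range(width)
        ]
        for i in range(height)
    ]


def upsample_nearest(cam, factor):
    """Enlarge a coarse map by an integer factor (nearest neighbour)."""
    return [
        [cam[i // factor][j // factor] for j in range(len(cam[0]) * factor)]
        for i in range(len(cam) * factor)
    ]


if __name__ == "__main__":
    maps = [[[1.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 2.0]]]
    coarse = class_activation_map(maps, [0.5, 1.0])
    print(upsample_nearest(coarse, 2))
```

Each coarse cell is simply replicated over a block of output pixels, which illustrates why the abstract calls localization from a single low-resolution layer coarse.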

Low Bit-Rate Speech Coding with VQ-VAE and a WaveNet Decoder

Title Low Bit-Rate Speech Coding with VQ-VAE and a WaveNet Decoder
Authors Cristina Gârbacea, Aäron van den Oord, Yazhe Li, Felicia S C Lim, Alejandro Luebs, Oriol Vinyals, Thomas C Walters
Abstract In order to efficiently transmit and store speech signals, speech codecs create a minimally redundant representation of the input signal which is then decoded at the receiver with the best possible perceptual quality. In this work we demonstrate that a neural network architecture based on VQ-VAE with a WaveNet decoder can be used to perform very low bit-rate speech coding with high reconstruction quality. A prosody-transparent and speaker-independent model trained on the LibriSpeech corpus coding audio at 1.6 kbps exhibits perceptual quality which is around halfway between the MELP codec at 2.4 kbps and AMR-WB codec at 23.05 kbps. In addition, when training on high-quality recorded speech with the test speaker included in the training set, a model coding speech at 1.6 kbps produces output of similar perceptual quality to that generated by AMR-WB at 23.05 kbps.
Tasks
Published 2019-10-14
URL https://arxiv.org/abs/1910.06464v1
PDF https://arxiv.org/pdf/1910.06464v1.pdf
PWC https://paperswithcode.com/paper/low-bit-rate-speech-coding-with-vq-vae-and-a
Repo
Framework
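
The speech coding abstract above relies on vector quantization: each latent vector is replaced by the index of its nearest codebook entry, and only the indices are transmitted. The sketch below shows that lookup and the resulting bit-rate arithmetic (codes per second times bits per code). The codebook, the code rate and the function names are hypothetical values chosen for illustration and are not the configuration used in the paper.

```python
import math


def quantize(vec, codebook):
    """Return the index of the codebook entry closest to `vec`
    (squared Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda k: sum((v - c) ** 2 for v, c in zip(vec, codebook[k])),
    )


def bitrate_bps(codes_per_second, codebook_size):
    """Bits per second when one code index is sent per latent vector."""
    return codes_per_second * math.log2(codebook_size)


if __name__ == "__main__":
    toy_codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]   # hypothetical
    print(quantize([0.9, 0.1], toy_codebook))
    # Hypothetical setting: 100 codes per second from a 256-entry codebook.
    print(bitrate_bps(100, 256))
```

The formula makes the trade-off explicit: the bit rate drops when latents are produced less often or when the codebook shrinks, and the decoder has to make up the lost detail.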

Image Manipulation with Natural Language using Two-sided Attentive Conditional Generative Adversarial Network

Title Image Manipulation with Natural Language using Two-sided Attentive Conditional Generative Adversarial Network
Authors Dawei Zhu, Aditya Mogadala, Dietrich Klakow
Abstract Altering the content of an image with photo editing tools is a tedious task for an inexperienced user. This is especially true when modifying the visual attributes of a specific object in an image without affecting other constituents such as the background. To simplify the process of image manipulation and to provide more control to users, it is better to utilize a simpler interface like natural language. Therefore, in this paper, we address the challenge of manipulating images using natural language descriptions. We propose the Two-sidEd Attentive conditional Generative Adversarial Network (TEA-cGAN) to generate semantically manipulated images while keeping other contents such as the background intact. TEA-cGAN uses fine-grained attention in both the generator and the discriminator of a Generative Adversarial Network (GAN) based framework at different scales. Experimental results show that TEA-cGAN, which generates 128x128 and 256x256 resolution images, outperforms existing methods on the CUB and Oxford-102 datasets both quantitatively and qualitatively.
Tasks
Published 2019-12-16
URL https://arxiv.org/abs/1912.07478v1
PDF https://arxiv.org/pdf/1912.07478v1.pdf
PWC https://paperswithcode.com/paper/image-manipulation-with-natural-language
Repo
Framework

Conditional Computation for Continual Learning

Title Conditional Computation for Continual Learning
Authors Min Lin, Jie Fu, Yoshua Bengio
Abstract Catastrophic forgetting of connectionist neural networks is caused by the global sharing of parameters among all training examples. In this study, we analyze parameter sharing under the conditional computation framework where the parameters of a neural network are conditioned on each input example. At one extreme, if each input example uses a disjoint set of parameters, there is no sharing of parameters and thus no catastrophic forgetting. At the other extreme, if the parameters are the same for every example, it reduces to the conventional neural network. We then introduce a clipped version of maxout networks which lies in the middle, i.e. parameters are shared partially among examples. Based on the parameter sharing analysis, we can locate a limited set of examples that are interfered with when learning a new example. We propose to perform rehearsal on this set to prevent forgetting, which we term conditional rehearsal. Finally, we demonstrate the effectiveness of the proposed method in an online non-stationary setup, where updates are made after each new example and the distribution of the received example shifts over time.
Tasks Continual Learning
Published 2019-06-16
URL https://arxiv.org/abs/1906.06635v1
PDF https://arxiv.org/pdf/1906.06635v1.pdf
PWC https://paperswithcode.com/paper/conditional-computation-for-continual
Repo
Framework
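
The continual learning abstract above builds on maxout networks, where the parameters that act on an input depend on the input itself. In a plain maxout unit, only the winning linear piece contributes to the output, so only its parameters receive gradient for that example. The sketch below shows a standard maxout unit and a check for whether two inputs select the same piece. It is not the clipped variant introduced in the paper, and the helper names are assumptions for illustration.

```python
def maxout(x, weights, biases):
    """One maxout unit with k linear pieces.

    Returns (output, winner), where `winner` is the index of the piece
    with the largest activation for this input.
    """
    activations = [
        sum(w_i * x_i for w_i, x_i in zip(w, x)) + b
        for w, b in zip(weights, biases)
    ]
    winner = max(range(len(activations)), key=activations.__getitem__)
    return activations[winner], winner


def share_parameters(x_a, x_b, weights, biases):
    """True if both inputs select the same linear piece, i.e. an update on
    one example touches parameters that the other example also uses."""
    return maxout(x_a, weights, biases)[1] == maxout(x_b, weights, biases)[1]


if __name__ == "__main__":
    w = [[1.0, 0.0], [-1.0, 0.0]]   # two hypothetical linear pieces
    b = [0.0, 0.0]
    print(maxout([2.0, 0.0], w, b), maxout([-3.0, 0.0], w, b))
    print(share_parameters([2.0, 0.0], [-3.0, 0.0], w, b))
```

Examples that select the same piece as a new example are the ones its update can disturb, which is the kind of limited set the abstract proposes to rehearse.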

FairVis: Visual Analytics for Discovering Intersectional Bias in Machine Learning

Title FairVis: Visual Analytics for Discovering Intersectional Bias in Machine Learning
Authors Ángel Alexander Cabrera, Will Epperson, Fred Hohman, Minsuk Kahng, Jamie Morgenstern, Duen Horng Chau
Abstract The growing capability and accessibility of machine learning has led to its application to many real-world domains and data about people. Despite the benefits algorithmic systems may bring, models can reflect, inject, or exacerbate implicit and explicit societal biases into their outputs, disadvantaging certain demographic subgroups. Discovering which biases a machine learning model has introduced is a great challenge, due to the numerous definitions of fairness and the large number of potentially impacted subgroups. We present FairVis, a mixed-initiative visual analytics system that integrates a novel subgroup discovery technique for users to audit the fairness of machine learning models. Through FairVis, users can apply domain knowledge to generate and investigate known subgroups, and explore suggested and similar subgroups. FairVis’ coordinated views enable users to explore a high-level overview of subgroup performance and subsequently drill down into detailed investigation of specific subgroups. We show how FairVis helps to discover biases in two real datasets used in predicting income and recidivism. As a visual analytics system devoted to discovering bias in machine learning, FairVis demonstrates how interactive visualization may help data scientists and the general public understand and create more equitable algorithmic systems.
Tasks
Published 2019-04-10
URL https://arxiv.org/abs/1904.05419v4
PDF https://arxiv.org/pdf/1904.05419v4.pdf
PWC https://paperswithcode.com/paper/fairvis-visual-analytics-for-discovering
Repo
Framework

Study of Constrained Network Structures for WGANs on Numeric Data Generation

Title Study of Constrained Network Structures for WGANs on Numeric Data Generation
Authors Wei Wang, Chuang Wang, Tao Cui, Yue Li
Abstract Some recent studies have suggested using GANs for numeric data generation, for example to generate data that completes imbalanced numeric datasets. Considering the significant difference between the dimensions of numeric data and images, as well as the strong correlations between features of numeric data, conventional GANs normally face an overfitting problem, which consequently leads to an ill-conditioning problem in generating numeric and structured data. This paper studies constrained network structures between the generator G and discriminator D in WGAN, and designs several structures including isomorphic, mirror and self-symmetric structures. We evaluate the performance of the constrained WGANs in data augmentation, taking non-constrained GANs and WGANs as the baselines. The evaluation comprises twenty groups of experiments on four UCI Machine Learning Repository datasets (Australian Credit Approval, German Credit, Pima Indians Diabetes and SPECT heart data) with five conventional classifiers. The constrained structures improve results in 17/20 groups of experiments; in particular, Isomorphic WGAN is the best in 15/20 experiments. Finally, we theoretically prove the effectiveness of constrained structures by directed graphic model (DGM) analysis.
Tasks
Published 2019-11-05
URL https://arxiv.org/abs/1911.01649v1
PDF https://arxiv.org/pdf/1911.01649v1.pdf
PWC https://paperswithcode.com/paper/study-of-constrained-network-structures-for
Repo
Framework

SCAN: A Scalable Neural Networks Framework Towards Compact and Efficient Models

Title SCAN: A Scalable Neural Networks Framework Towards Compact and Efficient Models
Authors Linfeng Zhang, Zhanhong Tan, Jiebo Song, Jingwei Chen, Chenglong Bao, Kaisheng Ma
Abstract Remarkable achievements have been attained by deep neural networks in various applications. However, the increasing depth and width of such models also lead to explosive growth in both storage and computation, which has restricted the deployment of deep neural networks on resource-limited edge devices. To address this problem, we propose the so-called SCAN framework for network training and inference, which is orthogonal and complementary to existing acceleration and compression methods. The proposed SCAN first divides neural networks into multiple sections according to their depth and constructs shallow classifiers upon the intermediate features of different sections. Moreover, attention modules and knowledge distillation are utilized to enhance the accuracy of shallow classifiers. Based on this architecture, we further propose a threshold controlled scalable inference mechanism to approach human-like sample-specific inference. Experimental results show that SCAN can be easily equipped on various neural networks without any adjustment of hyper-parameters or neural network architectures, yielding significant performance gain on CIFAR100 and ImageNet. Code will be released on GitHub soon.
Tasks
Published 2019-05-27
URL https://arxiv.org/abs/1906.03951v1
PDF https://arxiv.org/pdf/1906.03951v1.pdf
PWC https://paperswithcode.com/paper/scan-a-scalable-neural-networks-framework
Repo
Framework
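
The SCAN abstract above describes threshold controlled inference over classifiers attached at increasing depth. One plausible reading, sketched below, is an early-exit rule: evaluate the classifiers from shallow to deep and stop at the first one whose top softmax probability reaches the threshold. The abstract does not spell out the exact confidence measure, so the max-softmax rule, the function names and the default threshold are assumptions for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def early_exit_predict(exit_logits, threshold=0.9):
    """Pick a prediction from classifiers ordered shallow to deep.

    exit_logits: list of logit vectors, one per classifier.
    Returns (predicted_class, exit_index). The deepest classifier is
    always used as a fallback when no earlier exit is confident enough.
    """
    if not exit_logits:
        raise ValueError("need at least one classifier output")
    last = len(exit_logits) - 1
    for depth, logits in enumerate(exit_logits):
        probs = softmax(logits)
        confidence = max(probs)
        if confidence >= threshold or depth == last:
            return probs.index(confidence), depth


if __name__ == "__main__":
    # The shallow exit is unsure, the deeper one is confident.
    print(early_exit_predict([[0.1, 0.2, 0.0], [5.0, 0.0, 0.0]]))
```

In a real network the deeper logits would be computed lazily, only when the shallower exit is rejected, so easy samples cost less compute. The list of precomputed logits here is a simplification.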