January 31, 2020

3185 words 15 mins read

Paper Group ANR 32

Scale Matters: Temporal Scale Aggregation Network for Precise Action Localization in Untrimmed Videos

Title Scale Matters: Temporal Scale Aggregation Network for Precise Action Localization in Untrimmed Videos
Authors Guoqiang Gong, Liangfeng Zheng, Kun Bai, Yadong Mu
Abstract Temporal action localization is a recently emerging task, aiming to localize video segments that contain specific actions from untrimmed videos. Despite remarkable recent progress, most two-stage action localization methods still suffer from imprecise temporal boundaries of action proposals. This work proposes a novel integrated temporal scale aggregation network (TSA-Net). Our main insight is that ensembling convolution filters with different dilation rates can effectively enlarge the receptive field at low computational cost, which inspires us to devise a multi-dilation temporal convolution (MDC) block. Furthermore, to tackle video action instances of different durations, TSA-Net consists of multiple branches of sub-networks. Each of them adopts stacked MDC blocks with different dilation parameters, yielding a temporal receptive field specially optimized for actions of a specific duration. We follow the formulation of boundary point detection, detecting three kinds of critical points (i.e., starting / mid-point / ending) and pairing them for proposal generation. Comprehensive evaluations are conducted on two challenging video benchmarks, THUMOS14 and ActivityNet-1.3. Our proposed TSA-Net demonstrates clearly and consistently better performance and sets a new state of the art on both benchmarks. For example, our new record on THUMOS14 is 46.9% under mAP@0.5, while the previous best is 42.8%.
Tasks Action Localization, Temporal Action Localization
Published 2019-08-02
URL https://arxiv.org/abs/1908.00707v1
PDF https://arxiv.org/pdf/1908.00707v1.pdf
PWC https://paperswithcode.com/paper/scale-matters-temporal-scale-aggregation
Repo
Framework
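
For readers who want a concrete picture of the multi-dilation idea, here is a minimal PyTorch sketch of an MDC-style block: parallel 1-D convolutions over the temporal axis with different dilation rates, fused here by summation. The channel count, kernel size, dilation set, and sum fusion are illustrative assumptions, not the authors' exact design (no repository is linked above).

```python
# Hedged sketch of a multi-dilation temporal convolution (MDC) block.
# The parallel-dilation structure follows the TSA-Net abstract; channel
# counts, kernel size and the sum fusion are illustrative assumptions.
import torch
import torch.nn as nn

class MDCBlock(nn.Module):
    def __init__(self, channels, kernel_size=3, dilations=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv1d(channels, channels, kernel_size,
                      padding=d * (kernel_size - 1) // 2, dilation=d)
            for d in dilations
        ])
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):            # x: (batch, channels, time)
        # Sum the branches: same temporal length, enlarged receptive field.
        return self.relu(sum(branch(x) for branch in self.branches))

feats = torch.randn(2, 256, 100)     # e.g. snippet-level video features
out = MDCBlock(256)(feats)           # -> torch.Size([2, 256, 100])
```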

Superpixel-based Color Transfer

Title Superpixel-based Color Transfer
Authors Rémi Giraud, Vinh-Thong Ta, Nicolas Papadakis
Abstract In this work, we propose a fast superpixel-based color transfer method (SCT) between two images. Superpixels make it possible to reduce the image dimension and to extract a reduced set of color candidates. We propose to use a fast approximate nearest neighbor matching algorithm in which we enforce match diversity by limiting repeated selection of the same superpixels. A fusion framework is designed to transfer the matched colors, and we demonstrate the improvement obtained over exact matching results. Finally, we show that SCT is visually competitive with state-of-the-art methods.
Tasks
Published 2019-03-14
URL http://arxiv.org/abs/1903.06010v1
PDF http://arxiv.org/pdf/1903.06010v1.pdf
PWC https://paperswithcode.com/paper/superpixel-based-color-transfer
Repo
Framework
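
The general recipe behind superpixel-based color transfer is easy to prototype. The sketch below segments both images with SLIC, describes each superpixel by its mean color, matches target superpixels to source superpixels with a KD-tree nearest-neighbour query, and shifts colors accordingly; the paper's match-diversity constraint and fusion framework are deliberately omitted, and all parameters are placeholders.

```python
# Loose sketch of superpixel-based color transfer: SLIC superpixels,
# mean-color descriptors, nearest-neighbour matching. The paper's match
# diversity limit and fusion framework are not reproduced here.
import numpy as np
from skimage.segmentation import slic
from scipy.spatial import cKDTree

def superpixel_color_transfer(target, source, n_segments=300):
    """Recolor `target` using superpixel mean colors from `source` (float RGB in [0, 1])."""
    seg_t = slic(target, n_segments=n_segments, start_label=0)
    seg_s = slic(source, n_segments=n_segments, start_label=0)

    labels_t, labels_s = np.unique(seg_t), np.unique(seg_s)
    means_t = np.array([target[seg_t == i].mean(axis=0) for i in labels_t])
    means_s = np.array([source[seg_s == i].mean(axis=0) for i in labels_s])

    # Nearest-neighbour match of target superpixels to source superpixels.
    _, match = cKDTree(means_s).query(means_t)

    # Shift each target superpixel toward its matched source mean color.
    out = target.copy()
    for k, i in enumerate(labels_t):
        out[seg_t == i] += means_s[match[k]] - means_t[k]
    return np.clip(out, 0.0, 1.0)

rng = np.random.default_rng(0)
result = superpixel_color_transfer(rng.random((64, 64, 3)), rng.random((64, 64, 3)))
```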

Patent Claim Generation by Fine-Tuning OpenAI GPT-2

Title Patent Claim Generation by Fine-Tuning OpenAI GPT-2
Authors Jieh-Sheng Lee, Jieh Hsiang
Abstract In this work, we focus on fine-tuning an OpenAI GPT-2 pre-trained model for generating patent claims. GPT-2 has demonstrated impressive efficacy of pre-trained language models on various tasks, particularly coherent text generation. Patent claim language itself has rarely been explored in the past and poses a unique challenge. We are motivated to generate coherent patent claims automatically so that augmented inventing might be viable someday. In our implementation, we identified a unique language structure in patent claims and leveraged its implicit human annotations. We investigated the fine-tuning process by probing the first 100 steps and observing the generated text at each step. Based on both conditional and unconditional random sampling, we analyze the overall quality of generated patent claims. Our contributions include: (1) being the first to generate patent claims by machines and being the first to apply GPT-2 to patent claim generation, (2) providing various experiment results for qualitative analysis and future research, (3) proposing a new sampling approach for text generation, and (4) building an e-mail bot for future researchers to explore the fine-tuned GPT-2 model further.
Tasks Text Generation
Published 2019-07-01
URL https://arxiv.org/abs/1907.02052v1
PDF https://arxiv.org/pdf/1907.02052v1.pdf
PWC https://paperswithcode.com/paper/patent-claim-generation-by-fine-tuning-openai
Repo
Framework
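
Fine-tuning GPT-2 on domain text is straightforward with the Hugging Face transformers library. The sketch below shows the generic recipe (causal language-modeling loss on claim strings, then conditional sampling); the tiny in-memory corpus, learning rate, and sampling settings are placeholders, and the paper's own data preparation, claim-span structure, and proposed sampling approach are not reproduced.

```python
# Minimal sketch of fine-tuning GPT-2 on patent-claim text with the
# Hugging Face `transformers` library. The tiny in-memory corpus, learning
# rate and number of steps are placeholders, not the authors' setup.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

claims = [
    "1. A device comprising: a sensor configured to measure temperature; ...",
    "2. The device of claim 1, wherein the sensor is a thermocouple.",
]

model.train()
for text in claims:
    batch = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    # Causal LM loss: labels are the input ids themselves (shifted internally).
    loss = model(**batch, labels=batch["input_ids"]).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

# Conditional sampling from the fine-tuned model.
prompt = tokenizer("1. A method for", return_tensors="pt")
sample = model.generate(**prompt, do_sample=True, max_length=60, top_k=50,
                        pad_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(sample[0]))
```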

PD-ML-Lite: Private Distributed Machine Learning from Lightweight Cryptography

Title PD-ML-Lite: Private Distributed Machine Learning from Lightweight Cryptography
Authors Maksim Tsikhanovich, Malik Magdon-Ismail, Muhammad Ishaq, Vassilis Zikas
Abstract Privacy is a major issue in learning from distributed data. Recently the cryptographic literature has provided several tools for this task. However, these tools either reduce the quality/accuracy of the learning algorithm (e.g., by adding noise) or they incur a high performance penalty and/or involve trusting external authorities. We propose a methodology for private distributed machine learning from lightweight cryptography (in short, PD-ML-Lite). We apply our methodology to two major ML algorithms, namely non-negative matrix factorization (NMF) and singular value decomposition (SVD). Our resulting protocols are communication optimal, achieve the same accuracy as their non-private counterparts, and satisfy a notion of privacy (which we define) that is both intuitive and measurable. Our approach is to use lightweight cryptographic protocols (secure sum and normalized secure sum) to build learning algorithms, rather than to wrap complex learning algorithms in a heavyweight MPC framework. We showcase our algorithms’ utility and privacy on several applications: for NMF we consider topic modeling and recommender systems, and for SVD, principal component regression and low rank approximation.
Tasks Recommendation Systems
Published 2019-01-23
URL http://arxiv.org/abs/1901.07986v2
PDF http://arxiv.org/pdf/1901.07986v2.pdf
PWC https://paperswithcode.com/paper/pd-ml-lite-private-distributed-machine
Repo
Framework
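
Secure sum, one of the lightweight primitives the abstract names, can be illustrated with plain additive secret sharing: each party splits its private value into random shares modulo a large ring, the shares are exchanged, and only the blinded partial sums (and hence the total) are ever revealed. The modulus and the single-process simulation below are illustrative; this is not the full PD-ML-Lite protocol.

```python
# Toy additive-secret-sharing secure sum: each party splits its private
# value into random shares, parties exchange shares, and only the total
# is revealed. The modulus and the in-process "parties" are illustrative.
import random

MOD = 2 ** 32
random.seed(7)

def make_shares(value, n_parties):
    """Split an integer into n additive shares modulo MOD."""
    shares = [random.randrange(MOD) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MOD)
    return shares

private_values = [13, 29, 58]                    # one secret per party
all_shares = [make_shares(v, len(private_values)) for v in private_values]

# Party j locally sums the j-th share it received from every party;
# only these blinded partial sums are ever published.
partials = [sum(s[j] for s in all_shares) % MOD for j in range(len(private_values))]
print(sum(partials) % MOD)                       # -> 100, individual values stay hidden
```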

Adaptive Neuro Particle Swarm Optimization applied for diagnosing disorders

Title Adaptive Neuro Particle Swarm Optimization applied for diagnosing disorders
Authors Majid Masoumi, Mina Rajabi
Abstract A new Adaptive Neuro Particle Swarm Optimization (ANPSO) combined with a fuzzy inference system for diagnosing disorders is presented in this paper. The main contribution of the proposed method is a global search across the whole search space with a faster convergence rate. Moreover, it achieves better exploration and exploitation by applying adaptive control parameters: automatic control of the inertia weight and of the personal and social behaviour coefficients. These attributes lead to a fast and smart diagnosis mechanism that is able to diagnose diseases with high accuracy. ANPSO tunes the characteristics of the inference system to minimize the diagnosis error until the optimized model is obtained. As a case study, we use the BUPA liver disorders dataset. According to the preliminary results, the suggested adaptive PSO substantially outperforms the traditional inference system combined with other optimization methods.
Tasks
Published 2019-10-14
URL https://arxiv.org/abs/1910.14021v1
PDF https://arxiv.org/pdf/1910.14021v1.pdf
PWC https://paperswithcode.com/paper/adaptive-neuro-particle-swarm-optimization
Repo
Framework
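
The mechanism the abstract alludes to, particle swarm optimization with an adaptive inertia weight and time-varying personal/social coefficients, looks roughly like the NumPy sketch below. The decay schedules, bounds, and sphere objective are assumptions for illustration; the fuzzy inference system and the diagnosis pipeline are not modeled.

```python
# PSO with adaptive inertia weight and time-varying acceleration
# coefficients (a generic illustration of the mechanism; schedules,
# bounds and the sphere objective are assumptions).
import numpy as np

def adaptive_pso(objective, dim=5, n_particles=30, iters=200, seed=0):
    rng = np.random.default_rng(seed)
    pos = rng.uniform(-5, 5, (n_particles, dim))
    vel = np.zeros((n_particles, dim))
    pbest, pbest_val = pos.copy(), np.apply_along_axis(objective, 1, pos)
    gbest = pbest[pbest_val.argmin()].copy()

    for t in range(iters):
        frac = t / iters
        w = 0.9 - 0.5 * frac            # inertia decays 0.9 -> 0.4
        c1 = 2.5 - 2.0 * frac           # cognitive coefficient decays
        c2 = 0.5 + 2.0 * frac           # social coefficient grows
        r1, r2 = rng.random((2, n_particles, dim))
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = pos + vel
        vals = np.apply_along_axis(objective, 1, pos)
        improved = vals < pbest_val
        pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
        gbest = pbest[pbest_val.argmin()].copy()
    return gbest, pbest_val.min()

best, best_val = adaptive_pso(lambda x: np.sum(x ** 2))   # sphere test function
print(best_val)
```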

Kernel Change-point Detection with Auxiliary Deep Generative Models

Title Kernel Change-point Detection with Auxiliary Deep Generative Models
Authors Wei-Cheng Chang, Chun-Liang Li, Yiming Yang, Barnabás Póczos
Abstract Detecting the emergence of abrupt property changes in time series is a challenging problem. The kernel two-sample test has been studied for this task, as it makes fewer assumptions about the distributions than traditional parametric approaches. However, selecting kernels is non-trivial in practice. Although kernel selection for the two-sample test has been studied, the insufficient samples in the change-point detection problem hinder the success of existing kernel selection algorithms. In this paper, we propose KL-CPD, a novel kernel learning framework for time series CPD that optimizes a lower bound of the test power via an auxiliary generative model. With deep kernel parameterization, KL-CPD endows the kernel two-sample test with a data-driven kernel to detect different types of change-points in real-world applications. The proposed approach significantly outperformed other state-of-the-art methods in our comparative evaluation on benchmark datasets and in simulation studies.
Tasks Change Point Detection, Time Series
Published 2019-01-18
URL http://arxiv.org/abs/1901.06077v1
PDF http://arxiv.org/pdf/1901.06077v1.pdf
PWC https://paperswithcode.com/paper/kernel-change-point-detection-with-auxiliary
Repo
Framework
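
The kernel two-sample statistic that KL-CPD builds on can be scanned over a series with a sliding window: compute an RBF-kernel MMD between the window of past points and the window of future points at each time step and look for peaks. The sketch below shows only that baseline statistic; the learned deep kernel and the auxiliary generative model are the paper's contribution and are not included.

```python
# Sliding-window RBF-kernel MMD score for change-point scanning. This is
# only the plain kernel two-sample statistic the abstract builds on; the
# learned deep kernel and auxiliary generative model are not included.
import numpy as np

def rbf_mmd2(X, Y, gamma=1.0):
    """Biased estimate of squared MMD between samples X and Y with an RBF kernel."""
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)
    return k(X, X).mean() + k(Y, Y).mean() - 2 * k(X, Y).mean()

rng = np.random.default_rng(1)
series = np.concatenate([rng.normal(0, 1, (200, 2)),      # regime 1
                         rng.normal(3, 1, (200, 2))])     # regime 2 after t = 200

window = 50
scores = [rbf_mmd2(series[t - window:t], series[t:t + window])
          for t in range(window, len(series) - window)]
print(window + int(np.argmax(scores)))   # peaks near the true change point (t ~ 200)
```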

Kinematic State Abstraction and Provably Efficient Rich-Observation Reinforcement Learning

Title Kinematic State Abstraction and Provably Efficient Rich-Observation Reinforcement Learning
Authors Dipendra Misra, Mikael Henaff, Akshay Krishnamurthy, John Langford
Abstract We present an algorithm, HOMER, for exploration and reinforcement learning in rich observation environments that are summarizable by an unknown latent state space. The algorithm interleaves representation learning to identify a new notion of kinematic state abstraction with strategic exploration to reach new states using the learned abstraction. The algorithm provably explores the environment with sample complexity scaling polynomially in the number of latent states and the time horizon, and, crucially, with no dependence on the size of the observation space, which could be infinitely large. This exploration guarantee further enables sample-efficient global policy optimization for any reward function. On the computational side, we show that the algorithm can be implemented efficiently whenever certain supervised learning problems are tractable. Empirically, we evaluate HOMER on a challenging exploration problem, where we show that the algorithm is exponentially more sample efficient than standard reinforcement learning baselines.
Tasks Representation Learning
Published 2019-11-13
URL https://arxiv.org/abs/1911.05815v1
PDF https://arxiv.org/pdf/1911.05815v1.pdf
PWC https://paperswithcode.com/paper/kinematic-state-abstraction-and-provably
Repo
Framework

SteReFo: Efficient Image Refocusing with Stereo Vision

Title SteReFo: Efficient Image Refocusing with Stereo Vision
Authors Benjamin Busam, Matthieu Hog, Steven McDonagh, Gregory Slabaugh
Abstract Whether to attract viewer attention to a particular object, give the impression of depth, or simply reproduce human-like scene perception, shallow depth-of-field images are used extensively by professional and amateur photographers alike. To this end, high-quality optical systems are used in DSLR cameras to focus on a specific depth plane while producing visually pleasing bokeh. We propose a physically motivated pipeline to mimic this effect from all-in-focus stereo images, typically retrieved by mobile cameras. It is capable of changing the focal plane a posteriori at 76 FPS on KITTI images, enabling real-time applications. As our portmanteau suggests, SteReFo interrelates stereo-based depth estimation and refocusing efficiently. In contrast to other approaches, our pipeline is simultaneously fully differentiable, physically motivated, and agnostic to scene content. It also enables computational video focus tracking for moving objects in addition to refocusing of static images. We evaluate our approach on the publicly available SceneFlow, KITTI, and CityScapes datasets and quantify the quality of architectural changes.
Tasks Depth Estimation
Published 2019-09-29
URL https://arxiv.org/abs/1909.13395v1
PDF https://arxiv.org/pdf/1909.13395v1.pdf
PWC https://paperswithcode.com/paper/sterefo-efficient-image-refocusing-with
Repo
Framework
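
The refocusing stage itself (as opposed to the stereo depth network) can be approximated with simple layered rendering: split the depth range into layers, blur each layer with a kernel whose size grows with its distance from the chosen focal plane, and composite. The layer count, Gaussian blur model, and toy depth map below are assumptions; this is not SteReFo's differentiable pipeline.

```python
# Layered depth-of-field rendering from an all-in-focus image plus a depth
# map: blur grows with distance from the focal plane. Layer count and the
# Gaussian blur model are simplifying assumptions.
import numpy as np
from scipy.ndimage import gaussian_filter

def refocus(image, depth, focal_depth, n_layers=8, max_sigma=6.0):
    edges = np.linspace(depth.min(), depth.max(), n_layers + 1)
    out = np.zeros_like(image)
    weight = np.zeros(depth.shape)
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (depth >= lo) & (depth <= hi)
        if not mask.any():
            continue
        center = 0.5 * (lo + hi)
        sigma = max_sigma * abs(center - focal_depth) / (depth.max() - depth.min() + 1e-8)
        blurred = gaussian_filter(image, sigma=(sigma, sigma, 0))   # channel axis unfiltered
        out += blurred * mask[..., None]
        weight += mask
    return out / np.maximum(weight, 1)[..., None]

rng = np.random.default_rng(0)
img = rng.random((96, 96, 3))
dep = np.tile(np.linspace(0.0, 1.0, 96), (96, 1))     # toy depth ramp
shallow = refocus(img, dep, focal_depth=0.2)          # focus on the near plane
```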

Unified Attentional Generative Adversarial Network for Brain Tumor Segmentation From Multimodal Unpaired Images

Title Unified Attentional Generative Adversarial Network for Brain Tumor Segmentation From Multimodal Unpaired Images
Authors Wenguang Yuan, Jia Wei, Jiabing Wang, Qianli Ma, Tolga Tasdizen
Abstract In medical applications, the same anatomical structures may be observed in multiple modalities despite the different image characteristics. Currently, most deep models for multimodal segmentation rely on paired registered images. However, multimodal paired registered images are difficult to obtain in many cases. Therefore, developing a model that can segment the target objects from different modalities with unpaired images is significant for many clinical applications. In this work, we propose a novel two-stream translation and segmentation unified attentional generative adversarial network (UAGAN), which can perform any-to-any image modality translation and segment the target objects simultaneously in the case where two or more modalities are available. The translation stream is used to capture modality-invariant features of the target anatomical structures. In addition, to focus on segmentation-related features, we add attentional blocks to extract valuable features from the translation stream. Experiments on three-modality brain tumor segmentation indicate that UAGAN outperforms the existing methods in most cases.
Tasks Brain Tumor Segmentation
Published 2019-07-08
URL https://arxiv.org/abs/1907.03548v1
PDF https://arxiv.org/pdf/1907.03548v1.pdf
PWC https://paperswithcode.com/paper/unified-attentional-generative-adversarial
Repo
Framework
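
One common way to realize the "attentional block" idea, a segmentation stream that reweights features pulled from a translation stream, is a small sigmoid gate as sketched below in PyTorch. The channel sizes and the concatenate-then-gate design are assumptions, not the exact UAGAN block.

```python
# Simple attentional gating block: the segmentation stream produces a
# spatial attention map that reweights features taken from the translation
# stream. Channel sizes and the sigmoid-gate design are assumptions.
import torch
import torch.nn as nn

class AttentionalBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, 1, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, seg_feat, trans_feat):
        attn = self.gate(torch.cat([seg_feat, trans_feat], dim=1))   # (B, 1, H, W)
        return seg_feat + attn * trans_feat    # pass gated translation features on

seg = torch.randn(2, 64, 32, 32)
trans = torch.randn(2, 64, 32, 32)
fused = AttentionalBlock(64)(seg, trans)       # -> torch.Size([2, 64, 32, 32])
```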

Efficient Surface-Aware Semi-Global Matching with Multi-View Plane-Sweep Sampling

Title Efficient Surface-Aware Semi-Global Matching with Multi-View Plane-Sweep Sampling
Authors Boitumelo Ruf, Thomas Pollok, Martin Weinmann
Abstract Online augmentation of an oblique aerial image sequence with structural information is an essential aspect of 3D scene interpretation and analysis. One key aspect in this is efficient dense image matching and depth estimation. Here, the Semi-Global Matching (SGM) approach has proven to be one of the most widely used algorithms for efficient depth estimation, providing a good trade-off between accuracy and computational complexity. However, SGM only models a first-order smoothness assumption, thus favoring fronto-parallel surfaces. In this work, we present a hierarchical algorithm that allows for efficient depth and normal map estimation together with confidence measures for each estimate. Our algorithm relies on plane-sweep multi-image matching followed by an extended SGM optimization that incorporates local surface orientations, thus achieving more consistent and accurate estimates in areas made up of slanted surfaces, which are inherent to oblique aerial imagery. We evaluate numerous configurations of our algorithm on two different datasets using absolute and relative accuracy measures. In our evaluation, we show that the results of our approach are comparable to those achieved by refined Structure-from-Motion (SfM) pipelines, such as COLMAP, which are designed for offline processing. In contrast, however, our approach only considers a confined image bundle of the input sequence, allowing for online and incremental computation at 1-2 Hz.
Tasks Depth Estimation
Published 2019-09-21
URL https://arxiv.org/abs/1909.09891v1
PDF https://arxiv.org/pdf/1909.09891v1.pdf
PWC https://paperswithcode.com/paper/190909891
Repo
Framework
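
The core of SGM is a per-path dynamic program over disparities. The NumPy sketch below aggregates matching costs along a single left-to-right scanline with the standard penalties P1 (disparity change of one) and P2 (larger jumps); the paper's plane-sweep sampling, surface-aware extension, and normal estimation are not included, and the penalty values are placeholders.

```python
# Semi-Global Matching cost aggregation along a single left-to-right path:
# L(p,d) = C(p,d) + min(L(p-1,d), L(p-1,d±1)+P1, min_d' L(p-1,d')+P2)
#          - min_d' L(p-1,d').
import numpy as np

def aggregate_path(cost, p1=10.0, p2=120.0):
    """cost: (width, n_disparities) matching cost along one scanline."""
    w, d = cost.shape
    agg = np.empty_like(cost)
    agg[0] = cost[0]
    for x in range(1, w):
        prev = agg[x - 1]
        prev_min = prev.min()
        same = prev
        shift_m = np.r_[prev[1:], np.inf] + p1      # disparity decreases by 1
        shift_p = np.r_[np.inf, prev[:-1]] + p1     # disparity increases by 1
        jump = np.full(d, prev_min + p2)            # any larger disparity jump
        agg[x] = cost[x] + np.minimum.reduce([same, shift_m, shift_p, jump]) - prev_min
    return agg

rng = np.random.default_rng(0)
costs = rng.random((128, 64))                       # toy per-pixel matching costs
disparity = aggregate_path(costs).argmin(axis=1)    # winner-take-all along the path
```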

Generative approach to unsupervised deep local learning

Title Generative approach to unsupervised deep local learning
Authors Changlu Chen, Chaoxi Niu, Xia Zhan, Kun Zhan
Abstract Most existing feature learning methods optimize inflexible handcrafted features, and the affinity matrix is constructed by shallow linear embedding methods. Different from these conventional methods, we pretrain a generative neural network by stacking convolutional autoencoders to learn the latent data representation and then construct an affinity graph from these representations as a prior. Based on the pretrained model and the constructed graph, we add a self-expressive layer to complete the generative model and then fine-tune it with a new loss function that includes the reconstruction loss and a deliberately defined locality-preserving loss. The locality-preserving loss, defined by the constructed affinity graph, serves as a prior to preserve the local structure during the fine-tuning stage, which in turn improves the quality of the feature representation. Furthermore, the self-expressive layer between the encoder and decoder is based on the assumption that each latent feature is a linear combination of the other latent features, so the weighted combination coefficients of the self-expressive layer are used to construct a new, refined affinity graph that represents the data structure. We conduct experiments on four datasets to demonstrate that the representation ability of our proposed model is superior to that of state-of-the-art methods.
Tasks
Published 2019-06-19
URL https://arxiv.org/abs/1906.07947v2
PDF https://arxiv.org/pdf/1906.07947v2.pdf
PWC https://paperswithcode.com/paper/a-generative-approach-to-unsupervised-deep
Repo
Framework
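
The self-expressive layer mentioned in the abstract is a trainable coefficient matrix C with zero diagonal such that the latent codes satisfy Z ≈ CZ. The PyTorch sketch below trains such a layer with a reconstruction-plus-sparsity loss and reads off an affinity graph from |C|; the convolutional autoencoder, the pretrained graph, and the locality-preserving loss are left out, and the loss weights are assumptions.

```python
# Self-expressive layer: learn C with zero diagonal so that Z ~= C @ Z,
# i.e. each latent code is a linear combination of the other codes.
import torch
import torch.nn as nn

class SelfExpressive(nn.Module):
    def __init__(self, n_samples):
        super().__init__()
        self.coef = nn.Parameter(1e-4 * torch.randn(n_samples, n_samples))

    def forward(self, z):                       # z: (n_samples, latent_dim)
        c = self.coef - torch.diag(torch.diag(self.coef))   # forbid trivial self-reconstruction
        return c @ z, c

z = torch.randn(100, 32)                        # stand-in for encoder outputs
layer = SelfExpressive(100)
opt = torch.optim.Adam(layer.parameters(), lr=1e-3)
for _ in range(200):
    z_hat, c = layer(z)
    loss = ((z_hat - z) ** 2).mean() + 1e-3 * c.abs().sum()   # self-expression + sparsity
    opt.zero_grad()
    loss.backward()
    opt.step()

# |C| + |C|^T can then serve as a refined affinity graph over the samples.
affinity = (c.abs() + c.abs().T).detach()
```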

RES-SE-NET: Boosting Performance of Resnets by Enhancing Bridge-connections

Title RES-SE-NET: Boosting Performance of Resnets by Enhancing Bridge-connections
Authors Varshaneya V, Balasubramanian S, Darshan Gera
Abstract One of the ways to train deep neural networks effectively is to use residual connections. Residual connections can be classified as being either identity connections or bridge-connections with a reshaping convolution. Empirical observations on CIFAR-10 and CIFAR-100 datasets using a baseline Resnet model, with bridge-connections removed, have shown a significant reduction in accuracy. This reduction is due to lack of contribution, in the form of feature maps, by the bridge-connections. Hence bridge-connections are vital for Resnet. However, all feature maps in the bridge-connections are considered to be equally important. In this work, an upgraded architecture “Res-SE-Net” is proposed to further strengthen the contribution from the bridge-connections by quantifying the importance of each feature map and weighting them accordingly using Squeeze-and-Excitation (SE) block. It is demonstrated that Res-SE-Net generalizes much better than Resnet and SE-Resnet on the benchmark CIFAR-10 and CIFAR-100 datasets.
Tasks
Published 2019-02-16
URL http://arxiv.org/abs/1902.06066v1
PDF http://arxiv.org/pdf/1902.06066v1.pdf
PWC https://paperswithcode.com/paper/res-se-net-boosting-performance-of-resnets-by
Repo
Framework
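
The architectural idea, applying a squeeze-and-excitation block to the reshaping bridge-connection of a residual block, can be sketched in PyTorch as below. The reduction ratio, layer ordering, and placement of batch normalization are assumptions rather than the paper's exact Res-SE-Net block.

```python
# Residual block whose bridge-connection (1x1 reshaping shortcut) is
# reweighted by a squeeze-and-excitation block. Reduction ratio and the
# exact layer arrangement are assumptions.
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):
        w = self.fc(x.mean(dim=(2, 3)))          # squeeze: global average pool
        return x * w[:, :, None, None]           # excite: per-channel reweighting

class SEBridgeResBlock(nn.Module):
    def __init__(self, in_ch, out_ch, stride=2):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, stride, 1), nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, 1, 1), nn.BatchNorm2d(out_ch))
        self.bridge = nn.Sequential(              # reshaping bridge-connection
            nn.Conv2d(in_ch, out_ch, 1, stride), nn.BatchNorm2d(out_ch))
        self.se = SEBlock(out_ch)                 # weight the bridge feature maps
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(self.body(x) + self.se(self.bridge(x)))

out = SEBridgeResBlock(64, 128)(torch.randn(2, 64, 32, 32))   # -> (2, 128, 16, 16)
```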

Neural Network-Inspired Analog-to-Digital Conversion to Achieve Super-Resolution with Low-Precision RRAM Devices

Title Neural Network-Inspired Analog-to-Digital Conversion to Achieve Super-Resolution with Low-Precision RRAM Devices
Authors Weidong Cao, Liu Ke, Ayan Chakrabarti, Xuan Zhang
Abstract Recent works propose neural network- (NN-) inspired analog-to-digital converters (NNADCs) and demonstrate their great potential in many emerging applications. These NNADCs often rely on resistive random-access memory (RRAM) devices to realize the NN operations and require high-precision RRAM cells (6~12-bit) to achieve a moderate quantization resolution (4~8-bit). Such an optimistic assumption about RRAM resolution, however, is not supported by fabrication data of RRAM arrays in large-scale production processes. In this paper, we propose an NN-inspired super-resolution ADC based on low-precision RRAM devices by taking advantage of a co-design methodology that combines a pipelined hardware architecture with a custom NN training framework. Results obtained from SPICE simulations demonstrate that our method leads to a robust design of a 14-bit super-resolution ADC using 3-bit RRAM devices with improved power and speed performance and competitive figures of merit (FoMs). In addition to linear uniform quantization, the proposed ADC can also support configurable high-resolution nonlinear quantization with high conversion speed and low conversion energy, enabling future intelligent analog-to-information interfaces for near-sensor analytics and processing.
Tasks Quantization, Super-Resolution
Published 2019-11-28
URL https://arxiv.org/abs/1911.12815v1
PDF https://arxiv.org/pdf/1911.12815v1.pdf
PWC https://paperswithcode.com/paper/neural-network-inspired-analog-to-digital
Repo
Framework
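
The pipelined sub-ranging principle behind building a high-resolution converter out of low-resolution stages is easy to simulate: each stage quantizes a few bits and hands an amplified residue to the next stage. The plain-Python sketch below uses ideal 3-bit stages and illustrates only this principle; the RRAM-based NN stages, their co-designed training, and the exact 14-bit configuration from the paper are not modeled.

```python
# Generic pipelined sub-ranging conversion: a chain of coarse (here 3-bit)
# stages, each quantizing its input and passing an amplified residue to the
# next stage. Only the pipelined-ADC principle is illustrated.

def pipelined_adc(x, stage_bits=3, n_stages=5, vref=1.0):
    """Convert x in [0, vref) to a (stage_bits * n_stages)-bit code."""
    code = 0
    residue = x
    for _ in range(n_stages):
        levels = 2 ** stage_bits
        d = min(int(residue / vref * levels), levels - 1)   # coarse quantization
        code = (code << stage_bits) | d
        residue = (residue - d * vref / levels) * levels    # amplified residue
    return code

total_bits = 15                       # 5 ideal stages x 3 bits (illustrative)
x = 0.3217
code = pipelined_adc(x)
print(code, code / 2 ** total_bits)   # reconstructed value close to 0.3217
```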

Using Physics-Informed Super-Resolution Generative Adversarial Networks for Subgrid Modeling in Turbulent Reactive Flows

Title Using Physics-Informed Super-Resolution Generative Adversarial Networks for Subgrid Modeling in Turbulent Reactive Flows
Authors Mathis Bode, Michael Gauding, Zeyu Lian, Dominik Denker, Marco Davidovic, Konstantin Kleinheinz, Jenia Jitsev, Heinz Pitsch
Abstract Turbulence is still one of the main challenges for accurately predicting reactive flows. Therefore, the development of new turbulence closures that can be applied to combustion problems is essential. Data-driven modeling has become very popular in many fields in recent years as large, often extensively labeled, datasets have become available and training of large neural networks on GPUs has sped up the learning process tremendously. However, the successful application of deep neural networks in fluid dynamics, for example for subgrid modeling in the context of large-eddy simulations (LESs), is still challenging. Reasons for this are the large number of degrees of freedom in realistic flows, the high requirements with respect to accuracy and error robustness, and open questions, such as the generalization capability of trained neural networks in such high-dimensional, physics-constrained scenarios. This work presents a novel subgrid modeling approach based on a generative adversarial network (GAN), which is trained with unsupervised deep learning (DL) using adversarial and physics-informed losses. A two-step training method is used to improve the generalization capability, especially extrapolation, of the network. The novel approach gives good results in a priori as well as a posteriori tests with decaying turbulence including turbulent mixing. The applicability of the network in complex combustion scenarios is furthermore demonstrated by applying it to a reactive LES of the Spray A case defined by the Engine Combustion Network (ECN).
Tasks Super-Resolution
Published 2019-11-26
URL https://arxiv.org/abs/1911.11380v1
PDF https://arxiv.org/pdf/1911.11380v1.pdf
PWC https://paperswithcode.com/paper/using-physics-informed-super-resolution
Repo
Framework
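
A hedged sketch of what an "adversarial plus physics-informed" generator loss can look like for super-resolving a 2-D velocity field is given below: a standard GAN term plus a finite-difference divergence penalty standing in for a continuity-style constraint. The toy networks, the choice of physics penalty, and the loss weight are assumptions; the paper's actual losses and its two-step training procedure are not reproduced, and only the generator update is shown.

```python
# Sketch of a physics-informed GAN loss for super-resolving a 2-D velocity
# field: adversarial term plus a finite-difference divergence penalty as a
# stand-in physics constraint. Networks and weights are placeholders.
import torch
import torch.nn as nn

G = nn.Sequential(nn.Upsample(scale_factor=4, mode="bilinear", align_corners=False),
                  nn.Conv2d(2, 2, 3, padding=1))               # toy generator
D = nn.Sequential(nn.Conv2d(2, 8, 3, stride=2, padding=1), nn.LeakyReLU(0.2),
                  nn.Flatten(), nn.LazyLinear(1))              # toy discriminator

def divergence(v):                       # v: (B, 2, H, W) velocity field
    du_dx = v[:, 0, :, 1:] - v[:, 0, :, :-1]
    dv_dy = v[:, 1, 1:, :] - v[:, 1, :-1, :]
    return (du_dx[:, 1:, :] + dv_dy[:, :, 1:]).pow(2).mean()

lowres = torch.randn(4, 2, 16, 16)
fake = G(lowres)
adv_loss = nn.functional.binary_cross_entropy_with_logits(
    D(fake), torch.ones(fake.size(0), 1))                      # fool the discriminator
phys_loss = divergence(fake)                                   # continuity-style penalty
gen_loss = adv_loss + 0.1 * phys_loss                          # only the generator loss is sketched
gen_loss.backward()
```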

Functional Regularisation for Continual Learning with Gaussian Processes

Title Functional Regularisation for Continual Learning with Gaussian Processes
Authors Michalis K. Titsias, Jonathan Schwarz, Alexander G. de G. Matthews, Razvan Pascanu, Yee Whye Teh
Abstract We introduce a framework for Continual Learning (CL) based on Bayesian inference over the function space rather than the parameters of a deep neural network. This method, referred to as functional regularisation for Continual Learning, avoids forgetting a previous task by constructing and memorising an approximate posterior belief over the underlying task-specific function. To achieve this we rely on a Gaussian process obtained by treating the weights of the last layer of a neural network as random and Gaussian distributed. Then, the training algorithm sequentially encounters tasks and constructs posterior beliefs over the task-specific functions by using inducing point sparse Gaussian process methods. At each step a new task is first learnt and then a summary is constructed consisting of (i) inducing inputs – a fixed-size subset of the task inputs selected such that it optimally represents the task – and (ii) a posterior distribution over the function values at these inputs. This summary then regularises learning of future tasks, through Kullback-Leibler regularisation terms. Our method thus unites approaches focused on (pseudo-)rehearsal with those derived from a sequential Bayesian inference perspective in a principled way, leading to strong results on accepted benchmarks.
Tasks Bayesian Inference, Continual Learning, Gaussian Processes, Omniglot
Published 2019-01-31
URL https://arxiv.org/abs/1901.11356v4
PDF https://arxiv.org/pdf/1901.11356v4.pdf
PWC https://paperswithcode.com/paper/functional-regularisation-for-continual
Repo
Framework
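
The functional-regularisation idea can be miniaturized as follows: after a task, store a Gaussian summary of the function values at a few inducing inputs; while training the next task, add a KL term pulling the current distribution over those function values toward the stored summary. The sketch below builds that distribution from an assumed Bayesian last layer with a diagonal weight posterior; the sparse-GP machinery, inducing-point selection, and multi-task loop of the paper are not reproduced.

```python
# Sketch of functional regularisation at inducing points: keep a Gaussian
# summary of a past task's function values at a few inducing inputs and add
# a KL term pulling the current predictive distribution (from an assumed
# Bayesian last layer) toward that summary.
import torch
from torch.distributions import MultivariateNormal, kl_divergence

torch.manual_seed(0)
feature_net = torch.nn.Sequential(torch.nn.Linear(4, 16), torch.nn.Tanh())
w_mean = torch.zeros(16, requires_grad=True)          # Bayesian last-layer mean
w_log_std = torch.zeros(16, requires_grad=True)       # diagonal posterior over weights

def function_posterior(x):
    """Gaussian over f(x) induced by the last-layer weight posterior."""
    phi = feature_net(x)                               # (n, 16)
    cov = phi @ torch.diag(w_log_std.exp() ** 2) @ phi.T
    return MultivariateNormal(phi @ w_mean, cov + 1e-4 * torch.eye(len(x)))

inducing_x = torch.randn(5, 4)                         # summary inputs from task 1
with torch.no_grad():
    task1_summary = function_posterior(inducing_x)     # frozen posterior belief

# While training task 2, regularise toward the stored functional summary.
task2_x, task2_y = torch.randn(32, 4), torch.randn(32)
pred = feature_net(task2_x) @ w_mean
loss = ((pred - task2_y) ** 2).mean() \
       + kl_divergence(function_posterior(inducing_x), task1_summary)
loss.backward()
```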