January 31, 2020

3185 words 15 mins read

Paper Group ANR 32

Scale Matters: Temporal Scale Aggregation Network for Precise Action Localization in Untrimmed Videos

Title Scale Matters: Temporal Scale Aggregation Network for Precise Action Localization in Untrimmed Videos
Authors Guoqiang Gong, Liangfeng Zheng, Kun Bai, Yadong Mu
Abstract Temporal action localization is a recently emerging task, aiming to localize video segments that contain specific actions from untrimmed videos. Despite remarkable recent progress, most two-stage action localization methods still suffer from imprecise temporal boundaries of action proposals. This work proposes a novel integrated temporal scale aggregation network (TSA-Net). Our main insight is that ensembling convolution filters with different dilation rates can effectively enlarge the receptive field at low computational cost, which inspires us to devise a multi-dilation temporal convolution (MDC) block. Furthermore, to tackle video action instances of different durations, TSA-Net consists of multiple branches of sub-networks. Each of them adopts stacked MDC blocks with different dilation parameters, yielding a temporal receptive field specially optimized for actions of a specific duration. We follow the formulation of boundary point detection, detecting three kinds of critical points (i.e., starting / mid-point / ending) and pairing them for proposal generation. Comprehensive evaluations are conducted on two challenging video benchmarks, THUMOS14 and ActivityNet-1.3. Our proposed TSA-Net demonstrates clearly and consistently better performance and sets a new state of the art on both benchmarks. For example, our new record on THUMOS14 is 46.9% under mAP@0.5, while the previous best is 42.8%.
Tasks Action Localization, Temporal Action Localization
Published 2019-08-02
URL https://arxiv.org/abs/1908.00707v1
PDF https://arxiv.org/pdf/1908.00707v1.pdf
PWC https://paperswithcode.com/paper/scale-matters-temporal-scale-aggregation
Repo
Framework
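
For readers who want a concrete picture of the multi-dilation idea, here is a minimal PyTorch sketch of an MDC-style block: parallel 1-D convolutions over the temporal axis with different dilation rates, fused here by summation. The channel count, kernel size, dilation set, and sum fusion are illustrative assumptions, not the authors' exact design (no repository is linked above).

```python
# Hedged sketch of a multi-dilation temporal convolution (MDC) block.
# The parallel-dilation structure follows the TSA-Net abstract; channel
# counts, kernel size and the sum fusion are illustrative assumptions.
import torch
import torch.nn as nn

class MDCBlock(nn.Module):
    def __init__(self, channels, kernel_size=3, dilations=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv1d(channels, channels, kernel_size,
                      padding=d * (kernel_size - 1) // 2, dilation=d)
            for d in dilations
        ])
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):            # x: (batch, channels, time)
        # Sum the branches: same temporal length, enlarged receptive field.
        return self.relu(sum(branch(x) for branch in self.branches))

feats = torch.randn(2, 256, 100)     # e.g. snippet-level video features
out = MDCBlock(256)(feats)           # -> torch.Size([2, 256, 100])
```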

Superpixel-based Color Transfer

Title Superpixel-based Color Transfer
Authors Rémi Giraud, Vinh-Thong Ta, Nicolas Papadakis
Abstract In this work, we propose a fast superpixel-based color transfer method (SCT) between two images. Superpixels make it possible to reduce the image dimension and to extract a reduced set of color candidates. We propose to use a fast approximate nearest neighbor matching algorithm in which we enforce match diversity by limiting repeated selection of the same superpixels. A fusion framework is designed to transfer the matched colors, and we demonstrate the improvement obtained over exact matching results. Finally, we show that SCT is visually competitive with state-of-the-art methods.
Tasks
Published 2019-03-14
URL http://arxiv.org/abs/1903.06010v1
PDF http://arxiv.org/pdf/1903.06010v1.pdf
PWC https://paperswithcode.com/paper/superpixel-based-color-transfer
Repo
Framework
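
The general recipe behind superpixel-based color transfer is easy to prototype. The sketch below segments both images with SLIC, describes each superpixel by its mean color, matches target superpixels to source superpixels with a KD-tree nearest-neighbour query, and shifts colors accordingly; the paper's match-diversity constraint and fusion framework are deliberately omitted, and all parameters are placeholders.

```python
# Loose sketch of superpixel-based color transfer: SLIC superpixels,
# mean-color descriptors, nearest-neighbour matching. The paper's match
# diversity limit and fusion framework are not reproduced here.
import numpy as np
from skimage.segmentation import slic
from scipy.spatial import cKDTree

def superpixel_color_transfer(target, source, n_segments=300):
    """Recolor `target` using superpixel mean colors from `source` (float RGB in [0, 1])."""
    seg_t = slic(target, n_segments=n_segments, start_label=0)
    seg_s = slic(source, n_segments=n_segments, start_label=0)

    labels_t, labels_s = np.unique(seg_t), np.unique(seg_s)
    means_t = np.array([target[seg_t == i].mean(axis=0) for i in labels_t])
    means_s = np.array([source[seg_s == i].mean(axis=0) for i in labels_s])

    # Nearest-neighbour match of target superpixels to source superpixels.
    _, match = cKDTree(means_s).query(means_t)

    # Shift each target superpixel toward its matched source mean color.
    out = target.copy()
    for k, i in enumerate(labels_t):
        out[seg_t == i] += means_s[match[k]] - means_t[k]
    return np.clip(out, 0.0, 1.0)

rng = np.random.default_rng(0)
result = superpixel_color_transfer(rng.random((64, 64, 3)), rng.random((64, 64, 3)))
```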

Patent Claim Generation by Fine-Tuning OpenAI GPT-2

Title Patent Claim Generation by Fine-Tuning OpenAI GPT-2
Authors Jieh-Sheng Lee, Jieh Hsiang
Abstract In this work, we focus on fine-tuning an OpenAI GPT-2 pre-trained model for generating patent claims. GPT-2 has demonstrated impressive efficacy of pre-trained language models on various tasks, particularly coherent text generation. Patent claim language itself has rarely been explored in the past and poses a unique challenge. We are motivated to generate coherent patent claims automatically so that augmented inventing might be viable someday. In our implementation, we identified a unique language structure in patent claims and leveraged its implicit human annotations. We investigated the fine-tuning process by probing the first 100 steps and observing the generated text at each step. Based on both conditional and unconditional random sampling, we analyze the overall quality of generated patent claims. Our contributions include: (1) being the first to generate patent claims by machines and being the first to apply GPT-2 to patent claim generation, (2) providing various experiment results for qualitative analysis and future research, (3) proposing a new sampling approach for text generation, and (4) building an e-mail bot for future researchers to explore the fine-tuned GPT-2 model further.
Tasks Text Generation
Published 2019-07-01
URL https://arxiv.org/abs/1907.02052v1
PDF https://arxiv.org/pdf/1907.02052v1.pdf
PWC https://paperswithcode.com/paper/patent-claim-generation-by-fine-tuning-openai
Repo
Framework
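
Fine-tuning GPT-2 on domain text is straightforward with the Hugging Face transformers library. The sketch below shows the generic recipe (causal language-modeling loss on claim strings, then conditional sampling); the tiny in-memory corpus, learning rate, and sampling settings are placeholders, and the paper's own data preparation, claim-span structure, and proposed sampling approach are not reproduced.

```python
# Minimal sketch of fine-tuning GPT-2 on patent-claim text with the
# Hugging Face `transformers` library. The tiny in-memory corpus, learning
# rate and number of steps are placeholders, not the authors' setup.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

claims = [
    "1. A device comprising: a sensor configured to measure temperature; ...",
    "2. The device of claim 1, wherein the sensor is a thermocouple.",
]

model.train()
for text in claims:
    batch = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    # Causal LM loss: labels are the input ids themselves (shifted internally).
    loss = model(**batch, labels=batch["input_ids"]).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

# Conditional sampling from the fine-tuned model.
prompt = tokenizer("1. A method for", return_tensors="pt")
sample = model.generate(**prompt, do_sample=True, max_length=60, top_k=50,
                        pad_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(sample[0]))
```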

PD-ML-Lite: Private Distributed Machine Learning from Lightweight Cryptography

Title PD-ML-Lite: Private Distributed Machine Learning from Lightweight Cryptography
Authors Maksim Tsikhanovich, Malik Magdon-Ismail, Muhammad Ishaq, Vassilis Zikas
Abstract Privacy is a major issue in learning from distributed data. Recently the cryptographic literature has provided several tools for this task. However, these tools either reduce the quality/accuracy of the learning algorithm (e.g., by adding noise) or they incur a high performance penalty and/or involve trusting external authorities. We propose a methodology for private distributed machine learning from lightweight cryptography (in short, PD-ML-Lite). We apply our methodology to two major ML algorithms, namely non-negative matrix factorization (NMF) and singular value decomposition (SVD). Our resulting protocols are communication optimal, achieve the same accuracy as their non-private counterparts, and satisfy a notion of privacy (which we define) that is both intuitive and measurable. Our approach is to use lightweight cryptographic protocols (secure sum and normalized secure sum) to build learning algorithms, rather than to wrap complex learning algorithms in a heavyweight MPC framework. We showcase our algorithms’ utility and privacy on several applications: for NMF we consider topic modeling and recommender systems, and for SVD, principal component regression and low rank approximation.
Tasks Recommendation Systems
Published 2019-01-23
URL http://arxiv.org/abs/1901.07986v2
PDF http://arxiv.org/pdf/1901.07986v2.pdf
PWC https://paperswithcode.com/paper/pd-ml-lite-private-distributed-machine
Repo
Framework
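
Secure sum, one of the lightweight primitives the abstract names, can be illustrated with plain additive secret sharing: each party splits its private value into random shares modulo a large ring, the shares are exchanged, and only the blinded partial sums (and hence the total) are ever revealed. The modulus and the single-process simulation below are illustrative; this is not the full PD-ML-Lite protocol.

```python
# Toy additive-secret-sharing secure sum: each party splits its private
# value into random shares, parties exchange shares, and only the total
# is revealed. The modulus and the in-process "parties" are illustrative.
import random

MOD = 2 ** 32
random.seed(7)

def make_shares(value, n_parties):
    """Split an integer into n additive shares modulo MOD."""
    shares = [random.randrange(MOD) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MOD)
    return shares

private_values = [13, 29, 58]                    # one secret per party
all_shares = [make_shares(v, len(private_values)) for v in private_values]

# Party j locally sums the j-th share it received from every party;
# only these blinded partial sums are ever published.
partials = [sum(s[j] for s in all_shares) % MOD for j in range(len(private_values))]
print(sum(partials) % MOD)                       # -> 100, individual values stay hidden
```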

Adaptive Neuro Particle Swarm Optimization applied for diagnosing disorders

Title Adaptive Neuro Particle Swarm Optimization applied for diagnosing disorders
Authors Majid Masoumi, Mina Rajabi
Abstract A new Adaptive Neuro Particle Swarm Optimization (ANPSO) combined with a fuzzy inference system for diagnosing disorders is presented in this paper. The main contribution of the proposed method is a global search across the whole search space with a faster convergence rate. Moreover, it achieves better exploration and exploitation by applying adaptive control parameters: automatic control of the inertia weight and of the personal and social behaviour coefficients. These attributes lead to a fast and smart diagnosis mechanism that is able to diagnose diseases with high accuracy. ANPSO tunes the characteristics of the inference system to minimize the diagnosis error until the optimized model is obtained. As a case study, we use the BUPA liver disorders dataset. According to the preliminary results, the suggested adaptive PSO substantially outperforms the traditional inference system combined with other optimization methods.
Tasks
Published 2019-10-14
URL https://arxiv.org/abs/1910.14021v1
PDF https://arxiv.org/pdf/1910.14021v1.pdf
PWC https://paperswithcode.com/paper/adaptive-neuro-particle-swarm-optimization
Repo
Framework
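
The mechanism the abstract alludes to, particle swarm optimization with an adaptive inertia weight and time-varying personal/social coefficients, looks roughly like the NumPy sketch below. The decay schedules, bounds, and sphere objective are assumptions for illustration; the fuzzy inference system and the diagnosis pipeline are not modeled.

```python
# PSO with adaptive inertia weight and time-varying acceleration
# coefficients (a generic illustration of the mechanism; schedules,
# bounds and the sphere objective are assumptions).
import numpy as np

def adaptive_pso(objective, dim=5, n_particles=30, iters=200, seed=0):
    rng = np.random.default_rng(seed)
    pos = rng.uniform(-5, 5, (n_particles, dim))
    vel = np.zeros((n_particles, dim))
    pbest, pbest_val = pos.copy(), np.apply_along_axis(objective, 1, pos)
    gbest = pbest[pbest_val.argmin()].copy()

    for t in range(iters):
        frac = t / iters
        w = 0.9 - 0.5 * frac            # inertia decays 0.9 -> 0.4
        c1 = 2.5 - 2.0 * frac           # cognitive coefficient decays
        c2 = 0.5 + 2.0 * frac           # social coefficient grows
        r1, r2 = rng.random((2, n_particles, dim))
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = pos + vel
        vals = np.apply_along_axis(objective, 1, pos)
        improved = vals < pbest_val
        pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
        gbest = pbest[pbest_val.argmin()].copy()
    return gbest, pbest_val.min()

best, best_val = adaptive_pso(lambda x: np.sum(x ** 2))   # sphere test function
print(best_val)
```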

Kernel Change-point Detection with Auxiliary Deep Generative Models

Title Kernel Change-point Detection with Auxiliary Deep Generative Models
Authors Wei-Cheng Chang, Chun-Liang Li, Yiming Yang, Barnabás Póczos
Abstract Detecting the emergence of abrupt property changes in time series is a challenging problem. The kernel two-sample test has been studied for this task, as it makes fewer assumptions about the distributions than traditional parametric approaches. However, selecting kernels is non-trivial in practice. Although kernel selection for the two-sample test has been studied, the insufficient samples in the change-point detection problem hinder the success of existing kernel selection algorithms. In this paper, we propose KL-CPD, a novel kernel learning framework for time series CPD that optimizes a lower bound of the test power via an auxiliary generative model. With deep kernel parameterization, KL-CPD endows the kernel two-sample test with a data-driven kernel to detect different types of change-points in real-world applications. The proposed approach significantly outperformed other state-of-the-art methods in our comparative evaluation on benchmark datasets and in simulation studies.
Tasks Change Point Detection, Time Series
Published 2019-01-18
URL http://arxiv.org/abs/1901.06077v1
PDF http://arxiv.org/pdf/1901.06077v1.pdf
PWC https://paperswithcode.com/paper/kernel-change-point-detection-with-auxiliary
Repo
Framework
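
The kernel two-sample statistic that KL-CPD builds on can be scanned over a series with a sliding window: compute an RBF-kernel MMD between the window of past points and the window of future points at each time step and look for peaks. The sketch below shows only that baseline statistic; the learned deep kernel and the auxiliary generative model are the paper's contribution and are not included.

```python
# Sliding-window RBF-kernel MMD score for change-point scanning. This is
# only the plain kernel two-sample statistic the abstract builds on; the
# learned deep kernel and auxiliary generative model are not included.
import numpy as np

def rbf_mmd2(X, Y, gamma=1.0):
    """Biased estimate of squared MMD between samples X and Y with an RBF kernel."""
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)
    return k(X, X).mean() + k(Y, Y).mean() - 2 * k(X, Y).mean()

rng = np.random.default_rng(1)
series = np.concatenate([rng.normal(0, 1, (200, 2)),      # regime 1
                         rng.normal(3, 1, (200, 2))])     # regime 2 after t = 200

window = 50
scores = [rbf_mmd2(series[t - window:t], series[t:t + window])
          for t in range(window, len(series) - window)]
print(window + int(np.argmax(scores)))   # peaks near the true change point (t ~ 200)
```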

Kinematic State Abstraction and Provably Efficient Rich-Observation Reinforcement Learning

Title Kinematic State Abstraction and Provably Efficient Rich-Observation Reinforcement Learning
Authors Dipendra Misra, Mikael Henaff, Akshay Krishnamurthy, John Langford
Abstract We present an algorithm, HOMER, for exploration and reinforcement learning in rich observation environments that are summarizable by an unknown latent state space. The algorithm interleaves representation learning to identify a new notion of kinematic state abstraction with strategic exploration to reach new states using the learned abstraction. The algorithm provably explores the environment with sample complexity scaling polynomially in the number of latent states and the time horizon, and, crucially, with no dependence on the size of the observation space, which could be infinitely large. This exploration guarantee further enables sample-efficient global policy optimization for any reward function. On the computational side, we show that the algorithm can be implemented efficiently whenever certain supervised learning problems are tractable. Empirically, we evaluate HOMER on a challenging exploration problem, where we show that the algorithm is exponentially more sample efficient than standard reinforcement learning baselines.
Tasks Representation Learning
Published 2019-11-13
URL https://arxiv.org/abs/1911.05815v1
PDF https://arxiv.org/pdf/1911.05815v1.pdf
PWC https://paperswithcode.com/paper/kinematic-state-abstraction-and-provably
Repo
Framework

SteReFo: Efficient Image Refocusing with Stereo Vision

Title SteReFo: Efficient Image Refocusing with Stereo Vision
Authors Benjamin Busam, Matthieu Hog, Steven McDonagh, Gregory Slabaugh
Abstract Whether to attract viewer attention to a particular object, give the impression of depth, or simply reproduce human-like scene perception, shallow depth-of-field images are used extensively by professional and amateur photographers alike. To this end, high-quality optical systems are used in DSLR cameras to focus on a specific depth plane while producing visually pleasing bokeh. We propose a physically motivated pipeline to mimic this effect from all-in-focus stereo images, typically retrieved by mobile cameras. It is capable of changing the focal plane a posteriori at 76 FPS on KITTI images, enabling real-time applications. As our portmanteau suggests, SteReFo interrelates stereo-based depth estimation and refocusing efficiently. In contrast to other approaches, our pipeline is simultaneously fully differentiable, physically motivated, and agnostic to scene content. It also enables computational video focus tracking for moving objects in addition to refocusing of static images. We evaluate our approach on the publicly available SceneFlow, KITTI, and CityScapes datasets and quantify the quality of architectural changes.
Tasks Depth Estimation
Published 2019-09-29
URL https://arxiv.org/abs/1909.13395v1
PDF https://arxiv.org/pdf/1909.13395v1.pdf
PWC https://paperswithcode.com/paper/sterefo-efficient-image-refocusing-with
Repo
Framework
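
The refocusing stage itself (as opposed to the stereo depth network) can be approximated with simple layered rendering: split the depth range into layers, blur each layer with a kernel whose size grows with its distance from the chosen focal plane, and composite. The layer count, Gaussian blur model, and toy depth map below are assumptions; this is not SteReFo's differentiable pipeline.

```python
# Layered depth-of-field rendering from an all-in-focus image plus a depth
# map: blur grows with distance from the focal plane. Layer count and the
# Gaussian blur model are simplifying assumptions.
import numpy as np
from scipy.ndimage import gaussian_filter

def refocus(image, depth, focal_depth, n_layers=8, max_sigma=6.0):
    edges = np.linspace(depth.min(), depth.max(), n_layers + 1)
    out = np.zeros_like(image)
    weight = np.zeros(depth.shape)
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (depth >= lo) & (depth <= hi)
        if not mask.any():
            continue
        center = 0.5 * (lo + hi)
        sigma = max_sigma * abs(center - focal_depth) / (depth.max() - depth.min() + 1e-8)
        blurred = gaussian_filter(image, sigma=(sigma, sigma, 0))   # channel axis unfiltered
        out += blurred * mask[..., None]
        weight += mask
    return out / np.maximum(weight, 1)[..., None]

rng = np.random.default_rng(0)
img = rng.random((96, 96, 3))
dep = np.tile(np.linspace(0.0, 1.0, 96), (96, 1))     # toy depth ramp
shallow = refocus(img, dep, focal_depth=0.2)          # focus on the near plane
```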

Unified Attentional Generative Adversarial Network for Brain Tumor Segmentation From Multimodal Unpaired Images

Title Unified Attentional Generative Adversarial Network for Brain Tumor Segmentation From Multimodal Unpaired Images
Authors Wenguang Yuan, Jia Wei, Jiabing Wang, Qianli Ma, Tolga Tasdizen
Abstract In medical applications, the same anatomical structures may be observed in multiple modalities despite the different image characteristics. Currently, most deep models for multimodal segmentation rely on paired registered images. However, multimodal paired registered images are difficult to obtain in many cases. Therefore, developing a model that can segment the target objects from different modalities with unpaired images is significant for many clinical applications. In this work, we propose a novel two-stream translation and segmentation unified attentional generative adversarial network (UAGAN), which can perform any-to-any image modality translation and segment the target objects simultaneously in the case where two or more modalities are available. The translation stream is used to capture modality-invariant features of the target anatomical structures. In addition, to focus on segmentation-related features, we add attentional blocks to extract valuable features from the translation stream. Experiments on three-modality brain tumor segmentation indicate that UAGAN outperforms the existing methods in most cases.
Tasks Brain Tumor Segmentation
Published 2019-07-08
URL https://arxiv.org/abs/1907.03548v1
PDF https://arxiv.org/pdf/1907.03548v1.pdf
PWC https://paperswithcode.com/paper/unified-attentional-generative-adversarial
Repo
Framework
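
One common way to realize the "attentional block" idea, a segmentation stream that reweights features pulled from a translation stream, is a small sigmoid gate as sketched below in PyTorch. The channel sizes and the concatenate-then-gate design are assumptions, not the exact UAGAN block.

```python
# Simple attentional gating block: the segmentation stream produces a
# spatial attention map that reweights features taken from the translation
# stream. Channel sizes and the sigmoid-gate design are assumptions.
import torch
import torch.nn as nn

class AttentionalBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, 1, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, seg_feat, trans_feat):
        attn = self.gate(torch.cat([seg_feat, trans_feat], dim=1))   # (B, 1, H, W)
        return seg_feat + attn * trans_feat    # pass gated translation features on

seg = torch.randn(2, 64, 32, 32)
trans = torch.randn(2, 64, 32, 32)
fused = AttentionalBlock(64)(seg, trans)       # -> torch.Size([2, 64, 32, 32])
```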

Efficient Surface-Aware Semi-Global Matching with Multi-View Plane-Sweep Sampling

Title Efficient Surface-Aware Semi-Global Matching with Multi-View Plane-Sweep Sampling
Authors Boitumelo Ruf, Thomas Pollok, Martin Weinmann
Abstract Online augmentation of an oblique aerial image sequence with structural information is an essential aspect of 3D scene interpretation and analysis. One key aspect in this is efficient dense image matching and depth estimation. Here, the Semi-Global Matching (SGM) approach has proven to be one of the most widely used algorithms for efficient depth estimation, providing a good trade-off between accuracy and computational complexity. However, SGM only models a first-order smoothness assumption, thus favoring fronto-parallel surfaces. In this work, we present a hierarchical algorithm that allows for efficient depth and normal map estimation together with confidence measures for each estimate. Our algorithm relies on plane-sweep multi-image matching followed by an extended SGM optimization that incorporates local surface orientations, thus achieving more consistent and accurate estimates in areas made up of slanted surfaces, which are inherent to oblique aerial imagery. We evaluate numerous configurations of our algorithm on two different datasets using absolute and relative accuracy measures. In our evaluation, we show that the results of our approach are comparable to those achieved by refined Structure-from-Motion (SfM) pipelines, such as COLMAP, which are designed for offline processing. In contrast, however, our approach only considers a confined image bundle of the input sequence, allowing for online and incremental computation at 1-2 Hz.
Tasks Depth Estimation
Published 2019-09-21
URL https://arxiv.org/abs/1909.09891v1
PDF https://arxiv.org/pdf/1909.09891v1.pdf
PWC https://paperswithcode.com/paper/190909891
Repo
Framework
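
The core of SGM is a per-path dynamic program over disparities. The NumPy sketch below aggregates matching costs along a single left-to-right scanline with the standard penalties P1 (disparity change of one) and P2 (larger jumps); the paper's plane-sweep sampling, surface-aware extension, and normal estimation are not included, and the penalty values are placeholders.

```python
# Semi-Global Matching cost aggregation along a single left-to-right path:
# L(p,d) = C(p,d) + min(L(p-1,d), L(p-1,d±1)+P1, min_d' L(p-1,d')+P2)
#          - min_d' L(p-1,d').
import numpy as np

def aggregate_path(cost, p1=10.0, p2=120.0):
    """cost: (width, n_disparities) matching cost along one scanline."""
    w, d = cost.shape
    agg = np.empty_like(cost)
    agg[0] = cost[0]
    for x in range(1, w):
        prev = agg[x - 1]
        prev_min = prev.min()
        same = prev
        shift_m = np.r_[prev[1:], np.inf] + p1      # disparity decreases by 1
        shift_p = np.r_[np.inf, prev[:-1]] + p1     # disparity increases by 1
        jump = np.full(d, prev_min + p2)            # any larger disparity jump
        agg[x] = cost[x] + np.minimum.reduce([same, shift_m, shift_p, jump]) - prev_min
    return agg

rng = np.random.default_rng(0)
costs = rng.random((128, 64))                       # toy per-pixel matching costs
disparity = aggregate_path(costs).argmin(axis=1)    # winner-take-all along the path
```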

Generative approach to unsupervised deep local learning

Title Generative approach to unsupervised deep local learning
Authors Changlu Chen, Chaoxi Niu, Xia Zhan, Kun Zhan
Abstract Most existing feature learning methods optimize inflexible handcrafted features, and the affinity matrix is constructed by shallow linear embedding methods. Different from these conventional methods, we pretrain a generative neural network by stacking convolutional autoencoders to learn the latent data representation and then construct an affinity graph from these representations as a prior. Based on the pretrained model and the constructed graph, we add a self-expressive layer to complete the generative model and then fine-tune it with a new loss function that includes the reconstruction loss and a deliberately defined locality-preserving loss. The locality-preserving loss, defined by the constructed affinity graph, serves as a prior to preserve the local structure during the fine-tuning stage, which in turn improves the quality of the feature representation. Furthermore, the self-expressive layer between the encoder and decoder is based on the assumption that each latent feature is a linear combination of the other latent features, so the weighted combination coefficients of the self-expressive layer are used to construct a new, refined affinity graph that represents the data structure. We conduct experiments on four datasets to demonstrate that the representation ability of our proposed model is superior to that of state-of-the-art methods.
Tasks
Published 2019-06-19
URL https://arxiv.org/abs/1906.07947v2
PDF https://arxiv.org/pdf/1906.07947v2.pdf
PWC https://paperswithcode.com/paper/a-generative-approach-to-unsupervised-deep
Repo
Framework
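
The self-expressive layer mentioned in the abstract is a trainable coefficient matrix C with zero diagonal such that the latent codes satisfy Z ≈ CZ. The PyTorch sketch below trains such a layer with a reconstruction-plus-sparsity loss and reads off an affinity graph from |C|; the convolutional autoencoder, the pretrained graph, and the locality-preserving loss are left out, and the loss weights are assumptions.

```python
# Self-expressive layer: learn C with zero diagonal so that Z ~= C @ Z,
# i.e. each latent code is a linear combination of the other codes.
import torch
import torch.nn as nn

class SelfExpressive(nn.Module):
    def __init__(self, n_samples):
        super().__init__()
        self.coef = nn.Parameter(1e-4 * torch.randn(n_samples, n_samples))

    def forward(self, z):                       # z: (n_samples, latent_dim)
        c = self.coef - torch.diag(torch.diag(self.coef))   # forbid trivial self-reconstruction
        return c @ z, c

z = torch.randn(100, 32)                        # stand-in for encoder outputs
layer = SelfExpressive(100)
opt = torch.optim.Adam(layer.parameters(), lr=1e-3)
for _ in range(200):
    z_hat, c = layer(z)
    loss = ((z_hat - z) ** 2).mean() + 1e-3 * c.abs().sum()   # self-expression + sparsity
    opt.zero_grad()
    loss.backward()
    opt.step()

# |C| + |C|^T can then serve as a refined affinity graph over the samples.
affinity = (c.abs() + c.abs().T).detach()
```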

RES-SE-NET: Boosting Performance of Resnets by Enhancing Bridge-connections

Title RES-SE-NET: Boosting Performance of Resnets by Enhancing Bridge-connections
Authors Varshaneya V, Balasubramanian S, Darshan Gera
Abstract One of the ways to train deep neural networks effectively is to use residual connections. Residual connections can be classified as being either identity connections or bridge-connections with a reshaping convolution. Empirical observations on CIFAR-10 and CIFAR-100 datasets using a baseline Resnet model, with bridge-connections removed, have shown a significant reduction in accuracy. This reduction is due to lack of contribution, in the form of feature maps, by the bridge-connections. Hence bridge-connections are vital for Resnet. However, all feature maps in the bridge-connections are considered to be equally important. In this work, an upgraded architecture “Res-SE-Net” is proposed to further strengthen the contribution from the bridge-connections by quantifying the importance of each feature map and weighting them accordingly using Squeeze-and-Excitation (SE) block. It is demonstrated that Res-SE-Net generalizes much better than Resnet and SE-Resnet on the benchmark CIFAR-10 and CIFAR-100 datasets.
Tasks
Published 2019-02-16
URL http://arxiv.org/abs/1902.06066v1
PDF http://arxiv.org/pdf/1902.06066v1.pdf
PWC https://paperswithcode.com/paper/res-se-net-boosting-performance-of-resnets-by
Repo
Framework
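
The architectural idea, applying a squeeze-and-excitation block to the reshaping bridge-connection of a residual block, can be sketched in PyTorch as below. The reduction ratio, layer ordering, and placement of batch normalization are assumptions rather than the paper's exact Res-SE-Net block.

```python
# Residual block whose bridge-connection (1x1 reshaping shortcut) is
# reweighted by a squeeze-and-excitation block. Reduction ratio and the
# exact layer arrangement are assumptions.
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):
        w = self.fc(x.mean(dim=(2, 3)))          # squeeze: global average pool
        return x * w[:, :, None, None]           # excite: per-channel reweighting

class SEBridgeResBlock(nn.Module):
    def __init__(self, in_ch, out_ch, stride=2):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, stride, 1), nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, 1, 1), nn.BatchNorm2d(out_ch))
        self.bridge = nn.Sequential(              # reshaping bridge-connection
            nn.Conv2d(in_ch, out_ch, 1, stride), nn.BatchNorm2d(out_ch))
        self.se = SEBlock(out_ch)                 # weight the bridge feature maps
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(self.body(x) + self.se(self.bridge(x)))

out = SEBridgeResBlock(64, 128)(torch.randn(2, 64, 32, 32))   # -> (2, 128, 16, 16)
```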

Neural Network-Inspired Analog-to-Digital Conversion to Achieve Super-Resolution with Low-Precision RRAM Devices

Title Neural Network-Inspired Analog-to-Digital Conversion to Achieve Super-Resolution with Low-Precision RRAM Devices
Authors Weidong Cao, Liu Ke, Ayan Chakrabarti, Xuan Zhang
Abstract Recent works propose neural network- (NN-) inspired analog-to-digital converters (NNADCs) and demonstrate their great potential in many emerging applications. These NNADCs often rely on resistive random-access memory (RRAM) devices to realize the NN operations and require high-precision RRAM cells (6~12-bit) to achieve a moderate quantization resolution (4~8-bit). Such an optimistic assumption about RRAM resolution, however, is not supported by fabrication data of RRAM arrays in large-scale production processes. In this paper, we propose an NN-inspired super-resolution ADC based on low-precision RRAM devices by taking advantage of a co-design methodology that combines a pipelined hardware architecture with a custom NN training framework. Results obtained from SPICE simulations demonstrate that our method leads to a robust design of a 14-bit super-resolution ADC using 3-bit RRAM devices with improved power and speed performance and competitive figures of merit (FoMs). In addition to linear uniform quantization, the proposed ADC can also support configurable high-resolution nonlinear quantization with high conversion speed and low conversion energy, enabling future intelligent analog-to-information interfaces for near-sensor analytics and processing.
Tasks Quantization, Super-Resolution
Published 2019-11-28
URL https://arxiv.org/abs/1911.12815v1
PDF https://arxiv.org/pdf/1911.12815v1.pdf
PWC https://paperswithcode.com/paper/neural-network-inspired-analog-to-digital
Repo
Framework
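
The pipelined sub-ranging principle behind building a high-resolution converter out of low-resolution stages is easy to simulate: each stage quantizes a few bits and hands an amplified residue to the next stage. The plain-Python sketch below uses ideal 3-bit stages and illustrates only this principle; the RRAM-based NN stages, their co-designed training, and the exact 14-bit configuration from the paper are not modeled.

```python
# Generic pipelined sub-ranging conversion: a chain of coarse (here 3-bit)
# stages, each quantizing its input and passing an amplified residue to the
# next stage. Only the pipelined-ADC principle is illustrated.

def pipelined_adc(x, stage_bits=3, n_stages=5, vref=1.0):
    """Convert x in [0, vref) to a (stage_bits * n_stages)-bit code."""
    code = 0
    residue = x
    for _ in range(n_stages):
        levels = 2 ** stage_bits
        d = min(int(residue / vref * levels), levels - 1)   # coarse quantization
        code = (code << stage_bits) | d
        residue = (residue - d * vref / levels) * levels    # amplified residue
    return code

total_bits = 15                       # 5 ideal stages x 3 bits (illustrative)
x = 0.3217
code = pipelined_adc(x)
print(code, code / 2 ** total_bits)   # reconstructed value close to 0.3217
```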

Using Physics-Informed Super-Resolution Generative Adversarial Networks for Subgrid Modeling in Turbulent Reactive Flows

Title Using Physics-Informed Super-Resolution Generative Adversarial Networks for Subgrid Modeling in Turbulent Reactive Flows
Authors Mathis Bode, Michael Gauding, Zeyu Lian, Dominik Denker, Marco Davidovic, Konstantin Kleinheinz, Jenia Jitsev, Heinz Pitsch
Abstract Turbulence is still one of the main challenges for accurately predicting reactive flows. Therefore, the development of new turbulence closures that can be applied to combustion problems is essential. Data-driven modeling has become very popular in many fields in recent years as large, often extensively labeled, datasets have become available and training of large neural networks on GPUs has sped up the learning process tremendously. However, the successful application of deep neural networks in fluid dynamics, for example for subgrid modeling in the context of large-eddy simulations (LESs), is still challenging. Reasons for this are the large number of degrees of freedom in realistic flows, the high requirements with respect to accuracy and error robustness, and open questions, such as the generalization capability of trained neural networks in such high-dimensional, physics-constrained scenarios. This work presents a novel subgrid modeling approach based on a generative adversarial network (GAN), which is trained with unsupervised deep learning (DL) using adversarial and physics-informed losses. A two-step training method is used to improve the generalization capability, especially extrapolation, of the network. The novel approach gives good results in a priori as well as a posteriori tests with decaying turbulence including turbulent mixing. The applicability of the network in complex combustion scenarios is furthermore demonstrated by applying it to a reactive LES of the Spray A case defined by the Engine Combustion Network (ECN).
Tasks Super-Resolution
Published 2019-11-26
URL https://arxiv.org/abs/1911.11380v1
PDF https://arxiv.org/pdf/1911.11380v1.pdf
PWC https://paperswithcode.com/paper/using-physics-informed-super-resolution
Repo
Framework
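
A hedged sketch of what an "adversarial plus physics-informed" generator loss can look like for super-resolving a 2-D velocity field is given below: a standard GAN term plus a finite-difference divergence penalty standing in for a continuity-style constraint. The toy networks, the choice of physics penalty, and the loss weight are assumptions; the paper's actual losses and its two-step training procedure are not reproduced, and only the generator update is shown.

```python
# Sketch of a physics-informed GAN loss for super-resolving a 2-D velocity
# field: adversarial term plus a finite-difference divergence penalty as a
# stand-in physics constraint. Networks and weights are placeholders.
import torch
import torch.nn as nn

G = nn.Sequential(nn.Upsample(scale_factor=4, mode="bilinear", align_corners=False),
                  nn.Conv2d(2, 2, 3, padding=1))               # toy generator
D = nn.Sequential(nn.Conv2d(2, 8, 3, stride=2, padding=1), nn.LeakyReLU(0.2),
                  nn.Flatten(), nn.LazyLinear(1))              # toy discriminator

def divergence(v):                       # v: (B, 2, H, W) velocity field
    du_dx = v[:, 0, :, 1:] - v[:, 0, :, :-1]
    dv_dy = v[:, 1, 1:, :] - v[:, 1, :-1, :]
    return (du_dx[:, 1:, :] + dv_dy[:, :, 1:]).pow(2).mean()

lowres = torch.randn(4, 2, 16, 16)
fake = G(lowres)
adv_loss = nn.functional.binary_cross_entropy_with_logits(
    D(fake), torch.ones(fake.size(0), 1))                      # fool the discriminator
phys_loss = divergence(fake)                                   # continuity-style penalty
gen_loss = adv_loss + 0.1 * phys_loss                          # only the generator loss is sketched
gen_loss.backward()
```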

Functional Regularisation for Continual Learning with Gaussian Processes

Title Functional Regularisation for Continual Learning with Gaussian Processes
Authors Michalis K. Titsias, Jonathan Schwarz, Alexander G. de G. Matthews, Razvan Pascanu, Yee Whye Teh
Abstract We introduce a framework for Continual Learning (CL) based on Bayesian inference over the function space rather than the parameters of a deep neural network. This method, referred to as functional regularisation for Continual Learning, avoids forgetting a previous task by constructing and memorising an approximate posterior belief over the underlying task-specific function. To achieve this we rely on a Gaussian process obtained by treating the weights of the last layer of a neural network as random and Gaussian distributed. Then, the training algorithm sequentially encounters tasks and constructs posterior beliefs over the task-specific functions by using inducing point sparse Gaussian process methods. At each step a new task is first learnt and then a summary is constructed consisting of (i) inducing inputs – a fixed-size subset of the task inputs selected such that it optimally represents the task – and (ii) a posterior distribution over the function values at these inputs. This summary then regularises learning of future tasks, through Kullback-Leibler regularisation terms. Our method thus unites approaches focused on (pseudo-)rehearsal with those derived from a sequential Bayesian inference perspective in a principled way, leading to strong results on accepted benchmarks.
Tasks Bayesian Inference, Continual Learning, Gaussian Processes, Omniglot
Published 2019-01-31
URL https://arxiv.org/abs/1901.11356v4
PDF https://arxiv.org/pdf/1901.11356v4.pdf
PWC https://paperswithcode.com/paper/functional-regularisation-for-continual
Repo
Framework
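
The functional-regularisation idea can be miniaturized as follows: after a task, store a Gaussian summary of the function values at a few inducing inputs; while training the next task, add a KL term pulling the current distribution over those function values toward the stored summary. The sketch below builds that distribution from an assumed Bayesian last layer with a diagonal weight posterior; the sparse-GP machinery, inducing-point selection, and multi-task loop of the paper are not reproduced.

```python
# Sketch of functional regularisation at inducing points: keep a Gaussian
# summary of a past task's function values at a few inducing inputs and add
# a KL term pulling the current predictive distribution (from an assumed
# Bayesian last layer) toward that summary.
import torch
from torch.distributions import MultivariateNormal, kl_divergence

torch.manual_seed(0)
feature_net = torch.nn.Sequential(torch.nn.Linear(4, 16), torch.nn.Tanh())
w_mean = torch.zeros(16, requires_grad=True)          # Bayesian last-layer mean
w_log_std = torch.zeros(16, requires_grad=True)       # diagonal posterior over weights

def function_posterior(x):
    """Gaussian over f(x) induced by the last-layer weight posterior."""
    phi = feature_net(x)                               # (n, 16)
    cov = phi @ torch.diag(w_log_std.exp() ** 2) @ phi.T
    return MultivariateNormal(phi @ w_mean, cov + 1e-4 * torch.eye(len(x)))

inducing_x = torch.randn(5, 4)                         # summary inputs from task 1
with torch.no_grad():
    task1_summary = function_posterior(inducing_x)     # frozen posterior belief

# While training task 2, regularise toward the stored functional summary.
task2_x, task2_y = torch.randn(32, 4), torch.randn(32)
pred = feature_net(task2_x) @ w_mean
loss = ((pred - task2_y) ** 2).mean() \
       + kl_divergence(function_posterior(inducing_x), task1_summary)
loss.backward()
```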