May 7, 2019

3164 words 15 mins read

Paper Group AWR 88

Semi-Supervised Learning via Sparse Label Propagation. Online Real-time Multiple Spatiotemporal Action Localisation and Prediction. Uncovering Causality from Multivariate Hawkes Integrated Cumulants. Precomputed Real-Time Texture Synthesis with Markovian Generative Adversarial Networks. Bi-modal First Impressions Recognition using Temporally Ordered Deep Audio and Stochastic Visual Features. …

Semi-Supervised Learning via Sparse Label Propagation

Title Semi-Supervised Learning via Sparse Label Propagation
Authors Alexander Jung, Alfred O. Hero III, Alexandru Mara, Saeed Jahromi
Abstract This work proposes a novel method for semi-supervised learning from partially labeled massive network-structured datasets, i.e., big data over networks. We model the underlying hypothesis, which relates data points to labels, as a graph signal, defined over some graph (network) structure intrinsic to the dataset. Following the key principle of supervised learning, i.e., similar inputs yield similar outputs, we require the graph signals induced by labels to have small total variation. Accordingly, we formulate the problem of learning the labels of data points as a non-smooth convex optimization problem which amounts to balancing between the empirical loss, i.e., the discrepancy with some partially available label information, and the smoothness quantified by the total variation of the learned graph signal. We solve this optimization problem by appealing to a recently proposed preconditioned variant of the popular primal-dual method by Pock and Chambolle, which results in a sparse label propagation algorithm. This learning algorithm allows for a highly scalable implementation as message passing over the underlying data graph. By applying concepts of compressed sensing to the learning problem, we are also able to provide a transparent sufficient condition on the underlying network structure such that accurate learning of the labels is possible. We also present an implementation of the message passing formulation that scales well within big data frameworks.
Tasks
Published 2016-12-05
URL http://arxiv.org/abs/1612.01414v4
PDF http://arxiv.org/pdf/1612.01414v4.pdf
PWC https://paperswithcode.com/paper/semi-supervised-learning-via-sparse-label
Repo https://github.com/oleksii-a/sparse_label_propagation
Framework none
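
The total-variation learning problem above is concrete enough to sketch. Below is a minimal, hedged illustration in numpy of TV-regularized label recovery on a small graph, solved with a plain (non-preconditioned, for simplicity) Chambolle-Pock primal-dual loop; the squared empirical loss, step sizes, and toy graph are illustrative assumptions, not the paper's exact formulation or its message-passing implementation.

```python
# A minimal sketch of TV-regularized semi-supervised label recovery on a
# graph, via a basic Chambolle-Pock primal-dual loop (illustrative, not the
# paper's preconditioned message-passing variant).
import numpy as np

def sparse_label_propagation(edges, n_nodes, labeled, y, lam=1.0, n_iter=500):
    # Incidence matrix D: one row per edge, so D @ x gives signal differences.
    D = np.zeros((len(edges), n_nodes))
    for k, (i, j) in enumerate(edges):
        D[k, i], D[k, j] = 1.0, -1.0
    tau = sigma = 0.9 / np.linalg.norm(D, 2)  # tau*sigma*||D||^2 < 1
    x = np.zeros(n_nodes); x_bar = x.copy(); u = np.zeros(len(edges))
    for _ in range(n_iter):
        # Dual step: projection onto the l_inf ball (prox of the TV term).
        u = np.clip(u + sigma * D @ x_bar, -lam, lam)
        x_new = x - tau * D.T @ u
        # Prox of the squared empirical loss, applied only on labeled nodes.
        x_new[labeled] = (x_new[labeled] + tau * y[labeled]) / (1.0 + tau)
        x_bar = 2 * x_new - x  # over-relaxation
        x = x_new
    return x

# Toy usage: a 5-node chain with the two endpoints labeled +1 and -1.
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
y = np.array([1.0, 0, 0, 0, -1.0])
print(sparse_label_propagation(edges, 5, [0, 4], y))
```

The dual clipping step is the prox of the TV term; that projection is where the sparsity of the learned signal's differences comes from.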

Online Real-time Multiple Spatiotemporal Action Localisation and Prediction

Title Online Real-time Multiple Spatiotemporal Action Localisation and Prediction
Authors Gurkirt Singh, Suman Saha, Michael Sapienza, Philip Torr, Fabio Cuzzolin
Abstract We present a deep-learning framework for real-time multiple spatio-temporal (S/T) action localisation, classification and early prediction. Current state-of-the-art approaches work offline and are too slow to be useful in real-world settings. To overcome their limitations we introduce two major developments. Firstly, we adopt real-time SSD (Single Shot MultiBox Detector) convolutional neural networks to regress and classify detection boxes in each video frame potentially containing an action of interest. Secondly, we design an original and efficient online algorithm to incrementally construct and label 'action tubes' from the SSD frame-level detections. As a result, our system is not only capable of performing S/T detection in real time, but can also perform early action prediction in an online fashion. We achieve new state-of-the-art results in both S/T action localisation and early action prediction on the challenging UCF101-24 and J-HMDB-21 benchmarks, even when compared to the top offline competitors. To the best of our knowledge, ours is the first real-time (up to 40fps) system able to perform online S/T action localisation and early action prediction on the untrimmed videos of UCF101-24.
Tasks
Published 2016-11-25
URL http://arxiv.org/abs/1611.08563v6
PDF http://arxiv.org/pdf/1611.08563v6.pdf
PWC https://paperswithcode.com/paper/online-real-time-multiple-spatiotemporal
Repo https://github.com/gurkirt/corrected-UCF101-Annots
Framework none
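
The incremental tube construction can be illustrated with a short, hedged sketch: per-frame detections are greedily linked to existing tubes by IoU with each tube's last box. The matching rule, threshold, and score accumulation below are simplifications; the paper's actual online algorithm also handles tube termination and per-class tube scoring.

```python
# Simplified sketch of online action-tube building from per-frame detections.
def iou(a, b):
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter + 1e-9)

def update_tubes(tubes, detections, iou_thr=0.3):
    """Extend existing tubes with this frame's (box, score) detections, greedily."""
    used = set()
    for tube in tubes:
        # Pick the unmatched detection that best continues this tube.
        best, best_score = None, iou_thr
        for k, (box, score) in enumerate(detections):
            s = iou(tube["boxes"][-1], box)
            if k not in used and s > best_score:
                best, best_score = k, s
        if best is not None:
            used.add(best)
            tube["boxes"].append(detections[best][0])
            tube["score"] += detections[best][1]
    # Unmatched detections start new tubes.
    for k, (box, score) in enumerate(detections):
        if k not in used:
            tubes.append({"boxes": [box], "score": score})
    return tubes
```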

Uncovering Causality from Multivariate Hawkes Integrated Cumulants

Title Uncovering Causality from Multivariate Hawkes Integrated Cumulants
Authors Massil Achab, Emmanuel Bacry, Stéphane Gaïffas, Iacopo Mastromatteo, Jean-Francois Muzy
Abstract We design a new nonparametric method that allows one to estimate the matrix of integrated kernels of a multivariate Hawkes process. This matrix not only encodes the mutual influences of each node of the process, but also disentangles the causality relationships between them. Our approach is the first that leads to an estimation of this matrix without any parametric modeling and estimation of the kernels themselves. A consequence is that it can give an estimation of causality relationships between nodes (or users), based on their activity timestamps (on a social network for instance), without knowing or estimating the shape of the activities' lifetimes. For that purpose, we introduce a moment matching method that fits the third-order integrated cumulants of the process. We show on numerical experiments that our approach is indeed very robust to the shape of the kernels, and gives appealing results on the MemeTracker database.
Tasks
Published 2016-07-21
URL http://arxiv.org/abs/1607.06333v3
PDF http://arxiv.org/pdf/1607.06333v3.pdf
PWC https://paperswithcode.com/paper/uncovering-causality-from-multivariate-hawkes
Repo https://github.com/achab/nphc
Framework none
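
A hedged sketch of the moment-matching idea, using numpy and scipy: writing R = (I - G)^{-1} for the matrix G of integrated kernels and mu for the mean intensities, the integrated covariance satisfies (in the paper's setting) C = R diag(mu) R^T. The toy code below fits R to an empirical C by least squares; matching the covariance alone leaves R identifiable only up to rotation, which is exactly why the actual method also matches the third-order integrated cumulant.

```python
# Toy moment-matching sketch: fit R to the integrated covariance, then read
# off the causality matrix G = I - R^{-1}. Second order only; the real NPHC
# method also matches third-order cumulants to make R identifiable.
import numpy as np
from scipy.optimize import minimize

def fit_R(C_emp, mu, d):
    L = np.diag(mu)
    def loss(r_flat):
        R = r_flat.reshape(d, d)
        return np.sum((R @ L @ R.T - C_emp) ** 2)
    res = minimize(loss, np.eye(d).ravel(), method="L-BFGS-B")
    return res.x.reshape(d, d)

d = 2
mu = np.array([0.5, 0.8])                       # mean intensities (toy values)
R_true = np.array([[1.0, 0.3], [0.0, 1.0]])
C_emp = R_true @ np.diag(mu) @ R_true.T         # synthetic "empirical" covariance
R_hat = fit_R(C_emp, mu, d)
G_hat = np.eye(d) - np.linalg.inv(R_hat)        # matrix of integrated kernels
print(G_hat)
```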

Precomputed Real-Time Texture Synthesis with Markovian Generative Adversarial Networks

Title Precomputed Real-Time Texture Synthesis with Markovian Generative Adversarial Networks
Authors Chuan Li, Michael Wand
Abstract This paper proposes Markovian Generative Adversarial Networks (MGANs), a method for training generative neural networks for efficient texture synthesis. While deep neural network approaches have recently demonstrated remarkable results in terms of synthesis quality, they still come at considerable computational costs (minutes of run-time for low-res images). Our paper addresses this efficiency issue. Instead of the numerical deconvolution used in previous work, we precompute a feed-forward, strided convolutional network that captures the feature statistics of Markovian patches and is able to directly generate outputs of arbitrary dimensions. Such a network can directly decode brown noise to realistic texture, or photos to artistic paintings. With adversarial training, we obtain quality comparable to recent neural texture synthesis methods. As no optimization is required any longer at generation time, our run-time performance (0.25M pixel images at 25Hz) surpasses previous neural texture synthesizers by a significant margin (at least 500 times faster). We apply this idea to texture synthesis, style transfer, and video stylization.
Tasks Style Transfer, Texture Synthesis
Published 2016-04-15
URL http://arxiv.org/abs/1604.04382v1
PDF http://arxiv.org/pdf/1604.04382v1.pdf
PWC https://paperswithcode.com/paper/precomputed-real-time-texture-synthesis-with
Repo https://github.com/chuanli11/MGANs
Framework torch
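
The "Markovian" aspect, adversarial training on local patches rather than whole images, can be sketched with a fully convolutional discriminator whose output is a grid of per-patch real/fake logits. This PyTorch sketch assumes illustrative channel sizes and omits the VGG feature extractor the paper applies the discriminator on top of.

```python
# Patch-level adversarial sketch: a fully convolutional discriminator emits
# one logit per local patch instead of a single image-level score.
import torch
import torch.nn as nn

class PatchDiscriminator(nn.Module):
    def __init__(self, in_ch=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(128, 1, 4, stride=1, padding=1),  # one logit per patch
        )
    def forward(self, x):
        return self.net(x)

D = PatchDiscriminator()
fake = torch.randn(1, 3, 64, 64)
patch_logits = D(fake)                      # (1, 1, H', W') grid of patch scores
loss = nn.functional.binary_cross_entropy_with_logits(
    patch_logits, torch.zeros_like(patch_logits))  # D step: these patches are fake
```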

Bi-modal First Impressions Recognition using Temporally Ordered Deep Audio and Stochastic Visual Features

Title Bi-modal First Impressions Recognition using Temporally Ordered Deep Audio and Stochastic Visual Features
Authors Arulkumar Subramaniam, Vismay Patel, Ashish Mishra, Prashanth Balasubramanian, Anurag Mittal
Abstract We propose a novel approach for First Impressions Recognition in terms of the Big Five personality traits from short videos. The Big Five is a model that describes human personality using five broad categories: Extraversion, Agreeableness, Conscientiousness, Neuroticism and Openness. We train two bi-modal end-to-end deep neural network architectures using temporally ordered audio and novel stochastic visual features from few frames, without over-fitting. We empirically show that the trained models perform exceptionally well, even after training on small sub-portions of the inputs. Our method was evaluated in the ChaLearn LAP 2016 Apparent Personality Analysis (APA) competition, using the ChaLearn LAP APA2016 dataset, and achieved excellent performance.
Tasks
Published 2016-10-31
URL http://arxiv.org/abs/1610.10048v1
PDF http://arxiv.org/pdf/1610.10048v1.pdf
PWC https://paperswithcode.com/paper/bi-modal-first-impressions-recognition-using
Repo https://github.com/InnovArul/first-impressions
Framework torch
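
A rough PyTorch sketch of the bi-modal pipeline under stated assumptions: a few frames are sampled stochastically, each branch produces an embedding, and a fused head regresses the five trait scores. Both branches are placeholder linear layers standing in for the paper's audio and visual networks.

```python
# Illustrative bi-modal fusion with stochastic frame selection; branch
# architectures are stand-ins, not the paper's networks.
import torch
import torch.nn as nn

class BiModalTraits(nn.Module):
    def __init__(self, vis_dim=512, aud_dim=64):
        super().__init__()
        self.vis = nn.Linear(vis_dim, 128)   # stand-in for the visual CNN branch
        self.aud = nn.Linear(aud_dim, 128)   # stand-in for the audio branch
        self.head = nn.Linear(256, 5)        # Big Five trait scores in [0, 1]
    def forward(self, frame_feats, audio_feats):
        # Stochastic selection of a small subset of frames, then average.
        idx = torch.randperm(frame_feats.size(1))[:6]
        v = self.vis(frame_feats[:, idx].mean(dim=1))
        a = self.aud(audio_feats)
        return torch.sigmoid(self.head(torch.cat([v, a], dim=-1)))

model = BiModalTraits()
scores = model(torch.randn(2, 50, 512), torch.randn(2, 64))  # (batch, 5)
```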

PoseTrack: Joint Multi-Person Pose Estimation and Tracking

Title PoseTrack: Joint Multi-Person Pose Estimation and Tracking
Authors Umar Iqbal, Anton Milan, Juergen Gall
Abstract In this work, we introduce the challenging problem of joint multi-person pose estimation and tracking of an unknown number of persons in unconstrained videos. Existing methods for multi-person pose estimation in images cannot be applied directly to this problem, since it additionally requires solving the problem of person association over time, on top of the pose estimation for each person. We therefore propose a novel method that jointly models multi-person pose estimation and tracking in a single formulation. To this end, we represent body joint detections in a video by a spatio-temporal graph and solve an integer linear program to partition the graph into sub-graphs that correspond to plausible body pose trajectories for each person. The proposed approach implicitly handles occlusion and truncation of persons. Since the problem has not been addressed quantitatively in the literature, we introduce a challenging “Multi-Person PoseTrack” dataset, and also propose a completely unconstrained evaluation protocol that does not make any assumptions about the scale, size, location or the number of persons. Finally, we evaluate the proposed approach and several baseline methods on our new dataset.
Tasks Multi-Person Pose Estimation, Multi-Person Pose Estimation and Tracking, Pose Estimation
Published 2016-11-23
URL http://arxiv.org/abs/1611.07727v3
PDF http://arxiv.org/pdf/1611.07727v3.pdf
PWC https://paperswithcode.com/paper/posetrack-joint-multi-person-pose-estimation
Repo https://github.com/iqbalu/PoseTrack-CVPR2017
Framework none
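
The core difficulty the abstract describes is data association over time. The paper solves it exactly with an integer linear program over a spatio-temporal joint graph; as a drastically simplified, hedged stand-in, the sketch below greedily matches whole per-frame poses to tracks by mean joint distance. It illustrates the association problem, not the ILP.

```python
# Greedy per-frame pose-to-track association (a simplification of the paper's
# ILP-based spatio-temporal graph partitioning).
import numpy as np

def associate(tracks, poses, max_dist=40.0):
    """tracks: list of lists of (J, 2) joint arrays; poses: this frame's (J, 2) arrays."""
    used = set()
    for track in tracks:
        dists = [np.linalg.norm(track[-1] - p, axis=1).mean() for p in poses]
        best = int(np.argmin(dists)) if dists else -1
        if best >= 0 and best not in used and dists[best] < max_dist:
            track.append(poses[best]); used.add(best)
    # Unmatched poses start new tracks (new persons entering the scene).
    tracks += [[p] for k, p in enumerate(poses) if k not in used]
    return tracks
```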

Semantic Scene Completion from a Single Depth Image

Title Semantic Scene Completion from a Single Depth Image
Authors Shuran Song, Fisher Yu, Andy Zeng, Angel X. Chang, Manolis Savva, Thomas Funkhouser
Abstract This paper focuses on semantic scene completion, a task for producing a complete 3D voxel representation of volumetric occupancy and semantic labels for a scene from a single-view depth map observation. Previous work has considered scene completion and semantic labeling of depth maps separately. However, we observe that these two problems are tightly intertwined. To leverage the coupled nature of these two tasks, we introduce the semantic scene completion network (SSCNet), an end-to-end 3D convolutional network that takes a single depth image as input and simultaneously outputs occupancy and semantic labels for all voxels in the camera view frustum. Our network uses a dilation-based 3D context module to efficiently expand the receptive field and enable 3D context learning. To train our network, we construct SUNCG - a manually created large-scale dataset of synthetic 3D scenes with dense volumetric annotations. Our experiments demonstrate that the joint model outperforms methods addressing each task in isolation and outperforms alternative approaches on the semantic scene completion task.
Tasks
Published 2016-11-28
URL http://arxiv.org/abs/1611.08974v1
PDF http://arxiv.org/pdf/1611.08974v1.pdf
PWC https://paperswithcode.com/paper/semantic-scene-completion-from-a-single-depth
Repo https://github.com/facebookresearch/House3D
Framework none
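
The dilation-based 3D context module is easy to illustrate: stacking 3D convolutions with growing dilation enlarges the receptive field over the voxel grid without pooling away resolution. Channel counts and the number of layers below are illustrative, not SSCNet's actual configuration.

```python
# Dilated 3D context stack: padding equal to the dilation keeps the voxel
# grid size fixed while the receptive field grows.
import torch
import torch.nn as nn

context = nn.Sequential(
    nn.Conv3d(32, 32, kernel_size=3, padding=1, dilation=1), nn.ReLU(),
    nn.Conv3d(32, 32, kernel_size=3, padding=2, dilation=2), nn.ReLU(),
    nn.Conv3d(32, 32, kernel_size=3, padding=4, dilation=4), nn.ReLU(),
)
voxels = torch.randn(1, 32, 60, 36, 60)  # (batch, channels, x, y, z) feature grid
out = context(voxels)                     # same spatial size, larger receptive field
```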

Neural Machine Translation in Linear Time

Title Neural Machine Translation in Linear Time
Authors Nal Kalchbrenner, Lasse Espeholt, Karen Simonyan, Aaron van den Oord, Alex Graves, Koray Kavukcuoglu
Abstract We present a novel neural network for processing sequences. The ByteNet is a one-dimensional convolutional neural network that is composed of two parts, one to encode the source sequence and the other to decode the target sequence. The two network parts are connected by stacking the decoder on top of the encoder and preserving the temporal resolution of the sequences. To address the differing lengths of the source and the target, we introduce an efficient mechanism by which the decoder is dynamically unfolded over the representation of the encoder. The ByteNet uses dilation in the convolutional layers to increase its receptive field. The resulting network has two core properties: it runs in time that is linear in the length of the sequences and it sidesteps the need for excessive memorization. The ByteNet decoder attains state-of-the-art performance on character-level language modelling and outperforms the previous best results obtained with recurrent networks. The ByteNet also achieves state-of-the-art performance on character-to-character machine translation on the English-to-German WMT translation task, surpassing comparable neural translation models that are based on recurrent networks with attentional pooling and run in quadratic time. We find that the latent alignment structure contained in the representations reflects the expected alignment between the tokens.
Tasks Language Modelling, Machine Translation
Published 2016-10-31
URL http://arxiv.org/abs/1610.10099v2
PDF http://arxiv.org/pdf/1610.10099v2.pdf
PWC https://paperswithcode.com/paper/neural-machine-translation-in-linear-time
Repo https://github.com/paarthneekhara/byteNet-tensorflow
Framework tf
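
ByteNet's key ingredient can be sketched compactly: 1-D convolutions with exponentially growing dilation, applied causally (left-padded) in the decoder so position t never sees future tokens. The residual block below is a simplification of the paper's; sizes are illustrative.

```python
# Causal dilated 1-D convolutions with residual connections, the decoder-side
# building block (simplified).
import torch
import torch.nn as nn

class CausalDilatedConv(nn.Module):
    def __init__(self, ch, dilation):
        super().__init__()
        self.pad = 2 * dilation             # left padding for kernel size 3
        self.conv = nn.Conv1d(ch, ch, kernel_size=3, dilation=dilation)
    def forward(self, x):                   # x: (batch, channels, time)
        return self.conv(nn.functional.pad(x, (self.pad, 0)))

layers = nn.ModuleList(CausalDilatedConv(128, d) for d in [1, 2, 4, 8, 16])
x = torch.randn(1, 128, 100)
for layer in layers:
    x = x + torch.relu(layer(x))            # residual connection, length preserved
```

Because every layer runs over the whole sequence in parallel and the dilation schedule covers the context in a fixed number of layers, decoding time grows linearly in sequence length, which is the "linear time" claim in the title.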

Estimating individual treatment effect: generalization bounds and algorithms

Title Estimating individual treatment effect: generalization bounds and algorithms
Authors Uri Shalit, Fredrik D. Johansson, David Sontag
Abstract There is intense interest in applying machine learning to problems of causal inference in fields such as healthcare, economics and education. In particular, individual-level causal inference has important applications such as precision medicine. We give a new theoretical analysis and family of algorithms for predicting individual treatment effect (ITE) from observational data, under the assumption known as strong ignorability. The algorithms learn a “balanced” representation such that the induced treated and control distributions look similar. We give a novel, simple and intuitive generalization-error bound showing that the expected ITE estimation error of a representation is bounded by a sum of the standard generalization-error of that representation and the distance between the treated and control distributions induced by the representation. We use Integral Probability Metrics to measure distances between distributions, deriving explicit bounds for the Wasserstein and Maximum Mean Discrepancy (MMD) distances. Experiments on real and simulated data show the new algorithms match or outperform the state-of-the-art.
Tasks Causal Inference
Published 2016-06-13
URL http://arxiv.org/abs/1606.03976v5
PDF http://arxiv.org/pdf/1606.03976v5.pdf
PWC https://paperswithcode.com/paper/estimating-individual-treatment-effect
Repo https://github.com/clinicalml/cfrnet
Framework tf
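
A hedged PyTorch sketch of the balancing objective: a shared representation phi, a factual outcome loss, and a penalty on the distance between treated and control groups in representation space. For brevity the penalty here is a linear-kernel MMD (difference of group means); the paper derives bounds for RBF-kernel MMD and the Wasserstein distance, and its networks are larger.

```python
# Balanced-representation objective: factual loss + distributional distance
# between treated and control representations (linear-kernel MMD here).
import torch
import torch.nn as nn

phi = nn.Sequential(nn.Linear(25, 64), nn.ReLU(), nn.Linear(64, 64))  # representation
head = nn.Linear(64 + 1, 1)                  # outcome head on (phi(x), t)

def loss_fn(x, t, y, alpha=1.0):
    r = phi(x)
    y_hat = head(torch.cat([r, t], dim=1))
    factual = ((y_hat - y) ** 2).mean()      # standard regression loss
    # Linear-kernel MMD: squared distance between group means of phi(x).
    mmd = ((r[t[:, 0] == 1].mean(0) - r[t[:, 0] == 0].mean(0)) ** 2).sum()
    return factual + alpha * mmd

x, t = torch.randn(32, 25), torch.randint(0, 2, (32, 1)).float()
y = torch.randn(32, 1)
loss = loss_fn(x, t, y)  # ITE estimate later: head([phi(x), 1]) - head([phi(x), 0])
```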

Practical Black-Box Attacks against Machine Learning

Title Practical Black-Box Attacks against Machine Learning
Authors Nicolas Papernot, Patrick McDaniel, Ian Goodfellow, Somesh Jha, Z. Berkay Celik, Ananthram Swami
Abstract Machine learning (ML) models, e.g., deep neural networks (DNNs), are vulnerable to adversarial examples: malicious inputs modified to yield erroneous model outputs, while appearing unmodified to human observers. Potential attacks include having malicious content like malware identified as legitimate or controlling vehicle behavior. Yet, all existing adversarial example attacks require knowledge of either the model internals or its training data. We introduce the first practical demonstration of an attacker controlling a remotely hosted DNN with no such knowledge. Indeed, the only capability of our black-box adversary is to observe labels given by the DNN to chosen inputs. Our attack strategy consists in training a local model to substitute for the target DNN, using inputs synthetically generated by an adversary and labeled by the target DNN. We use the local substitute to craft adversarial examples, and find that they are misclassified by the targeted DNN. To perform a real-world and properly-blinded evaluation, we attack a DNN hosted by MetaMind, an online deep learning API. We find that their DNN misclassifies 84.24% of the adversarial examples crafted with our substitute. We demonstrate the general applicability of our strategy to many ML techniques by conducting the same attack against models hosted by Amazon and Google, using logistic regression substitutes. They yield adversarial examples misclassified by Amazon and Google at rates of 96.19% and 88.94%. We also find that this black-box attack strategy is capable of evading defense strategies previously found to make adversarial example crafting harder.
Tasks
Published 2016-02-08
URL http://arxiv.org/abs/1602.02697v4
PDF http://arxiv.org/pdf/1602.02697v4.pdf
PWC https://paperswithcode.com/paper/practical-black-box-attacks-against-machine
Repo https://github.com/adrian-botta/understanding_adversarial_examples
Framework none
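
The attack loop compresses to a few lines: query the target for labels, fit a local substitute on them, then craft adversarial examples on the substitute and transfer them. The sketch below uses FGSM for crafting and omits the paper's Jacobian-based dataset augmentation; model classes and hyperparameters are illustrative.

```python
# Substitute training + FGSM transfer attack (simplified; no Jacobian-based
# augmentation of the query set).
import torch
import torch.nn as nn

def train_substitute(substitute, oracle_labels, x, epochs=10):
    # oracle_labels: class labels observed by querying the black-box target.
    opt = torch.optim.Adam(substitute.parameters())
    for _ in range(epochs):
        opt.zero_grad()
        loss = nn.functional.cross_entropy(substitute(x), oracle_labels)
        loss.backward(); opt.step()
    return substitute

def fgsm(substitute, x, y, eps=0.1):
    x = x.clone().requires_grad_(True)
    nn.functional.cross_entropy(substitute(x), y).backward()
    # The perturbed inputs are then submitted to the black-box target.
    return (x + eps * x.grad.sign()).detach()
```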

Language Modeling with Gated Convolutional Networks

Title Language Modeling with Gated Convolutional Networks
Authors Yann N. Dauphin, Angela Fan, Michael Auli, David Grangier
Abstract The predominant approach to language modeling to date is based on recurrent neural networks. Their success on this task is often linked to their ability to capture unbounded context. In this paper we develop a finite context approach through stacked convolutions, which can be more efficient since they allow parallelization over sequential tokens. We propose a novel simplified gating mechanism that outperforms Oord et al (2016) and investigate the impact of key architectural decisions. The proposed approach achieves state-of-the-art on the WikiText-103 benchmark, even though it features long-term dependencies, as well as competitive results on the Google Billion Words benchmark. Our model reduces the latency to score a sentence by an order of magnitude compared to a recurrent baseline. To our knowledge, this is the first time a non-recurrent approach is competitive with strong recurrent models on these large scale language tasks.
Tasks Language Modelling
Published 2016-12-23
URL http://arxiv.org/abs/1612.08083v3
PDF http://arxiv.org/pdf/1612.08083v3.pdf
PWC https://paperswithcode.com/paper/language-modeling-with-gated-convolutional
Repo https://github.com/ifrit98/layer-glu
Framework none
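
The proposed gating mechanism is simple enough to state exactly: a layer computes (X\*W + b) ⊗ sigmoid(X\*V + c), i.e. a convolution gated element-wise by a second, sigmoid-activated convolution. A minimal PyTorch version, with the two convolutions fused into one and the output trimmed to stay causal:

```python
# Gated Linear Unit (GLU) convolution layer: A * sigmoid(B).
import torch
import torch.nn as nn

class GatedConv(nn.Module):
    def __init__(self, ch, k=5):
        super().__init__()
        # One conv produces both the linear half and the gate half.
        self.conv = nn.Conv1d(ch, 2 * ch, kernel_size=k, padding=k - 1)
    def forward(self, x):                    # x: (batch, channels, time)
        h = self.conv(x)[..., : x.size(-1)]  # trim right side to keep it causal
        a, b = h.chunk(2, dim=1)
        return a * torch.sigmoid(b)          # the GLU gating

y = GatedConv(64)(torch.randn(2, 64, 30))    # (2, 64, 30)
```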

Span-Based Constituency Parsing with a Structure-Label System and Provably Optimal Dynamic Oracles

Title Span-Based Constituency Parsing with a Structure-Label System and Provably Optimal Dynamic Oracles
Authors James Cross, Liang Huang
Abstract Parsing accuracy using efficient greedy transition systems has improved dramatically in recent years thanks to neural networks. Despite striking results in dependency parsing, however, neural models have not surpassed state-of-the-art approaches in constituency parsing. To remedy this, we introduce a new shift-reduce system whose stack contains merely sentence spans, represented by a bare minimum of LSTM features. We also design the first provably optimal dynamic oracle for constituency parsing, which runs in amortized O(1) time, compared to O(n^3) oracles for standard dependency parsing. Training with this oracle, we achieve the best F1 scores on both English and French of any parser that does not use reranking or external data.
Tasks Constituency Parsing, Dependency Parsing
Published 2016-12-20
URL http://arxiv.org/abs/1612.06475v1
PDF http://arxiv.org/pdf/1612.06475v1.pdf
PWC https://paperswithcode.com/paper/span-based-constituency-parsing-with-a
Repo https://github.com/jhcross/span-parser
Framework none
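
The span representation behind the "bare minimum of LSTM features" can be sketched directly: after one bidirectional pass over the sentence, any span (i, j) is featurized in O(1) by endpoint differences of the forward and backward states. Indexing conventions and sizes below are illustrative assumptions.

```python
# Span features as endpoint differences of BiLSTM states (illustrative).
import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=100, hidden_size=128, bidirectional=True, batch_first=True)
words = torch.randn(1, 12, 100)              # one 12-word sentence (toy embeddings)
states, _ = lstm(words)                      # (1, 12, 256)
fwd, bwd = states[..., :128], states[..., 128:]

def span_feature(i, j):
    # Forward difference summarizes i..j left-to-right; backward, right-to-left.
    return torch.cat([fwd[0, j] - fwd[0, i], bwd[0, i] - bwd[0, j]], dim=-1)

feat = span_feature(2, 7)                    # 256-dim representation of span (2, 7)
```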

Operational Calculus for Differentiable Programming

Title Operational Calculus for Differentiable Programming
Authors Žiga Sajovic, Martin Vuk
Abstract In this work we present a theoretical model for differentiable programming. We construct an algebraic language that encapsulates formal semantics of differentiable programs by way of Operational Calculus. The algebraic nature of Operational Calculus can alter the properties of the programs that are expressed within the language and transform them into their solutions. In our model programs are elements of programming spaces and viewed as maps from the virtual memory space to itself. Virtual memory space is an algebra of programs, an algebraic data structure one can calculate with. We define the operator of differentiation ($\partial$) on programming spaces and, using its powers, implement the general shift operator and the operator of program composition. We provide the formula for the expansion of a differentiable program into an infinite tensor series in terms of the powers of $\partial$. We express the operator of program composition in terms of the generalized shift operator and $\partial$, which implements a differentiable composition in the language. Such operators serve as abstractions over the tensor series algebra, as main actors in our language. We demonstrate our model's usefulness in differentiable programming by using it to analyse iterators, deriving fractional iterations and their iterating velocities, and explicitly solve the special case of ReduceSum.
Tasks
Published 2016-10-25
URL http://arxiv.org/abs/1610.07690v6
PDF http://arxiv.org/pdf/1610.07690v6.pdf
PWC https://paperswithcode.com/paper/operational-calculus-for-differentiable
Repo https://github.com/zigasajovic/dCpp
Framework none
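
Since the paper is purely theoretical, a formula illustrates its flavor better than code. The identity below is the classical operational-calculus shift, which the paper generalizes to programming spaces; it is a hedged illustration of the tensor-series expansion in powers of $\partial$, not a reproduction of the paper's exact statement.

```latex
% Classical shift operator as a tensor (Taylor) series in powers of \partial;
% the paper generalizes this to maps on the virtual memory space and uses the
% generalized shift to express program composition.
\[
  \bigl(e^{h\partial} P\bigr)(x)
  \;=\; \sum_{n=0}^{\infty} \frac{h^n}{n!}\,\bigl(\partial^n P\bigr)(x)
  \;=\; P(x + h)
\]
```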

Neural Photo Editing with Introspective Adversarial Networks

Title Neural Photo Editing with Introspective Adversarial Networks
Authors Andrew Brock, Theodore Lim, J. M. Ritchie, Nick Weston
Abstract The increasingly photorealistic sample quality of generative image models suggests their feasibility in applications beyond image generation. We present the Neural Photo Editor, an interface that leverages the power of generative neural networks to make large, semantically coherent changes to existing images. To tackle the challenge of achieving accurate reconstructions without loss of feature quality, we introduce the Introspective Adversarial Network, a novel hybridization of the VAE and GAN. Our model efficiently captures long-range dependencies through use of a computational block based on weight-shared dilated convolutions, and improves generalization performance with Orthogonal Regularization, a novel weight regularization method. We validate our contributions on CelebA, SVHN, and CIFAR-100, and produce samples and reconstructions with high visual fidelity.
Tasks Image Generation
Published 2016-09-22
URL http://arxiv.org/abs/1609.07093v3
PDF http://arxiv.org/pdf/1609.07093v3.pdf
PWC https://paperswithcode.com/paper/neural-photo-editing-with-introspective
Repo https://github.com/ajbrock/Neural-Photo-Editor
Framework tf
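
Of the contributions listed, Orthogonal Regularization is the most self-contained to sketch: penalize each weight matrix's deviation from orthogonality via ||W Wᵀ - I||. The coefficient and the choice of which layers to regularize below are assumptions, not the paper's settings.

```python
# Orthogonal Regularization: sum of squared deviations of each weight
# matrix's Gram matrix from the identity, added to the task loss.
import torch

def orthogonal_regularizer(model, beta=1e-4):
    penalty = 0.0
    for w in model.parameters():
        if w.ndim < 2:
            continue                     # skip biases and scalar parameters
        w2 = w.reshape(w.size(0), -1)    # flatten conv kernels to (out, in*k*k)
        gram = w2 @ w2.t()
        penalty = penalty + ((gram - torch.eye(w2.size(0))) ** 2).sum()
    return beta * penalty                # add this to the task loss
```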

Full Resolution Image Compression with Recurrent Neural Networks

Title Full Resolution Image Compression with Recurrent Neural Networks
Authors George Toderici, Damien Vincent, Nick Johnston, Sung Jin Hwang, David Minnen, Joel Shor, Michele Covell
Abstract This paper presents a set of full-resolution lossy image compression methods based on neural networks. Each of the architectures we describe can provide variable compression rates during deployment without requiring retraining of the network: each network need only be trained once. All of our architectures consist of a recurrent neural network (RNN)-based encoder and decoder, a binarizer, and a neural network for entropy coding. We compare RNN types (LSTM, associative LSTM) and introduce a new hybrid of GRU and ResNet. We also study “one-shot” versus additive reconstruction architectures and introduce a new scaled-additive framework. We compare to previous work, showing improvements of 4.3%-8.8% AUC (area under the rate-distortion curve), depending on the perceptual metric used. As far as we know, this is the first neural network architecture that is able to outperform JPEG at image compression across most bitrates on the rate-distortion curve on the Kodak dataset images, with and without the aid of entropy coding.
Tasks Image Compression
Published 2016-08-18
URL http://arxiv.org/abs/1608.05148v2
PDF http://arxiv.org/pdf/1608.05148v2.pdf
PWC https://paperswithcode.com/paper/full-resolution-image-compression-with
Repo https://github.com/SimonTsungHanKuo/ImageCompzByGRU
Framework pytorch
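
The additive-reconstruction idea can be sketched in a few lines: each iteration encodes the current residual, binarizes the code, decodes an update, and adds it to the running reconstruction, so truncating the bit stream early simply yields a coarser image. Encoder, decoder, and binarizer below are trivial stand-ins for the paper's recurrent components.

```python
# Additive reconstruction loop for progressive image compression (stand-in
# convolutions in place of the paper's RNN encoder/decoder).
import torch
import torch.nn as nn

enc = nn.Conv2d(3, 8, 3, padding=1)       # stand-in encoder -> 8-channel code
dec = nn.Conv2d(8, 3, 3, padding=1)       # stand-in decoder

def compress(x, n_iters=4):
    recon = torch.zeros_like(x)
    codes = []
    for _ in range(n_iters):
        residual = x - recon
        bits = torch.sign(enc(residual))  # binarizer (needs a straight-through
        codes.append(bits)                # estimator during training)
        recon = recon + dec(bits)         # additive reconstruction
    return codes, recon

codes, recon = compress(torch.randn(1, 3, 32, 32))
```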