May 7, 2019

2677 words 13 mins read

Paper Group AWR 70


Early Methods for Detecting Adversarial Images. FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks. Unrolled Generative Adversarial Networks. A Neural Approach to Blind Motion Deblurring. YouTube-8M: A Large-Scale Video Classification Benchmark. Deep Reinforcement Learning Radio Control and Signal Detection with KeRLym, a Gym RL A …

Early Methods for Detecting Adversarial Images

Title Early Methods for Detecting Adversarial Images
Authors Dan Hendrycks, Kevin Gimpel
Abstract Many machine learning classifiers are vulnerable to adversarial perturbations. An adversarial perturbation modifies an input to change a classifier’s prediction without causing the input to seem substantially different to human perception. We deploy three methods to detect adversarial images. Adversaries trying to bypass our detectors must make the adversarial image less pathological or they will fail trying. Our best detection method reveals that adversarial images place abnormal emphasis on the lower-ranked principal components from PCA. Other detectors and a colorful saliency map are in an appendix.
Tasks
Published 2016-08-01
URL http://arxiv.org/abs/1608.00530v2
PDF http://arxiv.org/pdf/1608.00530v2.pdf
PWC https://paperswithcode.com/paper/early-methods-for-detecting-adversarial
Repo https://github.com/hendrycks/fooling
Framework tf
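The paper's best detector can be sketched in a few lines: fit principal components on clean inputs, then score a new input by how much of its energy lands on the lowest-ranked components. The NumPy sketch below illustrates that statistic only; the component count `k` and the data are illustrative, not the paper's setup.

```python
import numpy as np

def pca_basis(X):
    # Principal directions of a set of clean inputs, highest variance first.
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt

def low_rank_energy(x, Vt, k):
    # Fraction of x's energy carried by the k lowest-ranked components;
    # adversarial inputs tend to score abnormally high here.
    coeffs = Vt @ x
    return float(np.sum(coeffs[-k:] ** 2) / np.sum(coeffs ** 2))
```

An input perturbed along a low-variance direction scores visibly higher than a clean one, which is the abnormal emphasis the detector looks for.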

FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks

Title FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks
Authors Eddy Ilg, Nikolaus Mayer, Tonmoy Saikia, Margret Keuper, Alexey Dosovitskiy, Thomas Brox
Abstract The FlowNet demonstrated that optical flow estimation can be cast as a learning problem. However, the state of the art with regard to the quality of the flow has still been defined by traditional methods. Particularly on small displacements and real-world data, FlowNet cannot compete with variational methods. In this paper, we advance the concept of end-to-end learning of optical flow and make it work really well. The large improvements in quality and speed are caused by three major contributions: first, we focus on the training data and show that the schedule of presenting data during training is very important. Second, we develop a stacked architecture that includes warping of the second image with intermediate optical flow. Third, we elaborate on small displacements by introducing a sub-network specializing on small motions. FlowNet 2.0 is only marginally slower than the original FlowNet but decreases the estimation error by more than 50%. It performs on par with state-of-the-art methods, while running at interactive frame rates. Moreover, we present faster variants that allow optical flow computation at up to 140fps with accuracy matching the original FlowNet.
Tasks Dense Pixel Correspondence Estimation, Optical Flow Estimation, Skeleton Based Action Recognition
Published 2016-12-06
URL http://arxiv.org/abs/1612.01925v1
PDF http://arxiv.org/pdf/1612.01925v1.pdf
PWC https://paperswithcode.com/paper/flownet-20-evolution-of-optical-flow
Repo https://github.com/rickyHong/tfoptflow-repl
Framework tf
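The warping the stacked architecture relies on is backward bilinear sampling of the second image with the intermediate flow. A minimal single-channel NumPy sketch of that operation (border-clamped; in the paper this is a differentiable layer inside the network):

```python
import numpy as np

def warp(img, flow):
    # Backward-warp img (H, W) by a per-pixel flow (H, W, 2): each output
    # pixel samples img at (x + u, y + v) with bilinear interpolation.
    H, W = img.shape
    ys, xs = np.mgrid[0:H, 0:W].astype(float)
    sx = np.clip(xs + flow[..., 0], 0, W - 1)
    sy = np.clip(ys + flow[..., 1], 0, H - 1)
    x0, y0 = np.floor(sx).astype(int), np.floor(sy).astype(int)
    x1, y1 = np.minimum(x0 + 1, W - 1), np.minimum(y0 + 1, H - 1)
    wx, wy = sx - x0, sy - y0
    return ((1 - wx) * (1 - wy) * img[y0, x0] + wx * (1 - wy) * img[y0, x1]
            + (1 - wx) * wy * img[y1, x0] + wx * wy * img[y1, x1])
```

Warping the second frame back with the correct flow reproduces the first frame, which is what lets the next network in the stack focus on the residual motion.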

Unrolled Generative Adversarial Networks

Title Unrolled Generative Adversarial Networks
Authors Luke Metz, Ben Poole, David Pfau, Jascha Sohl-Dickstein
Abstract We introduce a method to stabilize Generative Adversarial Networks (GANs) by defining the generator objective with respect to an unrolled optimization of the discriminator. This allows training to be adjusted between using the optimal discriminator in the generator’s objective, which is ideal but infeasible in practice, and using the current value of the discriminator, which is often unstable and leads to poor solutions. We show how this technique solves the common problem of mode collapse, stabilizes training of GANs with complex recurrent generators, and increases diversity and coverage of the data distribution by the generator.
Tasks
Published 2016-11-07
URL http://arxiv.org/abs/1611.02163v4
PDF http://arxiv.org/pdf/1611.02163v4.pdf
PWC https://paperswithcode.com/paper/unrolled-generative-adversarial-networks
Repo https://github.com/alex98chen/testGAN
Framework tf
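The core trick can be shown with a toy scalar game: take k gradient steps on a copy of the discriminator, then evaluate the generator against the unrolled copy while the real discriminator stays put. In the paper the generator's gradient flows back through the unrolled steps via automatic differentiation; this sketch shows only the forward computation.

```python
def unrolled_generator_loss(g, d, k, lr, d_grad, g_loss):
    # Unroll k gradient-ascent steps of the discriminator from its current
    # parameters, then evaluate the generator loss at the unrolled copy.
    # The discriminator's actual parameters are left untouched.
    d_k = d
    for _ in range(k):
        d_k = d_k + lr * d_grad(g, d_k)
    return g_loss(g, d_k)
```

With k = 0 this reduces to the standard GAN objective; larger k moves the generator's target toward the (infeasible) optimal discriminator.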

A Neural Approach to Blind Motion Deblurring

Title A Neural Approach to Blind Motion Deblurring
Authors Ayan Chakrabarti
Abstract We present a new method for blind motion deblurring that uses a neural network trained to compute estimates of sharp image patches from observations that are blurred by an unknown motion kernel. Instead of regressing directly to patch intensities, this network learns to predict the complex Fourier coefficients of a deconvolution filter to be applied to the input patch for restoration. For inference, we apply the network independently to all overlapping patches in the observed image, and average its outputs to form an initial estimate of the sharp image. We then explicitly estimate a single global blur kernel by relating this estimate to the observed image, and finally perform non-blind deconvolution with this kernel. Our method exhibits accuracy and robustness close to state-of-the-art iterative methods, while being much faster when parallelized on GPU hardware.
Tasks Deblurring
Published 2016-03-15
URL http://arxiv.org/abs/1603.04771v2
PDF http://arxiv.org/pdf/1603.04771v2.pdf
PWC https://paperswithcode.com/paper/a-neural-approach-to-blind-motion-deblurring
Repo https://github.com/ayanc/ndeblur
Framework none
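The restoration step is a pointwise multiplication in the Fourier domain. The paper's network *predicts* the complex filter coefficients per patch; as a stand-in with the same multiply-in-frequency form, a classical Wiener filter is sketched below (the `snr` regularizer is illustrative, not from the paper).

```python
import numpy as np

def wiener_deconv(blurred, H, snr=1e-3):
    # Pointwise Fourier-domain restoration: multiply the blurred spectrum by
    # conj(H) / (|H|^2 + snr), the Wiener approximation to 1/H.
    G = np.fft.fft2(blurred)
    F = np.conj(H) / (np.abs(H) ** 2 + snr) * G
    return np.real(np.fft.ifft2(F))
```

Given the (circular) blur's frequency response H, this recovers the sharp image up to the regularizer's attenuation.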

YouTube-8M: A Large-Scale Video Classification Benchmark

Title YouTube-8M: A Large-Scale Video Classification Benchmark
Authors Sami Abu-El-Haija, Nisarg Kothari, Joonseok Lee, Paul Natsev, George Toderici, Balakrishnan Varadarajan, Sudheendra Vijayanarasimhan
Abstract Many recent advancements in Computer Vision are attributed to large datasets. Open-source software packages for Machine Learning and inexpensive commodity hardware have reduced the barrier of entry for exploring novel approaches at scale. It is possible to train models over millions of examples within a few days. Although large-scale datasets exist for image understanding, such as ImageNet, there are no video classification datasets of comparable size. In this paper, we introduce YouTube-8M, the largest multi-label video classification dataset, composed of ~8 million videos (500K hours of video), annotated with a vocabulary of 4800 visual entities. To get the videos and their labels, we used a YouTube video annotation system, which labels videos with their main topics. While the labels are machine-generated, they have high precision and are derived from a variety of human-based signals including metadata and query click signals. We filtered the video labels (Knowledge Graph entities) using both automated and manual curation strategies, including asking human raters if the labels are visually recognizable. Then, we decoded each video at one frame per second, and used a Deep CNN pre-trained on ImageNet to extract the hidden representation immediately prior to the classification layer. Finally, we compressed the frame features and made both the features and video-level labels available for download. We trained various (modest) classification models on the dataset, evaluated them using popular evaluation metrics, and report the results as baselines. Despite the size of the dataset, some of our models train to convergence in less than a day on a single machine using TensorFlow. We plan to release code for training a TensorFlow model and for computing metrics.
Tasks Action Recognition In Videos, Video Classification
Published 2016-09-27
URL http://arxiv.org/abs/1609.08675v1
PDF http://arxiv.org/pdf/1609.08675v1.pdf
PWC https://paperswithcode.com/paper/youtube-8m-a-large-scale-video-classification
Repo https://github.com/taufikxu/youtube
Framework tf
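A minimal version of the released setup: mean-pool the per-second frame features into one video-level descriptor, then score each entity with an independent sigmoid, matching the multi-label formulation. The weights `W`, `b` below are hypothetical toy values, not a trained baseline.

```python
import numpy as np

def video_level_scores(frame_feats, W, b):
    # Mean-pool frame features (T, D) into a single video descriptor,
    # then apply an independent per-entity sigmoid (multi-label scoring).
    v = frame_feats.mean(axis=0)
    return 1.0 / (1.0 + np.exp(-(v @ W + b)))
```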

Deep Reinforcement Learning Radio Control and Signal Detection with KeRLym, a Gym RL Agent

Title Deep Reinforcement Learning Radio Control and Signal Detection with KeRLym, a Gym RL Agent
Authors Timothy J. O’Shea, T. Charles Clancy
Abstract This paper presents research in progress investigating the viability and adaptation of reinforcement learning using deep neural network based function approximation for the task of radio control and signal detection in the wireless domain. We demonstrate a successful initial method for radio control which allows naive learning of search without the need for expert features, heuristics, or search strategies. We also introduce KeRLym, an open, Keras-based reinforcement learning agent collection for OpenAI’s Gym.
Tasks
Published 2016-05-30
URL http://arxiv.org/abs/1605.09221v1
PDF http://arxiv.org/pdf/1605.09221v1.pdf
PWC https://paperswithcode.com/paper/deep-reinforcement-learning-radio-control-and
Repo https://github.com/osh/kerlym
Framework tf
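The naive search-learning idea can be illustrated without Keras or Gym: an epsilon-greedy agent learns which channel carries a signal purely from reward. This dependency-free toy keeps only the exploration idea; it is not KeRLym's actual agent code, and the environment is an invented stand-in.

```python
import random

class ChannelEnv:
    # Toy stand-in for the radio-search task: the agent tunes to one of
    # n_channels each step and gets reward 1 only on the occupied channel.
    def __init__(self, n_channels=4, occupied=2):
        self.n, self.occupied = n_channels, occupied

    def step(self, action):
        return 1.0 if action == self.occupied else 0.0

def train_bandit(env, episodes=2000, eps=0.1, lr=0.1, seed=0):
    # Epsilon-greedy value learning: explore with probability eps,
    # otherwise pick the channel with the highest running value estimate.
    rng = random.Random(seed)
    q = [0.0] * env.n
    for _ in range(episodes):
        if rng.random() < eps:
            a = rng.randrange(env.n)
        else:
            a = max(range(env.n), key=q.__getitem__)
        r = env.step(a)
        q[a] += lr * (r - q[a])
    return q
```

After training, the value estimates concentrate on the occupied channel, i.e. the agent has learned the search without any expert features.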

Multilingual Part-of-Speech Tagging with Bidirectional Long Short-Term Memory Models and Auxiliary Loss

Title Multilingual Part-of-Speech Tagging with Bidirectional Long Short-Term Memory Models and Auxiliary Loss
Authors Barbara Plank, Anders Søgaard, Yoav Goldberg
Abstract Bidirectional long short-term memory (bi-LSTM) networks have recently proven successful for various NLP sequence modeling tasks, but little is known about their reliance on input representations, target languages, data set size, and label noise. We address these issues and evaluate bi-LSTMs with word, character, and unicode byte embeddings for POS tagging. We compare bi-LSTMs to traditional POS taggers across languages and data sizes. We also present a novel bi-LSTM model, which combines the POS tagging loss function with an auxiliary loss function that accounts for rare words. The model obtains state-of-the-art performance across 22 languages, and works especially well for morphologically complex languages. Our analysis suggests that bi-LSTMs are less sensitive to training data size and label corruptions (at small noise levels) than previously assumed.
Tasks Part-Of-Speech Tagging
Published 2016-04-19
URL http://arxiv.org/abs/1604.05529v3
PDF http://arxiv.org/pdf/1604.05529v3.pdf
PWC https://paperswithcode.com/paper/multilingual-part-of-speech-tagging-with
Repo https://github.com/timerstime/SDG4DA
Framework tf
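The combined objective is simply the tagging loss plus a weighted auxiliary term that predicts the word's (log) frequency, which pushes rare words toward better representations. The sketch below uses a squared-error auxiliary term and an illustrative weight `alpha`; the paper's exact formulation may differ.

```python
import numpy as np

def cross_entropy(logits, target):
    # Numerically stable negative log-probability of the gold class.
    z = logits - logits.max()
    logp = z - np.log(np.exp(z).sum())
    return -logp[target]

def joint_loss(tag_logits, gold_tag, logfreq_pred, gold_logfreq, alpha=0.1):
    # Main POS tagging loss plus the auxiliary frequency-prediction loss.
    aux = (logfreq_pred - gold_logfreq) ** 2
    return cross_entropy(tag_logits, gold_tag) + alpha * aux
```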

Recurrent switching linear dynamical systems

Title Recurrent switching linear dynamical systems
Authors Scott W. Linderman, Andrew C. Miller, Ryan P. Adams, David M. Blei, Liam Paninski, Matthew J. Johnson
Abstract Many natural systems, such as neurons firing in the brain or basketball teams traversing a court, give rise to time series data with complex, nonlinear dynamics. We can gain insight into these systems by decomposing the data into segments that are each explained by simpler dynamic units. Building on switching linear dynamical systems (SLDS), we present a new model class that not only discovers these dynamical units, but also explains how their switching behavior depends on observations or continuous latent states. These “recurrent” switching linear dynamical systems provide further insight by discovering the conditions under which each unit is deployed, something that traditional SLDS models fail to do. We leverage recent algorithmic advances in approximate inference to make Bayesian inference in these models easy, fast, and scalable.
Tasks Bayesian Inference, Time Series
Published 2016-10-26
URL http://arxiv.org/abs/1610.08466v1
PDF http://arxiv.org/pdf/1610.08466v1.pdf
PWC https://paperswithcode.com/paper/recurrent-switching-linear-dynamical-systems
Repo https://github.com/slinderman/pypolyagamma
Framework none
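The "recurrent" link is easiest to see generatively: the next discrete state is drawn from a softmax over the current continuous state, and the continuous state then follows the active unit's linear dynamics. The parameters below are illustrative, and inference (the paper's main contribution) is omitted entirely.

```python
import numpy as np

def sample_rslds(T, A, b, R, r, seed=0):
    # A: (K, D, D) dynamics, b: (K, D) offsets, R: (K, D), r: (K,) softmax
    # weights. The discrete state z switches based on the continuous state x.
    rng = np.random.default_rng(seed)
    K, D = b.shape
    x = np.zeros(D)
    zs, xs = [], []
    for _ in range(T):
        logits = R @ x + r                     # recurrence: x drives switching
        p = np.exp(logits - logits.max())
        p /= p.sum()
        z = rng.choice(K, p=p)
        x = A[z] @ x + b[z] + 0.01 * rng.standard_normal(D)
        zs.append(z)
        xs.append(x)
    return np.array(zs), np.array(xs)
```

With opposing dynamics and a sharp recurrent link, the sampled trajectory settles into one unit, illustrating how the model segments time series into regimes.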

Post Training in Deep Learning with Last Kernel

Title Post Training in Deep Learning with Last Kernel
Authors Thomas Moreau, Julien Audiffren
Abstract One of the main challenges of deep learning methods is the choice of an appropriate training strategy. In particular, additional steps, such as unsupervised pre-training, have been shown to greatly improve the performances of deep structures. In this article, we propose an extra training step, called post-training, which only optimizes the last layer of the network. We show that this procedure can be analyzed in the context of kernel theory, with the first layers computing an embedding of the data and the last layer a statistical model to solve the task based on this embedding. This step makes sure that the embedding, or representation, of the data is used in the best possible way for the considered task. This idea is then tested on multiple architectures with various data sets, showing that it consistently provides a boost in performance.
Tasks
Published 2016-11-14
URL http://arxiv.org/abs/1611.04499v2
PDF http://arxiv.org/pdf/1611.04499v2.pdf
PWC https://paperswithcode.com/paper/post-training-in-deep-learning-with-last
Repo https://github.com/tomMoral/post_training
Framework tf
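The post-training step freezes the learned embedding and refits only the last layer. With a squared loss this even has a closed form; the ridge refit below stands in for the paper's gradient-based optimization of the last layer.

```python
import numpy as np

def post_train_last_layer(feats, y, reg=1e-3):
    # Refit only the final linear layer on the frozen features (the network's
    # penultimate activations), via ridge regression in closed form.
    d = feats.shape[1]
    W = np.linalg.solve(feats.T @ feats + reg * np.eye(d), feats.T @ y)
    return W
```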

Ultimate tensorization: compressing convolutional and FC layers alike

Title Ultimate tensorization: compressing convolutional and FC layers alike
Authors Timur Garipov, Dmitry Podoprikhin, Alexander Novikov, Dmitry Vetrov
Abstract Convolutional neural networks excel in image recognition tasks, but this comes at the cost of high computational and memory complexity. To tackle this problem, [1] developed a tensor factorization framework to compress fully-connected layers. In this paper, we focus on compressing convolutional layers. We show that while the direct application of the tensor framework [1] to the 4-dimensional kernel of convolution does compress the layer, we can do better. We reshape the convolutional kernel into a tensor of higher order and factorize it. We combine the proposed approach with the previous work to compress both convolutional and fully-connected layers of a network and achieve 80x network compression rate with 1.1% accuracy drop on the CIFAR-10 dataset.
Tasks
Published 2016-11-10
URL http://arxiv.org/abs/1611.03214v1
PDF http://arxiv.org/pdf/1611.03214v1.pdf
PWC https://paperswithcode.com/paper/ultimate-tensorization-compressing
Repo https://github.com/timgaripov/TensorNet-TF
Framework tf
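The arithmetic behind such compression rates is easy to check: reshape the 4-D convolution kernel into a higher-order tensor and count the parameters of its tensor-train factorization. The mode sizes and TT-ranks below are illustrative, not the paper's exact settings.

```python
def tt_param_count(shape, ranks):
    # Parameter count of a tensor-train factorization with the given mode
    # sizes n_k and TT-ranks r_k (boundary ranks r_0 = r_d = 1):
    # sum over k of r_k * n_k * r_{k+1}.
    r = [1] + list(ranks) + [1]
    return sum(r[k] * n * r[k + 1] for k, n in enumerate(shape))
```

For example, a 3x3 convolution mapping 64 to 64 channels (36864 weights), reshaped to modes [3, 3, 8, 8, 8, 8] with all TT-ranks 4, needs only 476 parameters, a ~77x reduction of the same order as the paper's reported rate.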

Sample Efficient Actor-Critic with Experience Replay

Title Sample Efficient Actor-Critic with Experience Replay
Authors Ziyu Wang, Victor Bapst, Nicolas Heess, Volodymyr Mnih, Remi Munos, Koray Kavukcuoglu, Nando de Freitas
Abstract This paper presents an actor-critic deep reinforcement learning agent with experience replay that is stable, sample efficient, and performs remarkably well on challenging environments, including the discrete 57-game Atari domain and several continuous control problems. To achieve this, the paper introduces several innovations, including truncated importance sampling with bias correction, stochastic dueling network architectures, and a new trust region policy optimization method.
Tasks Continuous Control
Published 2016-11-03
URL http://arxiv.org/abs/1611.01224v2
PDF http://arxiv.org/pdf/1611.01224v2.pdf
PWC https://paperswithcode.com/paper/sample-efficient-actor-critic-with-experience
Repo https://github.com/neilsgp/RL-Algorithms
Framework none
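The truncated importance sampling with bias correction splits the importance ratio rho = pi/mu into a clipped off-policy weight min(rho, c) and a correction coefficient [1 - c/rho]_+ applied to the on-policy expectation term, so no probability mass is lost:

```python
def truncated_is(pi, mu, c=10.0):
    # Returns (truncated weight, bias-correction coefficient) for an action
    # with target-policy probability pi and behavior-policy probability mu.
    rho = pi / mu
    return min(rho, c), max(0.0, 1.0 - c / rho)
```

When rho is below the cap the correction vanishes and the update is the usual importance-weighted one; when rho is huge, the truncation bounds the variance and the correction term recovers the clipped contribution on-policy.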

Multi-layer Representation Learning for Medical Concepts

Title Multi-layer Representation Learning for Medical Concepts
Authors Edward Choi, Mohammad Taha Bahadori, Elizabeth Searles, Catherine Coffey, Jimeng Sun
Abstract Learning efficient representations for concepts has been proven to be an important basis for many applications such as machine translation or document classification. Proper representations of medical concepts such as diagnosis, medication, procedure codes and visits will have broad applications in healthcare analytics. However, in Electronic Health Records (EHR) the visit sequences of patients include multiple concepts (diagnosis, procedure, and medication codes) per visit. This structure provides two types of relational information, namely sequential order of visits and co-occurrence of the codes within each visit. In this work, we propose Med2Vec, which not only learns distributed representations for both medical codes and visits from a large EHR dataset with over 3 million visits, but also allows us to interpret the learned representations, with the interpretations confirmed positively by clinical experts. In the experiments, Med2Vec displays significant improvement in key medical applications compared to popular baselines such as Skip-gram, GloVe and stacked autoencoder, while providing clinically meaningful interpretation.
Tasks Document Classification, Machine Translation, Medical Diagnosis, Representation Learning
Published 2016-02-17
URL http://arxiv.org/abs/1602.05568v1
PDF http://arxiv.org/pdf/1602.05568v1.pdf
PWC https://paperswithcode.com/paper/multi-layer-representation-learning-for
Repo https://github.com/mp2893/med2vec
Framework none
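The code-level part of the objective trains on co-occurrence of codes within a visit; generating those (center, context) training pairs from a visit sequence is straightforward:

```python
from itertools import permutations

def intra_visit_pairs(visits):
    # Every ordered pair of codes that co-occur in the same visit becomes a
    # (center, context) training pair, skip-gram style but visit-bounded.
    pairs = []
    for codes in visits:
        pairs.extend(permutations(codes, 2))
    return pairs
```

The visit-level part of Med2Vec additionally predicts codes in neighboring visits, capturing the sequential order that a bag of within-visit pairs alone would miss.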

PVANet: Lightweight Deep Neural Networks for Real-time Object Detection

Title PVANet: Lightweight Deep Neural Networks for Real-time Object Detection
Authors Sanghoon Hong, Byungseok Roh, Kye-Hyeon Kim, Yeongjae Cheon, Minje Park
Abstract In object detection, reducing computational cost is as important as improving accuracy for most practical usages. This paper proposes a novel network structure, which is an order of magnitude lighter than other state-of-the-art networks while maintaining the accuracy. Based on the basic principle of more layers with fewer channels, this new deep neural network minimizes its redundancy by adopting recent innovations including C.ReLU and Inception structure. We also show that this network can be trained efficiently to achieve solid results on well-known object detection benchmarks: 84.9% and 84.2% mAP on VOC2007 and VOC2012 while the required compute is less than 10% of the recent ResNet-101.
Tasks Object Detection, Real-Time Object Detection
Published 2016-11-23
URL http://arxiv.org/abs/1611.08588v2
PDF http://arxiv.org/pdf/1611.08588v2.pdf
PWC https://paperswithcode.com/paper/pvanet-lightweight-deep-neural-networks-for
Repo https://github.com/busyboxs/Some-resources-useful-for-me
Framework tf
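C.ReLU exploits the observation that filters in early convolution layers tend to come in negated pairs: compute only half the channels, then concatenate the negation before the nonlinearity, roughly halving that layer's compute. A minimal sketch of the activation itself:

```python
import numpy as np

def crelu(x):
    # Concatenated ReLU: double the channel dimension with the negation of
    # the input, then apply ReLU, so both signs of each filter are kept.
    return np.maximum(np.concatenate([x, -x], axis=-1), 0.0)
```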

Relief R-CNN : Utilizing Convolutional Features for Fast Object Detection

Title Relief R-CNN : Utilizing Convolutional Features for Fast Object Detection
Authors Guiying Li, Junlong Liu, Chunhui Jiang, Liangpeng Zhang, Minlong Lin, Ke Tang
Abstract R-CNN style methods are among the state-of-the-art object detection methods; they consist of region proposal generation and deep CNN classification. However, the proposal generation phase in this paradigm is usually time-consuming, which slows down the whole detection pipeline at test time. This paper suggests that the value discrepancies among features in deep convolutional feature maps contain plenty of useful spatial information, and proposes a simple approach to extract the information for fast region proposal generation in testing. The proposed method, namely Relief R-CNN (R2-CNN), adopts a novel region proposal generator in a trained R-CNN style model. The new generator directly generates proposals from convolutional features by some simple rules, thus resulting in a much faster proposal generation speed and a lower demand for computation resources. Empirical studies show that R2-CNN could achieve the fastest detection speed with comparable accuracy among all the compared algorithms in testing.
Tasks Object Detection, Real-Time Object Detection
Published 2016-01-25
URL http://arxiv.org/abs/1601.06719v4
PDF http://arxiv.org/pdf/1601.06719v4.pdf
PWC https://paperswithcode.com/paper/relief-r-cnn-utilizing-convolutional-features
Repo https://github.com/IdiosyncraticDragon/relief_rcnn
Framework none
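The idea of reading proposals directly off the feature map can be sketched with a single simple rule: cells with unusually large values indicate object-like regions, so take the bounding box of above-average activations. This one-box toy is far simpler than the paper's actual generator and is only meant to illustrate the principle.

```python
import numpy as np

def proposal_from_feature_map(fmap, thresh=None):
    # Threshold the conv feature map at its mean and return the bounding box
    # (x_min, y_min, x_max, y_max) of the above-threshold cells, or None.
    if thresh is None:
        thresh = fmap.mean()
    ys, xs = np.where(fmap > thresh)
    if len(ys) == 0:
        return None
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())
```

Because this reuses features the detector has already computed, it adds almost no cost at test time, which is where the speedup comes from.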

Predictive Business Process Monitoring with LSTM Neural Networks

Title Predictive Business Process Monitoring with LSTM Neural Networks
Authors Niek Tax, Ilya Verenich, Marcello La Rosa, Marlon Dumas
Abstract Predictive business process monitoring methods exploit logs of completed cases of a process in order to make predictions about running cases thereof. Existing methods in this space are tailor-made for specific prediction tasks. Moreover, their relative accuracy is highly sensitive to the dataset at hand, thus requiring users to engage in trial-and-error and tuning when applying them in a specific setting. This paper investigates Long Short-Term Memory (LSTM) neural networks as an approach to build consistently accurate models for a wide range of predictive process monitoring tasks. First, we show that LSTMs outperform existing techniques to predict the next event of a running case and its timestamp. Next, we show how to use models for predicting the next task in order to predict the full continuation of a running case. Finally, we apply the same approach to predict the remaining time, and show that this approach outperforms existing tailor-made methods.
Tasks Multivariate Time Series Forecasting, Time Series Prediction
Published 2016-12-07
URL http://arxiv.org/abs/1612.02130v2
PDF http://arxiv.org/pdf/1612.02130v2.pdf
PWC https://paperswithcode.com/paper/predictive-business-process-monitoring-with
Repo https://github.com/verenich/ProcessSequencePrediction
Framework none
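Predicting the full continuation of a running case reduces to repeatedly feeding the predicted next event back into the model until the end symbol appears. In the sketch below, `next_event` is a toy lookup standing in for the trained LSTM, and the `max_len` cap is an illustrative safeguard.

```python
def predict_suffix(prefix, next_event, end="END", max_len=20):
    # Iteratively extend the running case with the model's most likely next
    # event until the end-of-case symbol (or a length cap) is reached.
    trace = list(prefix)
    while len(trace) < max_len:
        e = next_event(trace)
        if e == end:
            break
        trace.append(e)
    return trace[len(prefix):]
```

The same loop with a timestamp head at each step yields the remaining-time prediction described in the abstract.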