May 7, 2019

2677 words 13 mins read

Paper Group AWR 70


Early Methods for Detecting Adversarial Images. FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks. Unrolled Generative Adversarial Networks. A Neural Approach to Blind Motion Deblurring. YouTube-8M: A Large-Scale Video Classification Benchmark. Deep Reinforcement Learning Radio Control and Signal Detection with KeRLym, a Gym RL A …

Early Methods for Detecting Adversarial Images

Title Early Methods for Detecting Adversarial Images
Authors Dan Hendrycks, Kevin Gimpel
Abstract Many machine learning classifiers are vulnerable to adversarial perturbations. An adversarial perturbation modifies an input to change a classifier’s prediction without causing the input to seem substantially different to human perception. We deploy three methods to detect adversarial images. Adversaries trying to bypass our detectors must make the adversarial image less pathological or they will fail trying. Our best detection method reveals that adversarial images place abnormal emphasis on the lower-ranked principal components from PCA. Other detectors and a colorful saliency map are in an appendix.
Tasks
Published 2016-08-01
URL http://arxiv.org/abs/1608.00530v2
PDF http://arxiv.org/pdf/1608.00530v2.pdf
PWC https://paperswithcode.com/paper/early-methods-for-detecting-adversarial
Repo https://github.com/hendrycks/fooling
Framework tf
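The paper's best detector can be sketched in a few lines: fit principal components on clean inputs, then score a new input by how much of its energy lands on the lowest-ranked components. The NumPy sketch below illustrates that statistic only; the component count `k` and the data are illustrative, not the paper's setup.

```python
import numpy as np

def pca_basis(X):
    # Principal directions of a set of clean inputs, highest variance first.
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt

def low_rank_energy(x, Vt, k):
    # Fraction of x's energy carried by the k lowest-ranked components;
    # adversarial inputs tend to score abnormally high here.
    coeffs = Vt @ x
    return float(np.sum(coeffs[-k:] ** 2) / np.sum(coeffs ** 2))
```

An input perturbed along a low-variance direction scores visibly higher than a clean one, which is the abnormal emphasis the detector looks for.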

FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks

Title FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks
Authors Eddy Ilg, Nikolaus Mayer, Tonmoy Saikia, Margret Keuper, Alexey Dosovitskiy, Thomas Brox
Abstract The FlowNet demonstrated that optical flow estimation can be cast as a learning problem. However, the state of the art with regard to the quality of the flow has still been defined by traditional methods. Particularly on small displacements and real-world data, FlowNet cannot compete with variational methods. In this paper, we advance the concept of end-to-end learning of optical flow and make it work really well. The large improvements in quality and speed are caused by three major contributions: first, we focus on the training data and show that the schedule of presenting data during training is very important. Second, we develop a stacked architecture that includes warping of the second image with intermediate optical flow. Third, we elaborate on small displacements by introducing a sub-network specializing on small motions. FlowNet 2.0 is only marginally slower than the original FlowNet but decreases the estimation error by more than 50%. It performs on par with state-of-the-art methods, while running at interactive frame rates. Moreover, we present faster variants that allow optical flow computation at up to 140fps with accuracy matching the original FlowNet.
Tasks Dense Pixel Correspondence Estimation, Optical Flow Estimation, Skeleton Based Action Recognition
Published 2016-12-06
URL http://arxiv.org/abs/1612.01925v1
PDF http://arxiv.org/pdf/1612.01925v1.pdf
PWC https://paperswithcode.com/paper/flownet-20-evolution-of-optical-flow
Repo https://github.com/rickyHong/tfoptflow-repl
Framework tf
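The warping the stacked architecture relies on is backward bilinear sampling of the second image with the intermediate flow. A minimal single-channel NumPy sketch of that operation (border-clamped; in the paper this is a differentiable layer inside the network):

```python
import numpy as np

def warp(img, flow):
    # Backward-warp img (H, W) by a per-pixel flow (H, W, 2): each output
    # pixel samples img at (x + u, y + v) with bilinear interpolation.
    H, W = img.shape
    ys, xs = np.mgrid[0:H, 0:W].astype(float)
    sx = np.clip(xs + flow[..., 0], 0, W - 1)
    sy = np.clip(ys + flow[..., 1], 0, H - 1)
    x0, y0 = np.floor(sx).astype(int), np.floor(sy).astype(int)
    x1, y1 = np.minimum(x0 + 1, W - 1), np.minimum(y0 + 1, H - 1)
    wx, wy = sx - x0, sy - y0
    return ((1 - wx) * (1 - wy) * img[y0, x0] + wx * (1 - wy) * img[y0, x1]
            + (1 - wx) * wy * img[y1, x0] + wx * wy * img[y1, x1])
```

Warping the second frame back with the correct flow reproduces the first frame, which is what lets the next network in the stack focus on the residual motion.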

Unrolled Generative Adversarial Networks

Title Unrolled Generative Adversarial Networks
Authors Luke Metz, Ben Poole, David Pfau, Jascha Sohl-Dickstein
Abstract We introduce a method to stabilize Generative Adversarial Networks (GANs) by defining the generator objective with respect to an unrolled optimization of the discriminator. This allows training to be adjusted between using the optimal discriminator in the generator’s objective, which is ideal but infeasible in practice, and using the current value of the discriminator, which is often unstable and leads to poor solutions. We show how this technique solves the common problem of mode collapse, stabilizes training of GANs with complex recurrent generators, and increases diversity and coverage of the data distribution by the generator.
Tasks
Published 2016-11-07
URL http://arxiv.org/abs/1611.02163v4
PDF http://arxiv.org/pdf/1611.02163v4.pdf
PWC https://paperswithcode.com/paper/unrolled-generative-adversarial-networks
Repo https://github.com/alex98chen/testGAN
Framework tf
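The core trick can be shown with a toy scalar game: take k gradient steps on a copy of the discriminator, then evaluate the generator against the unrolled copy while the real discriminator stays put. In the paper the generator's gradient flows back through the unrolled steps via automatic differentiation; this sketch shows only the forward computation.

```python
def unrolled_generator_loss(g, d, k, lr, d_grad, g_loss):
    # Unroll k gradient-ascent steps of the discriminator from its current
    # parameters, then evaluate the generator loss at the unrolled copy.
    # The discriminator's actual parameters are left untouched.
    d_k = d
    for _ in range(k):
        d_k = d_k + lr * d_grad(g, d_k)
    return g_loss(g, d_k)
```

With k = 0 this reduces to the standard GAN objective; larger k moves the generator's target toward the (infeasible) optimal discriminator.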

A Neural Approach to Blind Motion Deblurring

Title A Neural Approach to Blind Motion Deblurring
Authors Ayan Chakrabarti
Abstract We present a new method for blind motion deblurring that uses a neural network trained to compute estimates of sharp image patches from observations that are blurred by an unknown motion kernel. Instead of regressing directly to patch intensities, this network learns to predict the complex Fourier coefficients of a deconvolution filter to be applied to the input patch for restoration. For inference, we apply the network independently to all overlapping patches in the observed image, and average its outputs to form an initial estimate of the sharp image. We then explicitly estimate a single global blur kernel by relating this estimate to the observed image, and finally perform non-blind deconvolution with this kernel. Our method exhibits accuracy and robustness close to state-of-the-art iterative methods, while being much faster when parallelized on GPU hardware.
Tasks Deblurring
Published 2016-03-15
URL http://arxiv.org/abs/1603.04771v2
PDF http://arxiv.org/pdf/1603.04771v2.pdf
PWC https://paperswithcode.com/paper/a-neural-approach-to-blind-motion-deblurring
Repo https://github.com/ayanc/ndeblur
Framework none
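The restoration step is a pointwise multiplication in the Fourier domain. The paper's network *predicts* the complex filter coefficients per patch; as a stand-in with the same multiply-in-frequency form, a classical Wiener filter is sketched below (the `snr` regularizer is illustrative, not from the paper).

```python
import numpy as np

def wiener_deconv(blurred, H, snr=1e-3):
    # Pointwise Fourier-domain restoration: multiply the blurred spectrum by
    # conj(H) / (|H|^2 + snr), the Wiener approximation to 1/H.
    G = np.fft.fft2(blurred)
    F = np.conj(H) / (np.abs(H) ** 2 + snr) * G
    return np.real(np.fft.ifft2(F))
```

Given the (circular) blur's frequency response H, this recovers the sharp image up to the regularizer's attenuation.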

YouTube-8M: A Large-Scale Video Classification Benchmark

Title YouTube-8M: A Large-Scale Video Classification Benchmark
Authors Sami Abu-El-Haija, Nisarg Kothari, Joonseok Lee, Paul Natsev, George Toderici, Balakrishnan Varadarajan, Sudheendra Vijayanarasimhan
Abstract Many recent advancements in Computer Vision are attributed to large datasets. Open-source software packages for Machine Learning and inexpensive commodity hardware have reduced the barrier of entry for exploring novel approaches at scale. It is possible to train models over millions of examples within a few days. Although large-scale datasets exist for image understanding, such as ImageNet, there are no video classification datasets of comparable size. In this paper, we introduce YouTube-8M, the largest multi-label video classification dataset, composed of ~8 million videos (500K hours of video), annotated with a vocabulary of 4800 visual entities. To get the videos and their labels, we used a YouTube video annotation system, which labels videos with their main topics. While the labels are machine-generated, they have high precision and are derived from a variety of human-based signals including metadata and query click signals. We filtered the video labels (Knowledge Graph entities) using both automated and manual curation strategies, including asking human raters if the labels are visually recognizable. Then, we decoded each video at one frame per second, and used a Deep CNN pre-trained on ImageNet to extract the hidden representation immediately prior to the classification layer. Finally, we compressed the frame features and made both the features and video-level labels available for download. We trained various (modest) classification models on the dataset, evaluated them using popular evaluation metrics, and report the results as baselines. Despite the size of the dataset, some of our models train to convergence in less than a day on a single machine using TensorFlow. We plan to release code for training a TensorFlow model and for computing metrics.
Tasks Action Recognition In Videos, Video Classification
Published 2016-09-27
URL http://arxiv.org/abs/1609.08675v1
PDF http://arxiv.org/pdf/1609.08675v1.pdf
PWC https://paperswithcode.com/paper/youtube-8m-a-large-scale-video-classification
Repo https://github.com/taufikxu/youtube
Framework tf
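A minimal version of the released setup: mean-pool the per-second frame features into one video-level descriptor, then score each entity with an independent sigmoid, matching the multi-label formulation. The weights `W`, `b` below are hypothetical toy values, not a trained baseline.

```python
import numpy as np

def video_level_scores(frame_feats, W, b):
    # Mean-pool frame features (T, D) into a single video descriptor,
    # then apply an independent per-entity sigmoid (multi-label scoring).
    v = frame_feats.mean(axis=0)
    return 1.0 / (1.0 + np.exp(-(v @ W + b)))
```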

Deep Reinforcement Learning Radio Control and Signal Detection with KeRLym, a Gym RL Agent

Title Deep Reinforcement Learning Radio Control and Signal Detection with KeRLym, a Gym RL Agent
Authors Timothy J. O’Shea, T. Charles Clancy
Abstract This paper presents research in progress investigating the viability and adaptation of reinforcement learning using deep neural network based function approximation for the task of radio control and signal detection in the wireless domain. We demonstrate a successful initial method for radio control which allows naive learning of search without the need for expert features, heuristics, or search strategies. We also introduce KeRLym, an open, Keras-based reinforcement learning agent collection for OpenAI’s Gym.
Tasks
Published 2016-05-30
URL http://arxiv.org/abs/1605.09221v1
PDF http://arxiv.org/pdf/1605.09221v1.pdf
PWC https://paperswithcode.com/paper/deep-reinforcement-learning-radio-control-and
Repo https://github.com/osh/kerlym
Framework tf
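The naive search-learning idea can be illustrated without Keras or Gym: an epsilon-greedy agent learns which channel carries a signal purely from reward. This dependency-free toy keeps only the exploration idea; it is not KeRLym's actual agent code, and the environment is an invented stand-in.

```python
import random

class ChannelEnv:
    # Toy stand-in for the radio-search task: the agent tunes to one of
    # n_channels each step and gets reward 1 only on the occupied channel.
    def __init__(self, n_channels=4, occupied=2):
        self.n, self.occupied = n_channels, occupied

    def step(self, action):
        return 1.0 if action == self.occupied else 0.0

def train_bandit(env, episodes=2000, eps=0.1, lr=0.1, seed=0):
    # Epsilon-greedy value learning: explore with probability eps,
    # otherwise pick the channel with the highest running value estimate.
    rng = random.Random(seed)
    q = [0.0] * env.n
    for _ in range(episodes):
        if rng.random() < eps:
            a = rng.randrange(env.n)
        else:
            a = max(range(env.n), key=q.__getitem__)
        r = env.step(a)
        q[a] += lr * (r - q[a])
    return q
```

After training, the value estimates concentrate on the occupied channel, i.e. the agent has learned the search without any expert features.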

Multilingual Part-of-Speech Tagging with Bidirectional Long Short-Term Memory Models and Auxiliary Loss

Title Multilingual Part-of-Speech Tagging with Bidirectional Long Short-Term Memory Models and Auxiliary Loss
Authors Barbara Plank, Anders Søgaard, Yoav Goldberg
Abstract Bidirectional long short-term memory (bi-LSTM) networks have recently proven successful for various NLP sequence modeling tasks, but little is known about their reliance on input representations, target languages, data set size, and label noise. We address these issues and evaluate bi-LSTMs with word, character, and unicode byte embeddings for POS tagging. We compare bi-LSTMs to traditional POS taggers across languages and data sizes. We also present a novel bi-LSTM model, which combines the POS tagging loss function with an auxiliary loss function that accounts for rare words. The model obtains state-of-the-art performance across 22 languages, and works especially well for morphologically complex languages. Our analysis suggests that bi-LSTMs are less sensitive to training data size and label corruptions (at small noise levels) than previously assumed.
Tasks Part-Of-Speech Tagging
Published 2016-04-19
URL http://arxiv.org/abs/1604.05529v3
PDF http://arxiv.org/pdf/1604.05529v3.pdf
PWC https://paperswithcode.com/paper/multilingual-part-of-speech-tagging-with
Repo https://github.com/timerstime/SDG4DA
Framework tf
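The combined objective is simply the tagging loss plus a weighted auxiliary term that predicts the word's (log) frequency, which pushes rare words toward better representations. The sketch below uses a squared-error auxiliary term and an illustrative weight `alpha`; the paper's exact formulation may differ.

```python
import numpy as np

def cross_entropy(logits, target):
    # Numerically stable negative log-probability of the gold class.
    z = logits - logits.max()
    logp = z - np.log(np.exp(z).sum())
    return -logp[target]

def joint_loss(tag_logits, gold_tag, logfreq_pred, gold_logfreq, alpha=0.1):
    # Main POS tagging loss plus the auxiliary frequency-prediction loss.
    aux = (logfreq_pred - gold_logfreq) ** 2
    return cross_entropy(tag_logits, gold_tag) + alpha * aux
```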

Recurrent switching linear dynamical systems

Title Recurrent switching linear dynamical systems
Authors Scott W. Linderman, Andrew C. Miller, Ryan P. Adams, David M. Blei, Liam Paninski, Matthew J. Johnson
Abstract Many natural systems, such as neurons firing in the brain or basketball teams traversing a court, give rise to time series data with complex, nonlinear dynamics. We can gain insight into these systems by decomposing the data into segments that are each explained by simpler dynamic units. Building on switching linear dynamical systems (SLDS), we present a new model class that not only discovers these dynamical units, but also explains how their switching behavior depends on observations or continuous latent states. These “recurrent” switching linear dynamical systems provide further insight by discovering the conditions under which each unit is deployed, something that traditional SLDS models fail to do. We leverage recent algorithmic advances in approximate inference to make Bayesian inference in these models easy, fast, and scalable.
Tasks Bayesian Inference, Time Series
Published 2016-10-26
URL http://arxiv.org/abs/1610.08466v1
PDF http://arxiv.org/pdf/1610.08466v1.pdf
PWC https://paperswithcode.com/paper/recurrent-switching-linear-dynamical-systems
Repo https://github.com/slinderman/pypolyagamma
Framework none
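The "recurrent" link is easiest to see generatively: the next discrete state is drawn from a softmax over the current continuous state, and the continuous state then follows the active unit's linear dynamics. The parameters below are illustrative, and inference (the paper's main contribution) is omitted entirely.

```python
import numpy as np

def sample_rslds(T, A, b, R, r, seed=0):
    # A: (K, D, D) dynamics, b: (K, D) offsets, R: (K, D), r: (K,) softmax
    # weights. The discrete state z switches based on the continuous state x.
    rng = np.random.default_rng(seed)
    K, D = b.shape
    x = np.zeros(D)
    zs, xs = [], []
    for _ in range(T):
        logits = R @ x + r                     # recurrence: x drives switching
        p = np.exp(logits - logits.max())
        p /= p.sum()
        z = rng.choice(K, p=p)
        x = A[z] @ x + b[z] + 0.01 * rng.standard_normal(D)
        zs.append(z)
        xs.append(x)
    return np.array(zs), np.array(xs)
```

With opposing dynamics and a sharp recurrent link, the sampled trajectory settles into one unit, illustrating how the model segments time series into regimes.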

Post Training in Deep Learning with Last Kernel

Title Post Training in Deep Learning with Last Kernel
Authors Thomas Moreau, Julien Audiffren
Abstract One of the main challenges of deep learning methods is the choice of an appropriate training strategy. In particular, additional steps, such as unsupervised pre-training, have been shown to greatly improve the performances of deep structures. In this article, we propose an extra training step, called post-training, which only optimizes the last layer of the network. We show that this procedure can be analyzed in the context of kernel theory, with the first layers computing an embedding of the data and the last layer a statistical model to solve the task based on this embedding. This step makes sure that the embedding, or representation, of the data is used in the best possible way for the considered task. This idea is then tested on multiple architectures with various data sets, showing that it consistently provides a boost in performance.
Tasks
Published 2016-11-14
URL http://arxiv.org/abs/1611.04499v2
PDF http://arxiv.org/pdf/1611.04499v2.pdf
PWC https://paperswithcode.com/paper/post-training-in-deep-learning-with-last
Repo https://github.com/tomMoral/post_training
Framework tf
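The post-training step freezes the learned embedding and refits only the last layer. With a squared loss this even has a closed form; the ridge refit below stands in for the paper's gradient-based optimization of the last layer.

```python
import numpy as np

def post_train_last_layer(feats, y, reg=1e-3):
    # Refit only the final linear layer on the frozen features (the network's
    # penultimate activations), via ridge regression in closed form.
    d = feats.shape[1]
    W = np.linalg.solve(feats.T @ feats + reg * np.eye(d), feats.T @ y)
    return W
```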

Ultimate tensorization: compressing convolutional and FC layers alike

Title Ultimate tensorization: compressing convolutional and FC layers alike
Authors Timur Garipov, Dmitry Podoprikhin, Alexander Novikov, Dmitry Vetrov
Abstract Convolutional neural networks excel in image recognition tasks, but this comes at the cost of high computational and memory complexity. To tackle this problem, [1] developed a tensor factorization framework to compress fully-connected layers. In this paper, we focus on compressing convolutional layers. We show that while the direct application of the tensor framework [1] to the 4-dimensional kernel of convolution does compress the layer, we can do better. We reshape the convolutional kernel into a tensor of higher order and factorize it. We combine the proposed approach with the previous work to compress both convolutional and fully-connected layers of a network and achieve 80x network compression rate with 1.1% accuracy drop on the CIFAR-10 dataset.
Tasks
Published 2016-11-10
URL http://arxiv.org/abs/1611.03214v1
PDF http://arxiv.org/pdf/1611.03214v1.pdf
PWC https://paperswithcode.com/paper/ultimate-tensorization-compressing
Repo https://github.com/timgaripov/TensorNet-TF
Framework tf
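The arithmetic behind such compression rates is easy to check: reshape the 4-D convolution kernel into a higher-order tensor and count the parameters of its tensor-train factorization. The mode sizes and TT-ranks below are illustrative, not the paper's exact settings.

```python
def tt_param_count(shape, ranks):
    # Parameter count of a tensor-train factorization with the given mode
    # sizes n_k and TT-ranks r_k (boundary ranks r_0 = r_d = 1):
    # sum over k of r_k * n_k * r_{k+1}.
    r = [1] + list(ranks) + [1]
    return sum(r[k] * n * r[k + 1] for k, n in enumerate(shape))
```

For example, a 3x3 convolution mapping 64 to 64 channels (36864 weights), reshaped to modes [3, 3, 8, 8, 8, 8] with all TT-ranks 4, needs only 476 parameters, a ~77x reduction of the same order as the paper's reported rate.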

Sample Efficient Actor-Critic with Experience Replay

Title Sample Efficient Actor-Critic with Experience Replay
Authors Ziyu Wang, Victor Bapst, Nicolas Heess, Volodymyr Mnih, Remi Munos, Koray Kavukcuoglu, Nando de Freitas
Abstract This paper presents an actor-critic deep reinforcement learning agent with experience replay that is stable, sample efficient, and performs remarkably well on challenging environments, including the discrete 57-game Atari domain and several continuous control problems. To achieve this, the paper introduces several innovations, including truncated importance sampling with bias correction, stochastic dueling network architectures, and a new trust region policy optimization method.
Tasks Continuous Control
Published 2016-11-03
URL http://arxiv.org/abs/1611.01224v2
PDF http://arxiv.org/pdf/1611.01224v2.pdf
PWC https://paperswithcode.com/paper/sample-efficient-actor-critic-with-experience
Repo https://github.com/neilsgp/RL-Algorithms
Framework none
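The truncated importance sampling with bias correction splits the importance ratio rho = pi/mu into a clipped off-policy weight min(rho, c) and a correction coefficient [1 - c/rho]_+ applied to the on-policy expectation term, so no probability mass is lost:

```python
def truncated_is(pi, mu, c=10.0):
    # Returns (truncated weight, bias-correction coefficient) for an action
    # with target-policy probability pi and behavior-policy probability mu.
    rho = pi / mu
    return min(rho, c), max(0.0, 1.0 - c / rho)
```

When rho is below the cap the correction vanishes and the update is the usual importance-weighted one; when rho is huge, the truncation bounds the variance and the correction term recovers the clipped contribution on-policy.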

Multi-layer Representation Learning for Medical Concepts

Title Multi-layer Representation Learning for Medical Concepts
Authors Edward Choi, Mohammad Taha Bahadori, Elizabeth Searles, Catherine Coffey, Jimeng Sun
Abstract Learning efficient representations for concepts has been proven to be an important basis for many applications such as machine translation or document classification. Proper representations of medical concepts such as diagnosis, medication, procedure codes and visits will have broad applications in healthcare analytics. However, in Electronic Health Records (EHR) the visit sequences of patients include multiple concepts (diagnosis, procedure, and medication codes) per visit. This structure provides two types of relational information, namely sequential order of visits and co-occurrence of the codes within each visit. In this work, we propose Med2Vec, which not only learns distributed representations for both medical codes and visits from a large EHR dataset with over 3 million visits, but also allows us to interpret the learned representations, with the interpretations confirmed positively by clinical experts. In the experiments, Med2Vec displays significant improvement in key medical applications compared to popular baselines such as Skip-gram, GloVe and stacked autoencoder, while providing clinically meaningful interpretation.
Tasks Document Classification, Machine Translation, Medical Diagnosis, Representation Learning
Published 2016-02-17
URL http://arxiv.org/abs/1602.05568v1
PDF http://arxiv.org/pdf/1602.05568v1.pdf
PWC https://paperswithcode.com/paper/multi-layer-representation-learning-for
Repo https://github.com/mp2893/med2vec
Framework none
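The code-level part of the objective trains on co-occurrence of codes within a visit; generating those (center, context) training pairs from a visit sequence is straightforward:

```python
from itertools import permutations

def intra_visit_pairs(visits):
    # Every ordered pair of codes that co-occur in the same visit becomes a
    # (center, context) training pair, skip-gram style but visit-bounded.
    pairs = []
    for codes in visits:
        pairs.extend(permutations(codes, 2))
    return pairs
```

The visit-level part of Med2Vec additionally predicts codes in neighboring visits, capturing the sequential order that a bag of within-visit pairs alone would miss.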

PVANet: Lightweight Deep Neural Networks for Real-time Object Detection

Title PVANet: Lightweight Deep Neural Networks for Real-time Object Detection
Authors Sanghoon Hong, Byungseok Roh, Kye-Hyeon Kim, Yeongjae Cheon, Minje Park
Abstract In object detection, reducing computational cost is as important as improving accuracy for most practical usages. This paper proposes a novel network structure, which is an order of magnitude lighter than other state-of-the-art networks while maintaining the accuracy. Based on the basic principle of more layers with fewer channels, this new deep neural network minimizes its redundancy by adopting recent innovations including C.ReLU and Inception structure. We also show that this network can be trained efficiently to achieve solid results on well-known object detection benchmarks: 84.9% and 84.2% mAP on VOC2007 and VOC2012 while the required compute is less than 10% of the recent ResNet-101.
Tasks Object Detection, Real-Time Object Detection
Published 2016-11-23
URL http://arxiv.org/abs/1611.08588v2
PDF http://arxiv.org/pdf/1611.08588v2.pdf
PWC https://paperswithcode.com/paper/pvanet-lightweight-deep-neural-networks-for
Repo https://github.com/busyboxs/Some-resources-useful-for-me
Framework tf
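C.ReLU exploits the observation that filters in early convolution layers tend to come in negated pairs: compute only half the channels, then concatenate the negation before the nonlinearity, roughly halving that layer's compute. A minimal sketch of the activation itself:

```python
import numpy as np

def crelu(x):
    # Concatenated ReLU: double the channel dimension with the negation of
    # the input, then apply ReLU, so both signs of each filter are kept.
    return np.maximum(np.concatenate([x, -x], axis=-1), 0.0)
```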

Relief R-CNN : Utilizing Convolutional Features for Fast Object Detection

Title Relief R-CNN : Utilizing Convolutional Features for Fast Object Detection
Authors Guiying Li, Junlong Liu, Chunhui Jiang, Liangpeng Zhang, Minlong Lin, Ke Tang
Abstract R-CNN style methods are among the state-of-the-art object detection methods; they consist of region proposal generation and deep CNN classification. However, the proposal generation phase in this paradigm is usually time-consuming, which slows down the whole detection pipeline at test time. This paper suggests that the value discrepancies among features in deep convolutional feature maps contain plenty of useful spatial information, and proposes a simple approach to extract the information for fast region proposal generation in testing. The proposed method, namely Relief R-CNN (R2-CNN), adopts a novel region proposal generator in a trained R-CNN style model. The new generator directly generates proposals from convolutional features by some simple rules, thus resulting in a much faster proposal generation speed and a lower demand for computation resources. Empirical studies show that R2-CNN could achieve the fastest detection speed with comparable accuracy among all the compared algorithms in testing.
Tasks Object Detection, Real-Time Object Detection
Published 2016-01-25
URL http://arxiv.org/abs/1601.06719v4
PDF http://arxiv.org/pdf/1601.06719v4.pdf
PWC https://paperswithcode.com/paper/relief-r-cnn-utilizing-convolutional-features
Repo https://github.com/IdiosyncraticDragon/relief_rcnn
Framework none
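The idea of reading proposals directly off the feature map can be sketched with a single simple rule: cells with unusually large values indicate object-like regions, so take the bounding box of above-average activations. This one-box toy is far simpler than the paper's actual generator and is only meant to illustrate the principle.

```python
import numpy as np

def proposal_from_feature_map(fmap, thresh=None):
    # Threshold the conv feature map at its mean and return the bounding box
    # (x_min, y_min, x_max, y_max) of the above-threshold cells, or None.
    if thresh is None:
        thresh = fmap.mean()
    ys, xs = np.where(fmap > thresh)
    if len(ys) == 0:
        return None
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())
```

Because this reuses features the detector has already computed, it adds almost no cost at test time, which is where the speedup comes from.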

Predictive Business Process Monitoring with LSTM Neural Networks

Title Predictive Business Process Monitoring with LSTM Neural Networks
Authors Niek Tax, Ilya Verenich, Marcello La Rosa, Marlon Dumas
Abstract Predictive business process monitoring methods exploit logs of completed cases of a process in order to make predictions about running cases thereof. Existing methods in this space are tailor-made for specific prediction tasks. Moreover, their relative accuracy is highly sensitive to the dataset at hand, thus requiring users to engage in trial-and-error and tuning when applying them in a specific setting. This paper investigates Long Short-Term Memory (LSTM) neural networks as an approach to build consistently accurate models for a wide range of predictive process monitoring tasks. First, we show that LSTMs outperform existing techniques to predict the next event of a running case and its timestamp. Next, we show how to use models for predicting the next task in order to predict the full continuation of a running case. Finally, we apply the same approach to predict the remaining time, and show that this approach outperforms existing tailor-made methods.
Tasks Multivariate Time Series Forecasting, Time Series Prediction
Published 2016-12-07
URL http://arxiv.org/abs/1612.02130v2
PDF http://arxiv.org/pdf/1612.02130v2.pdf
PWC https://paperswithcode.com/paper/predictive-business-process-monitoring-with
Repo https://github.com/verenich/ProcessSequencePrediction
Framework none
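Predicting the full continuation of a running case reduces to repeatedly feeding the predicted next event back into the model until the end symbol appears. In the sketch below, `next_event` is a toy lookup standing in for the trained LSTM, and the `max_len` cap is an illustrative safeguard.

```python
def predict_suffix(prefix, next_event, end="END", max_len=20):
    # Iteratively extend the running case with the model's most likely next
    # event until the end-of-case symbol (or a length cap) is reached.
    trace = list(prefix)
    while len(trace) < max_len:
        e = next_event(trace)
        if e == end:
            break
        trace.append(e)
    return trace[len(prefix):]
```

The same loop with a timestamp head at each step yields the remaining-time prediction described in the abstract.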