Paper Group AWR 70
Early Methods for Detecting Adversarial Images
Title | Early Methods for Detecting Adversarial Images |
Authors | Dan Hendrycks, Kevin Gimpel |
Abstract | Many machine learning classifiers are vulnerable to adversarial perturbations. An adversarial perturbation modifies an input to change a classifier’s prediction without causing the input to seem substantially different to human perception. We deploy three methods to detect adversarial images. Adversaries trying to bypass our detectors must make the adversarial image less pathological or they will fail trying. Our best detection method reveals that adversarial images place abnormal emphasis on the lower-ranked principal components from PCA. Other detectors and a colorful saliency map are in an appendix. |
Tasks | |
Published | 2016-08-01 |
URL | http://arxiv.org/abs/1608.00530v2 |
http://arxiv.org/pdf/1608.00530v2.pdf | |
PWC | https://paperswithcode.com/paper/early-methods-for-detecting-adversarial |
Repo | https://github.com/hendrycks/fooling |
Framework | tf |
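The PCA observation in the abstract can be sketched in a few lines. This is an illustrative toy, not the paper's exact detector: "clean" data lives mostly in a few leading principal directions, and an input is scored by the fraction of its energy on the trailing (low-ranked) components; a crude noise perturbation stands in for an adversarial one.

```python
import numpy as np

rng = np.random.default_rng(0)

# Clean data concentrated in a 3-dimensional subspace of a 20-dim space.
d, n = 20, 500
basis = rng.normal(size=(d, 3))
clean = rng.normal(size=(n, 3)) @ basis.T + 0.01 * rng.normal(size=(n, d))

mean = clean.mean(axis=0)
# Principal directions from the SVD of the centered data.
_, _, vt = np.linalg.svd(clean - mean, full_matrices=False)

def tail_energy(x, k=3):
    """Fraction of an input's variance on components ranked below k."""
    coeffs = (x - mean) @ vt.T
    return np.sum(coeffs[k:] ** 2) / np.sum(coeffs ** 2)

x_clean = clean[0]
# Crude stand-in for an adversarial perturbation: noise not aligned
# with the leading principal directions.
x_adv = x_clean + 0.5 * rng.normal(size=d)

print(tail_energy(x_clean) < tail_energy(x_adv))  # perturbed input scores higher
```

The detector threshold would be calibrated on held-out clean data; here the gap between the two scores is orders of magnitude.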
FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks
Title | FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks |
Authors | Eddy Ilg, Nikolaus Mayer, Tonmoy Saikia, Margret Keuper, Alexey Dosovitskiy, Thomas Brox |
Abstract | The FlowNet demonstrated that optical flow estimation can be cast as a learning problem. However, the state of the art with regard to the quality of the flow has still been defined by traditional methods. Particularly on small displacements and real-world data, FlowNet cannot compete with variational methods. In this paper, we advance the concept of end-to-end learning of optical flow and make it work really well. The large improvements in quality and speed are caused by three major contributions: first, we focus on the training data and show that the schedule of presenting data during training is very important. Second, we develop a stacked architecture that includes warping of the second image with intermediate optical flow. Third, we elaborate on small displacements by introducing a sub-network specializing on small motions. FlowNet 2.0 is only marginally slower than the original FlowNet but decreases the estimation error by more than 50%. It performs on par with state-of-the-art methods, while running at interactive frame rates. Moreover, we present faster variants that allow optical flow computation at up to 140fps with accuracy matching the original FlowNet. |
Tasks | Dense Pixel Correspondence Estimation, Optical Flow Estimation, Skeleton Based Action Recognition |
Published | 2016-12-06 |
URL | http://arxiv.org/abs/1612.01925v1 |
http://arxiv.org/pdf/1612.01925v1.pdf | |
PWC | https://paperswithcode.com/paper/flownet-20-evolution-of-optical-flow |
Repo | https://github.com/rickyHong/tfoptflow-repl |
Framework | tf |
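The warping step used between the stacked networks can be sketched as a backward bilinear warp (a minimal NumPy version, not the paper's CUDA layer): the second image is sampled at positions displaced by the intermediate flow, so the remaining network only has to estimate the residual flow.

```python
import numpy as np

def warp(img, flow):
    """Backward-warp img with a dense flow field (H, W, 2) using bilinear
    interpolation: output[y, x] = img[y + flow_y, x + flow_x]."""
    h, w = img.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(float)
    sample_y = np.clip(ys + flow[..., 1], 0, h - 1)
    sample_x = np.clip(xs + flow[..., 0], 0, w - 1)
    y0 = np.floor(sample_y).astype(int); y1 = np.minimum(y0 + 1, h - 1)
    x0 = np.floor(sample_x).astype(int); x1 = np.minimum(x0 + 1, w - 1)
    wy = sample_y - y0; wx = sample_x - x0
    top = img[y0, x0] * (1 - wx) + img[y0, x1] * wx
    bot = img[y1, x0] * (1 - wx) + img[y1, x1] * wx
    return top * (1 - wy) + bot * wy

# A 1-pixel horizontal shift: warping img2 with the true flow recovers img1.
img1 = np.zeros((4, 6)); img1[1:3, 1:3] = 1.0
img2 = np.roll(img1, 1, axis=1)            # second frame, shifted right by 1
flow = np.zeros((4, 6, 2)); flow[..., 0] = 1.0
print(np.allclose(warp(img2, flow)[:, :5], img1[:, :5]))
```

The last column is excluded from the check because clipping at the image border makes it undefined under a pure shift.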
Unrolled Generative Adversarial Networks
Title | Unrolled Generative Adversarial Networks |
Authors | Luke Metz, Ben Poole, David Pfau, Jascha Sohl-Dickstein |
Abstract | We introduce a method to stabilize Generative Adversarial Networks (GANs) by defining the generator objective with respect to an unrolled optimization of the discriminator. This allows training to be adjusted between using the optimal discriminator in the generator’s objective, which is ideal but infeasible in practice, and using the current value of the discriminator, which is often unstable and leads to poor solutions. We show how this technique solves the common problem of mode collapse, stabilizes training of GANs with complex recurrent generators, and increases diversity and coverage of the data distribution by the generator. |
Tasks | |
Published | 2016-11-07 |
URL | http://arxiv.org/abs/1611.02163v4 |
http://arxiv.org/pdf/1611.02163v4.pdf | |
PWC | https://paperswithcode.com/paper/unrolled-generative-adversarial-networks |
Repo | https://github.com/alex98chen/testGAN |
Framework | tf |
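The stabilizing effect of unrolling can be seen on a toy "Dirac GAN" with scalar players and hand-derived gradients (nothing like the full paper's networks): the generator parameter theta minimizes L = theta * psi while the discriminator parameter psi maximizes it. Plain gradient play orbits the equilibrium (0, 0); defining the generator loss through k unrolled discriminator ascent steps adds a damping term that pulls the pair in.

```python
eta = 0.1   # step size for both players and for the unrolled inner steps

def unrolled_generator_grad(theta, psi, k):
    """d/dtheta of L(theta, psi_k(theta)), where psi_k is psi after k
    gradient-ascent steps on L. For L = theta*psi, psi_k = psi + k*eta*theta,
    so the surrogate generator loss is theta*psi + k*eta*theta**2."""
    return psi + 2 * k * eta * theta

def run(k, steps=200):
    theta, psi = 1.0, 1.0
    for _ in range(steps):
        theta = theta - eta * unrolled_generator_grad(theta, psi, k)
        psi = psi + eta * theta
    return abs(theta) + abs(psi)

print(run(k=0), run(k=5))  # with unrolling, the pair converges toward (0, 0)
```

With k = 0 the update map has spectral radius 1 and the iterates cycle forever; with k = 5 the extra damping shrinks them geometrically.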
A Neural Approach to Blind Motion Deblurring
Title | A Neural Approach to Blind Motion Deblurring |
Authors | Ayan Chakrabarti |
Abstract | We present a new method for blind motion deblurring that uses a neural network trained to compute estimates of sharp image patches from observations that are blurred by an unknown motion kernel. Instead of regressing directly to patch intensities, this network learns to predict the complex Fourier coefficients of a deconvolution filter to be applied to the input patch for restoration. For inference, we apply the network independently to all overlapping patches in the observed image, and average its outputs to form an initial estimate of the sharp image. We then explicitly estimate a single global blur kernel by relating this estimate to the observed image, and finally perform non-blind deconvolution with this kernel. Our method exhibits accuracy and robustness close to state-of-the-art iterative methods, while being much faster when parallelized on GPU hardware. |
Tasks | Deblurring |
Published | 2016-03-15 |
URL | http://arxiv.org/abs/1603.04771v2 |
http://arxiv.org/pdf/1603.04771v2.pdf | |
PWC | https://paperswithcode.com/paper/a-neural-approach-to-blind-motion-deblurring |
Repo | https://github.com/ayanc/ndeblur |
Framework | none |
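The restoration step can be sketched in the Fourier domain. In the paper a network predicts the deconvolution filter's complex coefficients; here a Wiener filter built from a known blur kernel stands in for that prediction, purely to show how such a filter is applied and why it undoes the blur.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 32
sharp = rng.normal(size=(n, n))

# A small horizontal motion-blur kernel, embedded in an n x n array.
kernel = np.zeros((n, n)); kernel[0, :3] = 1.0 / 3.0
K = np.fft.fft2(kernel)
blurred = np.real(np.fft.ifft2(np.fft.fft2(sharp) * K))

# Deconvolution filter in the Fourier domain (Wiener, noise level eps).
eps = 1e-3
G = np.conj(K) / (np.abs(K) ** 2 + eps)
restored = np.real(np.fft.ifft2(np.fft.fft2(blurred) * G))

err_blur = np.mean((blurred - sharp) ** 2)
err_rest = np.mean((restored - sharp) ** 2)
print(err_rest < err_blur)  # restoration is far closer to the sharp image
```

The eps term keeps the filter bounded at frequencies the blur nearly annihilates, which is exactly where a naive inverse filter would explode.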
YouTube-8M: A Large-Scale Video Classification Benchmark
Title | YouTube-8M: A Large-Scale Video Classification Benchmark |
Authors | Sami Abu-El-Haija, Nisarg Kothari, Joonseok Lee, Paul Natsev, George Toderici, Balakrishnan Varadarajan, Sudheendra Vijayanarasimhan |
Abstract | Many recent advancements in Computer Vision are attributed to large datasets. Open-source software packages for Machine Learning and inexpensive commodity hardware have reduced the barrier of entry for exploring novel approaches at scale. It is possible to train models over millions of examples within a few days. Although large-scale datasets exist for image understanding, such as ImageNet, there are no video classification datasets of comparable size. In this paper, we introduce YouTube-8M, the largest multi-label video classification dataset, composed of ~8 million videos (500K hours of video), annotated with a vocabulary of 4800 visual entities. To get the videos and their labels, we used a YouTube video annotation system, which labels videos with their main topics. While the labels are machine-generated, they have high precision and are derived from a variety of human-based signals including metadata and query click signals. We filtered the video labels (Knowledge Graph entities) using both automated and manual curation strategies, including asking human raters if the labels are visually recognizable. Then, we decoded each video at one frame per second, and used a Deep CNN pre-trained on ImageNet to extract the hidden representation immediately prior to the classification layer. Finally, we compressed the frame features and made both the features and video-level labels available for download. We trained various (modest) classification models on the dataset, evaluated them using popular evaluation metrics, and report them as baselines. Despite the size of the dataset, some of our models train to convergence in less than a day on a single machine using TensorFlow. We plan to release code for training a TensorFlow model and for computing metrics. |
Tasks | Action Recognition In Videos, Video Classification |
Published | 2016-09-27 |
URL | http://arxiv.org/abs/1609.08675v1 |
http://arxiv.org/pdf/1609.08675v1.pdf | |
PWC | https://paperswithcode.com/paper/youtube-8m-a-large-scale-video-classification |
Repo | https://github.com/taufikxu/youtube |
Framework | tf |
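The frame-level pipeline described above can be sketched end to end. Random vectors stand in for the pretrained-CNN features, and the 1024-dimensional size and min-max uint8 scheme are illustrative choices, not the exact released format: decode one frame per second, embed each frame, compress the features to 8 bits, and pool for a video-level vector.

```python
import numpy as np

rng = np.random.default_rng(0)

def frame_features(num_seconds, dim=1024):
    """Stand-in for 'decode at 1 fps and run each frame through a CNN'."""
    return rng.normal(size=(num_seconds, dim)).astype(np.float32)

def quantize(f):
    """Min-max compress float features to uint8, keeping the range."""
    lo, hi = float(f.min()), float(f.max())
    q = np.round((f - lo) / (hi - lo) * 255).astype(np.uint8)
    return q, lo, hi

def dequantize(q, lo, hi):
    return q.astype(np.float32) / 255 * (hi - lo) + lo

video = frame_features(num_seconds=120)       # a 2-minute video
q, lo, hi = quantize(video)
rep = dequantize(q, lo, hi).mean(axis=0)      # video-level representation
print(video.shape, q.dtype, rep.shape)
```

The quantization error per feature is bounded by half a quantization step, which is why 8-bit storage costs little downstream accuracy.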
Deep Reinforcement Learning Radio Control and Signal Detection with KeRLym, a Gym RL Agent
Title | Deep Reinforcement Learning Radio Control and Signal Detection with KeRLym, a Gym RL Agent |
Authors | Timothy J. O’Shea, T. Charles Clancy |
Abstract | This paper presents research in progress investigating the viability and adaptation of reinforcement learning using deep neural network based function approximation for the task of radio control and signal detection in the wireless domain. We demonstrate a successful initial method for radio control which allows naive learning of search without the need for expert features, heuristics, or search strategies. We also introduce KeRLym, an open Keras-based reinforcement learning agent collection for OpenAI’s Gym. |
Tasks | |
Published | 2016-05-30 |
URL | http://arxiv.org/abs/1605.09221v1 |
http://arxiv.org/pdf/1605.09221v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-reinforcement-learning-radio-control-and |
Repo | https://github.com/osh/kerlym |
Framework | tf |
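The agent-environment loop can be sketched in the spirit of a KeRLym/Gym setup. The environment and all numbers below are made up, not the paper's: a radio must learn which of N channels carries a signal, using epsilon-greedy action values against a Gym-like reset/step interface.

```python
import random

random.seed(0)

class ChannelEnv:
    """One-step 'signal detection' task: reward 1 for sensing the occupied
    channel, 0 otherwise."""
    def __init__(self, n_channels=8, occupied=5):
        self.n = n_channels
        self.occupied = occupied
    def reset(self):
        return 0                        # single dummy observation
    def step(self, action):
        reward = 1.0 if action == self.occupied else 0.0
        return 0, reward, True, {}      # obs, reward, done, info

env = ChannelEnv()
q = [0.0] * env.n                       # action-value table
eps, lr = 0.1, 0.2
for _ in range(2000):
    env.reset()
    a = random.randrange(env.n) if random.random() < eps else q.index(max(q))
    _, r, _, _ = env.step(a)
    q[a] += lr * (r - q[a])             # incremental value update

print(q.index(max(q)))                  # the agent settles on the occupied channel
```

The paper's agents replace the value table with a deep network, but the sense-act-update loop against the Gym interface is the same.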
Multilingual Part-of-Speech Tagging with Bidirectional Long Short-Term Memory Models and Auxiliary Loss
Title | Multilingual Part-of-Speech Tagging with Bidirectional Long Short-Term Memory Models and Auxiliary Loss |
Authors | Barbara Plank, Anders Søgaard, Yoav Goldberg |
Abstract | Bidirectional long short-term memory (bi-LSTM) networks have recently proven successful for various NLP sequence modeling tasks, but little is known about their reliance on input representations, target languages, data set size, and label noise. We address these issues and evaluate bi-LSTMs with word, character, and unicode byte embeddings for POS tagging. We compare bi-LSTMs to traditional POS taggers across languages and data sizes. We also present a novel bi-LSTM model, which combines the POS tagging loss function with an auxiliary loss function that accounts for rare words. The model obtains state-of-the-art performance across 22 languages, and works especially well for morphologically complex languages. Our analysis suggests that bi-LSTMs are less sensitive to training data size and label corruptions (at small noise levels) than previously assumed. |
Tasks | Part-Of-Speech Tagging |
Published | 2016-04-19 |
URL | http://arxiv.org/abs/1604.05529v3 |
http://arxiv.org/pdf/1604.05529v3.pdf | |
PWC | https://paperswithcode.com/paper/multilingual-part-of-speech-tagging-with |
Repo | https://github.com/timerstime/SDG4DA |
Framework | tf |
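The joint objective can be sketched directly. The shapes and numbers below are illustrative, not the paper's network: for each token the model emits POS-tag logits plus a scalar estimate of the word's log frequency, and the total loss is the tagging cross-entropy plus an auxiliary squared error on log frequency, giving rare words a dedicated training signal.

```python
import numpy as np

def log_softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=-1, keepdims=True))

def joint_loss(tag_logits, gold_tags, freq_pred, word_counts, aux_weight=1.0):
    """Tagging cross-entropy plus auxiliary log-frequency regression."""
    logp = log_softmax(tag_logits)
    tag_loss = -logp[np.arange(len(gold_tags)), gold_tags].mean()
    aux_loss = ((freq_pred - np.log(word_counts)) ** 2).mean()
    return tag_loss + aux_weight * aux_loss

tag_logits = np.array([[2.0, 0.1, 0.1], [0.2, 1.5, 0.3]])
gold_tags = np.array([0, 1])
freq_pred = np.array([4.0, 0.5])
word_counts = np.array([50.0, 2.0])   # a frequent word and a rare word
loss = joint_loss(tag_logits, gold_tags, freq_pred, word_counts)
print(loss)
```

In the paper both heads share the bi-LSTM encoder, so the gradient of the auxiliary term shapes the representations that the tagger also uses.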
Recurrent switching linear dynamical systems
Title | Recurrent switching linear dynamical systems |
Authors | Scott W. Linderman, Andrew C. Miller, Ryan P. Adams, David M. Blei, Liam Paninski, Matthew J. Johnson |
Abstract | Many natural systems, such as neurons firing in the brain or basketball teams traversing a court, give rise to time series data with complex, nonlinear dynamics. We can gain insight into these systems by decomposing the data into segments that are each explained by simpler dynamic units. Building on switching linear dynamical systems (SLDS), we present a new model class that not only discovers these dynamical units, but also explains how their switching behavior depends on observations or continuous latent states. These “recurrent” switching linear dynamical systems provide further insight by discovering the conditions under which each unit is deployed, something that traditional SLDS models fail to do. We leverage recent algorithmic advances in approximate inference to make Bayesian inference in these models easy, fast, and scalable. |
Tasks | Bayesian Inference, Time Series |
Published | 2016-10-26 |
URL | http://arxiv.org/abs/1610.08466v1 |
http://arxiv.org/pdf/1610.08466v1.pdf | |
PWC | https://paperswithcode.com/paper/recurrent-switching-linear-dynamical-systems |
Repo | https://github.com/slinderman/pypolyagamma |
Framework | none |
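The model class can be illustrated with a small simulation, using toy parameters rather than anything from the paper: two 1-D linear regimes, where the probability of switching to the second regime depends on the current continuous state through a logistic link — the "recurrent" connection a vanilla SLDS lacks.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

A = [0.99, 0.5]          # per-regime linear dynamics x' = A[z] x + b[z]
b = [0.1, -0.3]
w, c = 4.0, 0.0          # recurrence: P(z'=1) = sigmoid(w * x + c)

x, z = 0.0, 0
xs, zs = [], []
for t in range(200):
    z = int(rng.random() < sigmoid(w * x + c))   # switch depends on x
    x = A[z] * x + b[z] + 0.01 * rng.normal()
    xs.append(x); zs.append(z)

print(sorted(set(zs)))  # both regimes get visited
```

Regime 0 drifts x upward, which raises the switching probability into regime 1; regime 1 pulls x back down, so the chain alternates — exactly the kind of state-dependent switching the model is built to discover.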
Post Training in Deep Learning with Last Kernel
Title | Post Training in Deep Learning with Last Kernel |
Authors | Thomas Moreau, Julien Audiffren |
Abstract | One of the main challenges of deep learning methods is the choice of an appropriate training strategy. In particular, additional steps, such as unsupervised pre-training, have been shown to greatly improve the performance of deep structures. In this article, we propose an extra training step, called post-training, which only optimizes the last layer of the network. We show that this procedure can be analyzed in the context of kernel theory, with the first layers computing an embedding of the data and the last layer a statistical model to solve the task based on this embedding. This step makes sure that the embedding, or representation, of the data is used in the best possible way for the considered task. This idea is then tested on multiple architectures with various data sets, showing that it consistently provides a boost in performance. |
Tasks | |
Published | 2016-11-14 |
URL | http://arxiv.org/abs/1611.04499v2 |
http://arxiv.org/pdf/1611.04499v2.pdf | |
PWC | https://paperswithcode.com/paper/post-training-in-deep-learning-with-last |
Repo | https://github.com/tomMoral/post_training |
Framework | tf |
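The post-training step can be sketched as follows (an illustration, not the authors' code): treat the frozen lower layers as a fixed feature map phi, then refit only the last linear layer on the training set — here with the closed-form ridge solution, which is the kernel-theory view the paper analyzes.

```python
import numpy as np

rng = np.random.default_rng(0)

n, d, h = 200, 5, 16
W1 = rng.normal(size=(d, h))

def phi(x):
    """Stand-in for the trained network minus its last layer."""
    return np.tanh(x @ W1)

X = rng.normal(size=(n, d))
true_w = rng.normal(size=h)
y = phi(X) @ true_w + 0.01 * rng.normal(size=n)

# Post-training step: ridge regression on the fixed embedding phi(X).
lam = 1e-3
F = phi(X)
w_post = np.linalg.solve(F.T @ F + lam * np.eye(h), F.T @ y)

mse = np.mean((F @ w_post - y) ** 2)
print(mse)  # near the noise floor
```

Because only the last layer moves, this step is cheap and cannot degrade the learned representation — it just uses it optimally for the task.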
Ultimate tensorization: compressing convolutional and FC layers alike
Title | Ultimate tensorization: compressing convolutional and FC layers alike |
Authors | Timur Garipov, Dmitry Podoprikhin, Alexander Novikov, Dmitry Vetrov |
Abstract | Convolutional neural networks excel in image recognition tasks, but this comes at the cost of high computational and memory complexity. To tackle this problem, [1] developed a tensor factorization framework to compress fully-connected layers. In this paper, we focus on compressing convolutional layers. We show that while the direct application of the tensor framework [1] to the 4-dimensional kernel of convolution does compress the layer, we can do better. We reshape the convolutional kernel into a tensor of higher order and factorize it. We combine the proposed approach with the previous work to compress both convolutional and fully-connected layers of a network and achieve 80x network compression rate with 1.1% accuracy drop on the CIFAR-10 dataset. |
Tasks | |
Published | 2016-11-10 |
URL | http://arxiv.org/abs/1611.03214v1 |
http://arxiv.org/pdf/1611.03214v1.pdf | |
PWC | https://paperswithcode.com/paper/ultimate-tensorization-compressing |
Repo | https://github.com/timgaripov/TensorNet-TF |
Framework | tf |
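The core idea can be sketched with a tensor-train decomposition by sequential truncated SVDs (TT-SVD). The shapes are toy and the "kernel" below is built to have TT-rank 1 so a tiny rank is exact — the paper's factorization of real convolutional kernels trades a small reconstruction error for the parameter savings shown here.

```python
import numpy as np

def tt_svd(t, max_rank):
    """Tensor-train decomposition by sequential truncated SVDs."""
    shape, cores, r_prev = t.shape, [], 1
    mat = t.reshape(shape[0], -1)
    for k in range(len(shape) - 1):
        u, s, vt = np.linalg.svd(mat, full_matrices=False)
        r = min(max_rank, len(s))
        cores.append(u[:, :r].reshape(r_prev, shape[k], r))
        mat = (s[:r, None] * vt[:r]).reshape(r * shape[k + 1], -1)
        r_prev = r
    cores.append(mat.reshape(r_prev, shape[-1], 1))
    return cores

def tt_to_tensor(cores):
    """Contract the train of 3-way cores back into a full tensor."""
    t = cores[0]
    for core in cores[1:]:
        t = np.tensordot(t, core, axes=([-1], [0]))
    return t.reshape([c.shape[1] for c in cores])

rng = np.random.default_rng(0)
# A rank-1 toy "kernel" (height, width, in, out) so a small TT rank is exact.
a, b, c, d = (rng.normal(size=s) for s in (3, 3, 8, 8))
kernel = np.einsum('i,j,k,l->ijkl', a, b, c, d)

cores = tt_svd(kernel, max_rank=1)
full_params = kernel.size
tt_params = sum(core.size for core in cores)
err = np.abs(tt_to_tensor(cores) - kernel).max()
print(full_params, tt_params, err)  # 576 vs 22 parameters, near-zero error
```

Reshaping the 4-way kernel into a higher-order tensor before factorizing, as the paper proposes, exposes even more factorizable structure than this direct decomposition.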
Sample Efficient Actor-Critic with Experience Replay
Title | Sample Efficient Actor-Critic with Experience Replay |
Authors | Ziyu Wang, Victor Bapst, Nicolas Heess, Volodymyr Mnih, Remi Munos, Koray Kavukcuoglu, Nando de Freitas |
Abstract | This paper presents an actor-critic deep reinforcement learning agent with experience replay that is stable, sample efficient, and performs remarkably well on challenging environments, including the discrete 57-game Atari domain and several continuous control problems. To achieve this, the paper introduces several innovations, including truncated importance sampling with bias correction, stochastic dueling network architectures, and a new trust region policy optimization method. |
Tasks | Continuous Control |
Published | 2016-11-03 |
URL | http://arxiv.org/abs/1611.01224v2 |
http://arxiv.org/pdf/1611.01224v2.pdf | |
PWC | https://paperswithcode.com/paper/sample-efficient-actor-critic-with-experience |
Repo | https://github.com/neilsgp/RL-Algorithms |
Framework | none |
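The truncation-with-bias-correction trick can be checked in isolation on a discrete action space (a standalone identity check, not the full agent): truncating the importance weight at c bounds the variance, and the correction term restores unbiasedness, so the two pieces together recover the on-policy expectation exactly.

```python
import numpy as np

rng = np.random.default_rng(0)
n_actions, c = 5, 1.5

mu = rng.dirichlet(np.ones(n_actions))    # behaviour policy
pi = rng.dirichlet(np.ones(n_actions))    # target policy
f = rng.normal(size=n_actions)            # any per-action quantity

rho = pi / mu                             # importance weights
truncated = np.sum(mu * np.minimum(c, rho) * f)           # E_mu[min(c,rho) f]
correction = np.sum(pi * np.maximum(0, 1 - c / rho) * f)  # E_pi[(1-c/rho)_+ f]
estimate = truncated + correction

print(np.isclose(estimate, np.sum(pi * f)))  # exactly E_pi[f]
```

Algebraically, mu*min(c,rho) + max(0, pi - c*mu) = pi for every action, so the identity holds for any f, which is why the correction term makes the truncated estimator unbiased.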
Multi-layer Representation Learning for Medical Concepts
Title | Multi-layer Representation Learning for Medical Concepts |
Authors | Edward Choi, Mohammad Taha Bahadori, Elizabeth Searles, Catherine Coffey, Jimeng Sun |
Abstract | Learning efficient representations for concepts has been proven to be an important basis for many applications such as machine translation or document classification. Proper representations of medical concepts such as diagnosis, medication, procedure codes and visits will have broad applications in healthcare analytics. However, in Electronic Health Records (EHR) the visit sequences of patients include multiple concepts (diagnosis, procedure, and medication codes) per visit. This structure provides two types of relational information, namely sequential order of visits and co-occurrence of the codes within each visit. In this work, we propose Med2Vec, which not only learns distributed representations for both medical codes and visits from a large EHR dataset with over 3 million visits, but also allows us to interpret the learned representations confirmed positively by clinical experts. In the experiments, Med2Vec displays significant improvement in key medical applications compared to popular baselines such as Skip-gram, GloVe and stacked autoencoder, while providing clinically meaningful interpretation. |
Tasks | Document Classification, Machine Translation, Medical Diagnosis, Representation Learning |
Published | 2016-02-17 |
URL | http://arxiv.org/abs/1602.05568v1 |
http://arxiv.org/pdf/1602.05568v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-layer-representation-learning-for |
Repo | https://github.com/mp2893/med2vec |
Framework | none |
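The two relational signals Med2Vec exploits can be sketched as pair extraction (the codes below are made-up examples and the embedding model itself is omitted): co-occurrence of codes within a visit, and the sequential order of consecutive visits for one patient.

```python
from itertools import permutations

patient = [
    {"dx:250.00", "rx:metformin"},             # visit 1
    {"dx:401.9", "rx:lisinopril", "proc:ekg"}  # visit 2
]

# Intra-visit code pairs (skip-gram-style targets within one visit).
code_pairs = [p for visit in patient for p in permutations(sorted(visit), 2)]

# Inter-visit pairs: each visit predicts its neighbouring visit's codes.
visit_pairs = list(zip(patient, patient[1:]))

print(len(code_pairs), len(visit_pairs))
```

A Med2Vec-style model trains code embeddings on the first signal and visit representations on the second, which is what makes the learned vectors interpretable at both levels.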
PVANet: Lightweight Deep Neural Networks for Real-time Object Detection
Title | PVANet: Lightweight Deep Neural Networks for Real-time Object Detection |
Authors | Sanghoon Hong, Byungseok Roh, Kye-Hyeon Kim, Yeongjae Cheon, Minje Park |
Abstract | In object detection, reducing computational cost is as important as improving accuracy for most practical uses. This paper proposes a novel network structure, which is an order of magnitude lighter than other state-of-the-art networks while maintaining the accuracy. Based on the basic principle of more layers with fewer channels, this new deep neural network minimizes its redundancy by adopting recent innovations including C.ReLU and Inception structure. We also show that this network can be trained efficiently to achieve solid results on well-known object detection benchmarks: 84.9% and 84.2% mAP on VOC2007 and VOC2012 while the required compute is less than 10% of the recent ResNet-101. |
Tasks | Object Detection, Real-Time Object Detection |
Published | 2016-11-23 |
URL | http://arxiv.org/abs/1611.08588v2 |
http://arxiv.org/pdf/1611.08588v2.pdf | |
PWC | https://paperswithcode.com/paper/pvanet-lightweight-deep-neural-networks-for |
Repo | https://github.com/busyboxs/Some-resources-useful-for-me |
Framework | tf |
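The C.ReLU block the abstract mentions is simple enough to sketch directly: concatenate a layer's output with its negation before the ReLU, so one set of filters yields both activation polarities and the channel count of the preceding convolution can be halved.

```python
import numpy as np

def crelu(x):
    """Concatenated ReLU along the channel axis (assumed to be last)."""
    return np.maximum(np.concatenate([x, -x], axis=-1), 0.0)

x = np.array([[1.5, -2.0, 0.0]])   # 3 channels in
out = crelu(x)                     # 6 channels out, both polarities
print(out)
```

The motivation is an empirical observation that early conv filters tend to come in negatively correlated pairs; C.ReLU gets the pair for free.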
Relief R-CNN: Utilizing Convolutional Features for Fast Object Detection
Title | Relief R-CNN: Utilizing Convolutional Features for Fast Object Detection |
Authors | Guiying Li, Junlong Liu, Chunhui Jiang, Liangpeng Zhang, Minlong Lin, Ke Tang |
Abstract | R-CNN style methods are among the state-of-the-art object detection methods; they consist of region proposal generation and deep CNN classification. However, the proposal generation phase in this paradigm is usually time-consuming, which slows down the whole detection pipeline at test time. This paper suggests that the value discrepancies among features in deep convolutional feature maps contain plenty of useful spatial information, and proposes a simple approach to extract the information for fast region proposal generation in testing. The proposed method, namely Relief R-CNN (R2-CNN), adopts a novel region proposal generator in a trained R-CNN style model. The new generator directly generates proposals from convolutional features by some simple rules, thus resulting in a much faster proposal generation speed and a lower demand for computation resources. Empirical studies show that R2-CNN could achieve the fastest detection speed with comparable accuracy among all the compared algorithms in testing. |
Tasks | Object Detection, Real-Time Object Detection |
Published | 2016-01-25 |
URL | http://arxiv.org/abs/1601.06719v4 |
http://arxiv.org/pdf/1601.06719v4.pdf | |
PWC | https://paperswithcode.com/paper/relief-r-cnn-utilizing-convolutional-features |
Repo | https://github.com/IdiosyncraticDragon/relief_rcnn |
Framework | none |
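The core idea can be sketched with a simplified rule (not the paper's exact procedure): high values in a convolutional feature map mark object evidence, so a box can be proposed directly by thresholding the map and taking the bounding box of above-threshold responses — no learned proposal network and no extra forward pass.

```python
import numpy as np

def propose(feature_map, k=1.0):
    """Propose one box from strong feature-map responses (x0, y0, x1, y1)."""
    thresh = feature_map.mean() + k * feature_map.std()
    ys, xs = np.nonzero(feature_map > thresh)
    if len(ys) == 0:
        return None
    return tuple(int(v) for v in (xs.min(), ys.min(), xs.max(), ys.max()))

fmap = np.zeros((8, 8))
fmap[2:5, 3:6] = 5.0          # a blob of strong responses
print(propose(fmap))  # (3, 2, 5, 4)
```

The actual method applies richer rules across scales, but the speed advantage comes from this same observation: the proposals fall out of features the detector has already computed.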
Predictive Business Process Monitoring with LSTM Neural Networks
Title | Predictive Business Process Monitoring with LSTM Neural Networks |
Authors | Niek Tax, Ilya Verenich, Marcello La Rosa, Marlon Dumas |
Abstract | Predictive business process monitoring methods exploit logs of completed cases of a process in order to make predictions about running cases thereof. Existing methods in this space are tailor-made for specific prediction tasks. Moreover, their relative accuracy is highly sensitive to the dataset at hand, thus requiring users to engage in trial-and-error and tuning when applying them in a specific setting. This paper investigates Long Short-Term Memory (LSTM) neural networks as an approach to build consistently accurate models for a wide range of predictive process monitoring tasks. First, we show that LSTMs outperform existing techniques to predict the next event of a running case and its timestamp. Next, we show how to use models for predicting the next task in order to predict the full continuation of a running case. Finally, we apply the same approach to predict the remaining time, and show that this approach outperforms existing tailor-made methods. |
Tasks | Multivariate Time Series Forecasting, Time Series Prediction |
Published | 2016-12-07 |
URL | http://arxiv.org/abs/1612.02130v2 |
http://arxiv.org/pdf/1612.02130v2.pdf | |
PWC | https://paperswithcode.com/paper/predictive-business-process-monitoring-with |
Repo | https://github.com/verenich/ProcessSequencePrediction |
Framework | none |
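The continuation step the paper builds on a next-event model can be sketched as a loop (the LSTM is stubbed with a most-frequent-successor lookup here, and the event names are made up): repeatedly predict the most likely next event and append it until the end-of-case symbol, yielding the full suffix of a running case.

```python
from collections import Counter, defaultdict

log = [
    ["register", "review", "approve", "END"],
    ["register", "review", "reject", "END"],
    ["register", "review", "approve", "END"],
]

# "Train" a stub next-event model: most frequent successor of each event.
succ = defaultdict(Counter)
for trace in log:
    for a, b in zip(trace, trace[1:]):
        succ[a][b] += 1

def predict_suffix(prefix, max_len=10):
    """Iteratively extend a running case until END (or a length cap)."""
    events = list(prefix)
    while events[-1] != "END" and len(events) < max_len:
        events.append(succ[events[-1]].most_common(1)[0][0])
    return events

print(predict_suffix(["register"]))
```

The paper's LSTM replaces the lookup with a learned distribution conditioned on the whole prefix, and applies the same loop to predict timestamps and remaining time.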