May 7, 2019


Paper Group AWR 82


EnhanceNet: Single Image Super-Resolution Through Automated Texture Synthesis


Title	EnhanceNet: Single Image Super-Resolution Through Automated Texture Synthesis
Authors	Mehdi S. M. Sajjadi, Bernhard Schölkopf, Michael Hirsch
Abstract	Single image super-resolution is the task of inferring a high-resolution image from a single low-resolution input. Traditionally, the performance of algorithms for this task is measured using pixel-wise reconstruction measures such as peak signal-to-noise ratio (PSNR) which have been shown to correlate poorly with the human perception of image quality. As a result, algorithms minimizing these metrics tend to produce over-smoothed images that lack high-frequency textures and do not look natural despite yielding high PSNR values. We propose a novel application of automated texture synthesis in combination with a perceptual loss focusing on creating realistic textures rather than optimizing for a pixel-accurate reproduction of ground truth images during training. By using feed-forward fully convolutional neural networks in an adversarial training setting, we achieve a significant boost in image quality at high magnification ratios. Extensive experiments on a number of datasets show the effectiveness of our approach, yielding state-of-the-art results in both quantitative and qualitative benchmarks.
Tasks	Image Super-Resolution, Super-Resolution, Texture Synthesis
Published	2016-12-23
URL	http://arxiv.org/abs/1612.07919v2
PDF	http://arxiv.org/pdf/1612.07919v2.pdf
PWC	https://paperswithcode.com/paper/enhancenet-single-image-super-resolution
Repo	https://github.com/geonm/EnhanceNet-Tensorflow
Framework	tf
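
The abstract above contrasts perceptual quality with pixel-wise measures such as PSNR. As a minimal sketch of what that measure computes (this is the textbook definition, not code from the paper or the linked repo; the function name `psnr` and the sample pixel values are made up for illustration):

```python
import math

def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel sequences."""
    if len(reference) != len(estimate) or not reference:
        raise ValueError("inputs must be non-empty and the same length")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs
    return 10.0 * math.log10(max_value ** 2 / mse)

# Hypothetical 8-pixel strips: a reference and a slightly smoothed estimate.
reference = [52, 55, 61, 59, 79, 61, 76, 61]
smoothed = [54, 56, 60, 60, 74, 64, 72, 63]
print(round(psnr(reference, smoothed), 2))
```

Because PSNR depends only on the mean squared error, an over-smoothed estimate that stays close to the per-pixel average can score well even when it lacks texture, which is the limitation the abstract describes.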

Parallel Texts in the Hebrew Bible, New Methods and Visualizations


Title	Parallel Texts in the Hebrew Bible, New Methods and Visualizations
Authors	Martijn Naaijer, Dirk Roorda
Abstract	In this article we develop an algorithm to detect parallel texts in the Masoretic Text of the Hebrew Bible. The results are presented online and chapters in the Hebrew Bible containing parallel passages can be inspected synoptically. Differences between parallel passages are highlighted. In a similar way the MT of Isaiah is presented synoptically with 1QIsaa. We also explore how one can measure the degree of similarity between parallel passages with the help of a case study of 2 Kings 19-25 and its parallels in Isaiah, Jeremiah and 2 Chronicles.
Tasks
Published	2016-03-04
URL	http://arxiv.org/abs/1603.01541v1
PDF	http://arxiv.org/pdf/1603.01541v1.pdf
PWC	https://paperswithcode.com/paper/parallel-texts-in-the-hebrew-bible-new
Repo	https://github.com/Dans-labs/text-fabric
Framework	tf
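
One simple way to put a number on the similarity between two passages, and to list where they differ, is a word-level sequence comparison. The sketch below uses Python's standard `difflib` and is an assumption for illustration only; it is not the algorithm from the paper or from the text-fabric repo, and the helper names and placeholder passages are hypothetical.

```python
from difflib import SequenceMatcher

def passage_similarity(a, b):
    """Similarity ratio in [0, 1] between two passages, compared word by word."""
    return SequenceMatcher(None, a.split(), b.split()).ratio()

def differing_words(a, b):
    """Return (tag, words_in_a, words_in_b) for every stretch that does not match."""
    words_a, words_b = a.split(), b.split()
    opcodes = SequenceMatcher(None, words_a, words_b).get_opcodes()
    return [
        (tag, words_a[i1:i2], words_b[j1:j2])
        for tag, i1, i2, j1, j2 in opcodes
        if tag != "equal"
    ]

# Placeholder passages (not real biblical text).
left = "a b c d"
right = "a b x d"
print(passage_similarity(left, right))
print(differing_words(left, right))
```

The non-matching stretches returned by `differing_words` are what a synoptic display would highlight.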

The emotional arcs of stories are dominated by six basic shapes


Title	The emotional arcs of stories are dominated by six basic shapes
Authors	Andrew J. Reagan, Lewis Mitchell, Dilan Kiley, Christopher M. Danforth, Peter Sheridan Dodds
Abstract	Advances in computing power, natural language processing, and digitization of text now make it possible to study a culture’s evolution through its texts using a “big data” lens. Our ability to communicate relies in part upon a shared emotional experience, with stories often following distinct emotional trajectories and forming patterns that are meaningful to us. Here, by classifying the emotional arcs for a filtered subset of 1,327 stories from Project Gutenberg’s fiction collection, we find a set of six core emotional arcs which form the essential building blocks of complex emotional trajectories. We strengthen our findings by separately applying matrix decomposition, supervised learning, and unsupervised learning. For each of these six core emotional arcs, we examine the closest characteristic stories in publication today and find that particular emotional arcs enjoy greater success, as measured by downloads.
Tasks
Published	2016-06-24
URL	http://arxiv.org/abs/1606.07772v3
PDF	http://arxiv.org/pdf/1606.07772v3.pdf
PWC	https://paperswithcode.com/paper/the-emotional-arcs-of-stories-are-dominated
Repo	https://github.com/jonnyjohnson1/emo-arcs
Framework	none

An Inexact Variable Metric Proximal Point Algorithm for Generic Quasi-Newton Acceleration


Title	An Inexact Variable Metric Proximal Point Algorithm for Generic Quasi-Newton Acceleration
Authors	Hongzhou Lin, Julien Mairal, Zaid Harchaoui
Abstract	We propose an inexact variable-metric proximal point algorithm to accelerate gradient-based optimization algorithms. The proposed scheme, called QNing, can notably be applied to incremental first-order methods such as the stochastic variance-reduced gradient descent algorithm (SVRG) and other randomized incremental optimization algorithms. QNing is also compatible with composite objectives, meaning that it has the ability to provide exactly sparse solutions when the objective involves a sparsity-inducing regularization. When combined with limited-memory BFGS rules, QNing is particularly effective for solving high-dimensional optimization problems, while enjoying a worst-case linear convergence rate for strongly convex problems. We present experimental results where QNing gives significant improvements over competing methods for training machine learning methods on large samples and in high dimensions.
Tasks
Published	2016-10-04
URL	http://arxiv.org/abs/1610.00960v4
PDF	http://arxiv.org/pdf/1610.00960v4.pdf
PWC	https://paperswithcode.com/paper/an-inexact-variable-metric-proximal-point
Repo	https://github.com/hongzhoulin89/Catalyst-QNing
Framework	none
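
The abstract notes that composite objectives with a sparsity-inducing regularizer can yield exactly sparse solutions. A standard illustration of why is the proximal operator of the L1 norm (soft-thresholding), which sets small coordinates to exactly zero. The sketch below shows only that building block, assuming an L1 penalty; it is not the QNing algorithm, and the function name is hypothetical.

```python
def soft_threshold(x, lam):
    """Proximal operator of lam * ||x||_1, applied coordinate by coordinate."""
    out = []
    for v in x:
        if v > lam:
            out.append(v - lam)   # shrink positive values toward zero
        elif v < -lam:
            out.append(v + lam)   # shrink negative values toward zero
        else:
            out.append(0.0)       # values within [-lam, lam] become exactly zero
    return out

print(soft_threshold([3.0, -0.5, 0.2, -2.0], 1.0))
```

Coordinates whose magnitude is at most `lam` are mapped to zero rather than to a small nonzero value, which is what "exactly sparse" refers to.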

DelugeNets: Deep Networks with Efficient and Flexible Cross-layer Information Inflows


Title	DelugeNets: Deep Networks with Efficient and Flexible Cross-layer Information Inflows
Authors	Jason Kuen, Xiangfei Kong, Gang Wang, Yap-Peng Tan
Abstract	Deluge Networks (DelugeNets) are deep neural networks which efficiently facilitate massive cross-layer information inflows from preceding layers to succeeding layers. The connections between layers in DelugeNets are established through cross-layer depthwise convolutional layers with learnable filters, acting as a flexible yet efficient selection mechanism. DelugeNets can propagate information across many layers with greater flexibility and utilize network parameters more effectively compared to ResNets, whilst being more efficient than DenseNets. Remarkably, a DelugeNet model with a complexity of just 4.31 GigaFLOPs and 20.2M network parameters achieves classification errors of 3.76% and 19.02% on the CIFAR-10 and CIFAR-100 datasets, respectively. Moreover, DelugeNet-122 performs competitively with ResNet-200 on the ImageNet dataset, despite costing merely half of the computations needed by the latter.
Tasks
Published	2016-11-17
URL	http://arxiv.org/abs/1611.05552v5
PDF	http://arxiv.org/pdf/1611.05552v5.pdf
PWC	https://paperswithcode.com/paper/delugenets-deep-networks-with-efficient-and
Repo	https://github.com/xternalz/DelugeNets
Framework	torch
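
One way to read "cross-layer depthwise convolution" is as a per-channel weighted sum over the outputs of preceding layers, with one learnable weight per (layer, channel) pair and no mixing between channels. The sketch below follows that reading as an assumption; the data layout, names, and numbers are hypothetical and are not taken from the DelugeNets code.

```python
def cross_layer_depthwise(features, weights):
    """Combine preceding layers channel by channel.

    features[l][c] is a flat list of spatial activations for channel c of layer l.
    weights[l][c] is the scalar weight for that layer and channel.
    Returns one combined activation list per channel.
    """
    num_layers = len(features)
    num_channels = len(features[0])
    combined = []
    for c in range(num_channels):
        size = len(features[0][c])
        acc = [0.0] * size
        for l in range(num_layers):
            w = weights[l][c]
            for i in range(size):
                acc[i] += w * features[l][c][i]
        combined.append(acc)
    return combined

# Two preceding layers, two channels, two spatial positions each (made-up values).
features = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[8.0, 4.0], [30.0, 40.0]],
]
weights = [
    [1.0, 0.5],
    [0.25, 0.0],
]
print(cross_layer_depthwise(features, weights))
```

Under this reading, combining L layers of C channels needs L x C weights, whereas a dense 1x1 convolution over the concatenated layers would need L x C x C.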

Keystroke dynamics as signal for shallow syntactic parsing


Title	Keystroke dynamics as signal for shallow syntactic parsing
Authors	Barbara Plank
Abstract	Keystroke dynamics have been extensively used in psycholinguistic and writing research to gain insights into cognitive processing. But do keystroke logs contain actual signal that can be used to learn better natural language processing models? We postulate that keystroke dynamics contain information about syntactic structure that can inform shallow syntactic parsing. To test this hypothesis, we explore labels derived from keystroke logs as an auxiliary task in a multi-task bidirectional Long Short-Term Memory (bi-LSTM). Our experiments show promising results on two shallow syntactic parsing tasks, chunking and CCG supertagging. Our model is simple, has the advantage that data can come from distinct sources, and produces models that are significantly better than models trained on the text annotations alone.
Tasks	CCG Supertagging, Chunking
Published	2016-10-11
URL	http://arxiv.org/abs/1610.03321v1
PDF	http://arxiv.org/pdf/1610.03321v1.pdf
PWC	https://paperswithcode.com/paper/keystroke-dynamics-as-signal-for-shallow
Repo	https://github.com/bplank/coling2016ks
Framework	none

Network learning via multi-agent inverse transportation problems


Title	Network learning via multi-agent inverse transportation problems
Authors	Susan Jia Xu, Mehdi Nourinejad, Xuebo Lai, Joseph Y. J. Chow
Abstract	Despite the ubiquity of transportation data, methods to infer the state parameters of a network either ignore sensitivity of route decisions, require route enumeration for parameterizing descriptive models of route selection, or require complex bilevel models of route assignment behavior. These limitations prevent modelers from fully exploiting ubiquitous data in monitoring transportation networks. Inverse optimization methods that capture network route choice behavior can address this gap, but they are designed to take observations of the same model to learn the parameters of that model, which is statistically inefficient (e.g. requires estimating population route and link flows). New inverse optimization models and supporting algorithms are proposed to learn the parameters of heterogeneous travelers’ route behavior to infer shared network state parameters (e.g. link capacity dual prices). The inferred values are consistent with observations of each agent’s optimization behavior. We prove that the method can obtain unique dual prices for a network shared by these agents in polynomial time. Four experiments are conducted. The first one, conducted on a 4-node network, verifies the methodology to obtain heterogeneous link cost parameters even when multinomial or mixed logit models would not be meaningfully estimated. The second is a parameter recovery test on the Nguyen-Dupuis network that shows that unique latent link capacity dual prices can be inferred using the proposed method. The third test on the same network demonstrates how a monitoring system in an online learning environment can be designed using this method. The last test demonstrates this learning on real data obtained from a freeway network in Queens, New York, using only real-time Google Maps queries.
Tasks
Published	2016-09-14
URL	http://arxiv.org/abs/1609.04117v4
PDF	http://arxiv.org/pdf/1609.04117v4.pdf
PWC	https://paperswithcode.com/paper/network-learning-via-multi-agent-inverse
Repo	https://github.com/BUILTNYU/Network-learning-via-multi-agent-inverse-transportation-problems
Framework	none

Random Feature Expansions for Deep Gaussian Processes


Title	Random Feature Expansions for Deep Gaussian Processes
Authors	Kurt Cutajar, Edwin V. Bonilla, Pietro Michiardi, Maurizio Filippone
Abstract	The composition of multiple Gaussian Processes as a Deep Gaussian Process (DGP) enables a deep probabilistic nonparametric approach to flexibly tackle complex machine learning problems with sound quantification of uncertainty. Existing inference approaches for DGP models have limited scalability and are notoriously cumbersome to construct. In this work, we introduce a novel formulation of DGPs based on random feature expansions that we train using stochastic variational inference. This yields a practical learning framework which significantly advances the state-of-the-art in inference for DGPs, and enables accurate quantification of uncertainty. We extensively showcase the scalability and performance of our proposal on several datasets with up to 8 million observations, and various DGP architectures with up to 30 hidden layers.
Tasks	Gaussian Processes
Published	2016-10-14
URL	http://arxiv.org/abs/1610.04386v2
PDF	http://arxiv.org/pdf/1610.04386v2.pdf
PWC	https://paperswithcode.com/paper/random-feature-expansions-for-deep-gaussian
Repo	https://github.com/mauriziofilippone/deep_gp_random_features
Framework	tf
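
The random feature expansions mentioned in the abstract replace a kernel with an explicit finite-dimensional feature map. A minimal sketch of the classic random Fourier feature construction for a single RBF kernel follows; it illustrates the general idea only, assumes an RBF kernel with a fixed lengthscale, and is not the DGP model or the code in the linked repo. All names and values are hypothetical.

```python
import math
import random

def make_rff(dim, num_features, lengthscale=1.0, seed=0):
    """Build a random Fourier feature map approximating an RBF kernel."""
    rng = random.Random(seed)
    # Frequencies drawn from the kernel's spectral density: N(0, 1 / lengthscale^2).
    freqs = [[rng.gauss(0.0, 1.0 / lengthscale) for _ in range(dim)]
             for _ in range(num_features)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    scale = math.sqrt(2.0 / num_features)

    def phi(x):
        return [scale * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
                for w, b in zip(freqs, phases)]
    return phi

def rbf(x, y, lengthscale=1.0):
    """Exact RBF kernel exp(-||x - y||^2 / (2 * lengthscale^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * lengthscale ** 2))

phi = make_rff(dim=2, num_features=5000)
x, y = [0.3, -0.1], [0.5, 0.4]
approx = sum(a * b for a, b in zip(phi(x), phi(y)))  # inner product of features
exact = rbf(x, y)
print(approx, exact)
```

The inner product of the feature vectors approximates the kernel value, with an error that shrinks as `num_features` grows. Working with explicit features instead of a full kernel matrix is what makes this kind of approach scale to large datasets.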

Deep Spatio-Temporal Residual Networks for Citywide Crowd Flows Prediction


Title	Deep Spatio-Temporal Residual Networks for Citywide Crowd Flows Prediction
Authors	Junbo Zhang, Yu Zheng, Dekang Qi
Abstract	Forecasting the flow of crowds is of great importance to traffic management and public safety, yet a very challenging task affected by many complex factors, such as inter-region traffic, events and weather. In this paper, we propose a deep-learning-based approach, called ST-ResNet, to collectively forecast the in-flow and out-flow of crowds in each and every region through a city. We design an end-to-end structure of ST-ResNet based on unique properties of spatio-temporal data. More specifically, we employ the framework of the residual neural networks to model the temporal closeness, period, and trend properties of the crowd traffic, respectively. For each property, we design a branch of residual convolutional units, each of which models the spatial properties of the crowd traffic. ST-ResNet learns to dynamically aggregate the output of the three residual neural networks based on data, assigning different weights to different branches and regions. The aggregation is further combined with external factors, such as weather and day of the week, to predict the final traffic of crowds in each and every region. We evaluate ST-ResNet based on two types of crowd flows in Beijing and NYC, finding that its performance exceeds six well-known methods.
Tasks	Crowd Flows Prediction
Published	2016-10-01
URL	http://arxiv.org/abs/1610.00081v2
PDF	http://arxiv.org/pdf/1610.00081v2.pdf
PWC	https://paperswithcode.com/paper/deep-spatio-temporal-residual-networks-for
Repo	https://github.com/BruceBinBoxing/ST-ResNet-Pytorch
Framework	pytorch
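
The aggregation step described in the abstract (different weights for different branches and regions, then a combination with external factors) can be sketched as an elementwise weighted sum followed by a squashing nonlinearity. The version below is a simplified illustration that assumes one scalar per region, additive external factors, and a tanh output; the function and variable names are hypothetical, and this is not code from the linked repo.

```python
import math

def fuse_branches(closeness, period, trend, w_c, w_p, w_q, external):
    """Per-region weighted sum of three branch outputs plus an external term.

    Every argument is a flat list with one value per region; the w_* lists
    play the role of learned weights.
    """
    return [
        math.tanh(wc * xc + wp * xp + wq * xq + ext)
        for xc, xp, xq, wc, wp, wq, ext
        in zip(closeness, period, trend, w_c, w_p, w_q, external)
    ]

# Two regions with made-up branch outputs and weights.
prediction = fuse_branches(
    closeness=[0.5, 0.2],
    period=[0.1, 0.4],
    trend=[0.0, 0.3],
    w_c=[1.0, 0.2],
    w_p=[0.0, 0.7],
    w_q=[0.0, 0.1],
    external=[0.0, 0.05],
)
print(prediction)
```

In the first region only the closeness branch contributes, while the second leans mostly on the period branch, which mirrors the idea that the importance of each temporal property can vary by region.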

CAD2RL: Real Single-Image Flight without a Single Real Image


Title	CAD2RL: Real Single-Image Flight without a Single Real Image
Authors	Fereshteh Sadeghi, Sergey Levine
Abstract	Deep reinforcement learning has emerged as a promising and powerful technique for automatically acquiring control policies that can process raw sensory inputs, such as images, and perform complex behaviors. However, extending deep RL to real-world robotic tasks has proven challenging, particularly in safety-critical domains such as autonomous flight, where a trial-and-error learning process is often impractical. In this paper, we explore the following question: can we train vision-based navigation policies entirely in simulation, and then transfer them into the real world to achieve real-world flight without a single real training image? We propose a learning method that we call CAD$^2$RL, which can be used to perform collision-free indoor flight in the real world while being trained entirely on 3D CAD models. Our method uses single RGB images from a monocular camera, without needing to explicitly reconstruct the 3D geometry of the environment or perform explicit motion planning. Our learned collision avoidance policy is represented by a deep convolutional neural network that directly processes raw monocular images and outputs velocity commands. This policy is trained entirely on simulated images, with a Monte Carlo policy evaluation algorithm that directly optimizes the network’s ability to produce collision-free flight. By highly randomizing the rendering settings for our simulated training set, we show that we can train a policy that generalizes to the real world, without requiring the simulator to be particularly realistic or high-fidelity. We evaluate our method by flying a real quadrotor through indoor environments, and further evaluate the design choices in our simulator through a series of ablation studies on depth prediction. For supplementary video see: https://youtu.be/nXBWmzFrj5s
Tasks	Depth Estimation, Motion Planning
Published	2016-11-13
URL	http://arxiv.org/abs/1611.04201v4
PDF	http://arxiv.org/pdf/1611.04201v4.pdf
PWC	https://paperswithcode.com/paper/cad2rl-real-single-image-flight-without-a
Repo	https://github.com/abefetterman/hamstir-gym
Framework	tf

Generative Adversarial Text to Image Synthesis


Title	Generative Adversarial Text to Image Synthesis
Authors	Scott Reed, Zeynep Akata, Xinchen Yan, Lajanugen Logeswaran, Bernt Schiele, Honglak Lee
Abstract	Automatic synthesis of realistic images from text would be interesting and useful, but current AI systems are still far from this goal. However, in recent years generic and powerful recurrent neural network architectures have been developed to learn discriminative text feature representations. Meanwhile, deep convolutional generative adversarial networks (GANs) have begun to generate highly compelling images of specific categories, such as faces, album covers, and room interiors. In this work, we develop a novel deep architecture and GAN formulation to effectively bridge these advances in text and image modeling, translating visual concepts from characters to pixels. We demonstrate the capability of our model to generate plausible images of birds and flowers from detailed text descriptions.
Tasks	Adversarial Text, Image Generation, Text-to-Image Generation
Published	2016-05-17
URL	http://arxiv.org/abs/1605.05396v2
PDF	http://arxiv.org/pdf/1605.05396v2.pdf
PWC	https://paperswithcode.com/paper/generative-adversarial-text-to-image
Repo	https://github.com/rafiahmed40/stack-adverserial-network
Framework	tf

DeepBach: a Steerable Model for Bach Chorales Generation


Title	DeepBach: a Steerable Model for Bach Chorales Generation
Authors	Gaëtan Hadjeres, François Pachet, Frank Nielsen
Abstract	This paper introduces DeepBach, a graphical model aimed at modeling polyphonic music and specifically hymn-like pieces. We claim that, after being trained on the chorale harmonizations by Johann Sebastian Bach, our model is capable of generating highly convincing chorales in the style of Bach. DeepBach’s strength comes from the use of pseudo-Gibbs sampling coupled with an adapted representation of musical data. This is in contrast with many automatic music composition approaches which tend to compose music sequentially. Our model is also steerable in the sense that a user can constrain the generation by imposing positional constraints such as notes, rhythms or cadences in the generated score. We also provide a plugin on top of the MuseScore music editor making the interaction with DeepBach easy to use.
Tasks
Published	2016-12-03
URL	http://arxiv.org/abs/1612.01010v2
PDF	http://arxiv.org/pdf/1612.01010v2.pdf
PWC	https://paperswithcode.com/paper/deepbach-a-steerable-model-for-bach-chorales
Repo	https://github.com/ronggong/deepBach-code-explain
Framework	none

San Francisco Crime Classification


Title	San Francisco Crime Classification
Authors	Yehya Abouelnaga
Abstract	San Francisco Crime Classification is an online competition administered by Kaggle Inc. The competition aims at predicting future crimes based on a given set of geographical and time-based features. In this paper, I achieved an accuracy that ranks in the top 18%, as of May 19th, 2016. I will explore the data, and explain in detail the tools I used to achieve that result.
Tasks	Crime Prediction
Published	2016-07-13
URL	http://arxiv.org/abs/1607.03626v1
PDF	http://arxiv.org/pdf/1607.03626v1.pdf
PWC	https://paperswithcode.com/paper/san-francisco-crime-classification
Repo	https://github.com/PranotiDesai/San-Francisco-Crime-Classification
Framework	none

De-identification of Patient Notes with Recurrent Neural Networks


Title	De-identification of Patient Notes with Recurrent Neural Networks
Authors	Franck Dernoncourt, Ji Young Lee, Ozlem Uzuner, Peter Szolovits
Abstract	Objective: Patient notes in electronic health records (EHRs) may contain critical information for medical investigations. However, the vast majority of medical investigators can only access de-identified notes, in order to protect the confidentiality of patients. In the United States, the Health Insurance Portability and Accountability Act (HIPAA) defines 18 types of protected health information (PHI) that needs to be removed to de-identify patient notes. Manual de-identification is impractical given the size of EHR databases, the limited number of researchers with access to the non-de-identified notes, and the frequent mistakes of human annotators. A reliable automated de-identification system would consequently be of high value. Materials and Methods: We introduce the first de-identification system based on artificial neural networks (ANNs), which requires no handcrafted features or rules, unlike existing systems. We compare the performance of the system with state-of-the-art systems on two datasets: the i2b2 2014 de-identification challenge dataset, which is the largest publicly available de-identification dataset, and the MIMIC de-identification dataset, which we assembled and is twice as large as the i2b2 2014 dataset. Results: Our ANN model outperforms the state-of-the-art systems. It yields an F1-score of 97.85 on the i2b2 2014 dataset, with a recall of 97.38 and a precision of 97.32, and an F1-score of 99.23 on the MIMIC de-identification dataset, with a recall of 99.25 and a precision of 99.06. Conclusion: Our findings support the use of ANNs for de-identification of patient notes, as they show better performance than previously published systems while requiring no feature engineering.
Tasks	Feature Engineering
Published	2016-06-10
URL	http://arxiv.org/abs/1606.03475v1
PDF	http://arxiv.org/pdf/1606.03475v1.pdf
PWC	https://paperswithcode.com/paper/de-identification-of-patient-notes-with
Repo	https://github.com/Franck-Dernoncourt/NeuroNER
Framework	tf

Infrared Colorization Using Deep Convolutional Neural Networks


Title	Infrared Colorization Using Deep Convolutional Neural Networks
Authors	Matthias Limmer, Hendrik P. A. Lensch
Abstract	This paper proposes a method for transferring the RGB color spectrum to near-infrared (NIR) images using deep multi-scale convolutional neural networks. A direct and integrated transfer between NIR and RGB pixels is trained. The trained model does not require any user guidance or a reference image database in the recall phase to produce images with a natural appearance. To preserve the rich details of the NIR image, its high frequency features are transferred to the estimated RGB image. The presented approach is trained and evaluated on a real-world dataset containing a large amount of road scene images in summer. The dataset was captured by a multi-CCD NIR/RGB camera, which ensures a perfect pixel to pixel registration.
Tasks	Colorization
Published	2016-04-08
URL	http://arxiv.org/abs/1604.02245v3
PDF	http://arxiv.org/pdf/1604.02245v3.pdf
PWC	https://paperswithcode.com/paper/infrared-colorization-using-deep
Repo	https://github.com/raoniranjan/Infrared-Image-Colorization-using-Deep-Neural-Networks
Framework	none