Paper Group ANR 110
Sequence-to-sequence Pre-training with Data Augmentation for Sentence Rewriting
Title | Sequence-to-sequence Pre-training with Data Augmentation for Sentence Rewriting |
Authors | Yi Zhang, Tao Ge, Furu Wei, Ming Zhou, Xu Sun |
Abstract | We study sequence-to-sequence (seq2seq) pre-training with data augmentation for sentence rewriting. Instead of training a seq2seq model with gold training data and augmented data simultaneously, we separate them to train in different phases: pre-training with the augmented data and fine-tuning with the gold data. We also introduce multiple data augmentation methods to help model pre-training for sentence rewriting. We evaluate our approach in two typical well-defined sentence rewriting tasks: Grammatical Error Correction (GEC) and Formality Style Transfer (FST). Experiments demonstrate our approach can better utilize augmented data without hurting the model’s trust in gold data and further improve the model’s performance with our proposed data augmentation methods. Our approach substantially advances the state-of-the-art results in well-recognized sentence rewriting benchmarks over both GEC and FST. Specifically, it pushes the CoNLL-2014 benchmark’s $F_{0.5}$ score and JFLEG Test GLEU score to 62.61 and 63.54 in the restricted training setting, 66.77 and 65.22 respectively in the unrestricted setting, and advances GYAFC benchmark’s BLEU to 74.24 (2.23 absolute improvement) in E&M domain and 77.97 (2.64 absolute improvement) in F&R domain. |
Tasks | Data Augmentation, Grammatical Error Correction, Style Transfer |
Published | 2019-09-13 |
URL | https://arxiv.org/abs/1909.06002v2 |
https://arxiv.org/pdf/1909.06002v2.pdf | |
PWC | https://paperswithcode.com/paper/sequence-to-sequence-pre-training-with-data |
Repo | |
Framework | |
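The two-phase schedule described in the abstract above — pre-train on augmented data, then fine-tune on gold data — can be sketched with a toy stand-in model. The single-parameter linear "model" and SGD loop below are illustrative assumptions, not the paper's seq2seq architecture.

```python
# Sketch of the two-phase training schedule: phase 1 pre-trains on noisy
# augmented pairs, phase 2 fine-tunes on clean gold pairs only.

def sgd_epoch(w, data, lr):
    """One pass of SGD on squared error for a toy linear model y = w * x."""
    for x, y in data:
        grad = 2 * (w * x - y) * x
        w -= lr * grad
    return w

def train_two_phase(augmented, gold, lr=0.1, pre_epochs=20, ft_epochs=20):
    w = 0.0
    for _ in range(pre_epochs):   # phase 1: pre-training on augmented data
        w = sgd_epoch(w, augmented, lr)
    for _ in range(ft_epochs):    # phase 2: fine-tuning on gold data only
        w = sgd_epoch(w, gold, lr)
    return w

# Augmented targets are perturbed; gold targets follow y = 2x exactly.
augmented = [(1.0, 1.7), (2.0, 4.4), (3.0, 5.6)]
gold = [(1.0, 2.0), (2.0, 4.0)]
w = train_two_phase(augmented, gold)
```

Because the gold phase comes last, the final parameters track the clean data rather than a compromise between clean and noisy targets — the intuition behind separating the two phases.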
Robust and Adaptive Planning under Model Uncertainty
Title | Robust and Adaptive Planning under Model Uncertainty |
Authors | Apoorva Sharma, James Harrison, Matthew Tsao, Marco Pavone |
Abstract | Planning under model uncertainty is a fundamental problem across many applications of decision making and learning. In this paper, we propose the Robust Adaptive Monte Carlo Planning (RAMCP) algorithm, which allows computation of risk-sensitive Bayes-adaptive policies that optimally trade off exploration, exploitation, and robustness. RAMCP formulates the risk-sensitive planning problem as a two-player zero-sum game, in which an adversary perturbs the agent's belief over the models. We introduce two versions of the RAMCP algorithm. The first, RAMCP-F, converges to an optimal risk-sensitive policy without having to rebuild the search tree as the underlying belief over models is perturbed. The second version, RAMCP-I, improves computational efficiency at the cost of losing theoretical guarantees, but is shown to yield empirical results comparable to those of RAMCP-F. RAMCP is demonstrated on an n-pull multi-armed bandit problem, as well as on a patient treatment scenario. |
Tasks | Decision Making |
Published | 2019-01-09 |
URL | http://arxiv.org/abs/1901.02577v1 |
http://arxiv.org/pdf/1901.02577v1.pdf | |
PWC | https://paperswithcode.com/paper/robust-and-adaptive-planning-under-model |
Repo | |
Framework | |
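The adversarial belief perturbation in the abstract above has a standard closed form under an entropic (exponential) risk measure: the adversary tilts the prior toward low-value models. The sketch below shows only that tilting, under the assumption of an entropic risk objective; RAMCP's tree search and re-weighting schedule are the paper's own and are not reproduced here.

```python
import math

def adversarial_belief(prior, values, lam):
    """Tilt a prior over models toward pessimism: q_i ∝ p_i * exp(-lam * V_i).
    lam = 0 recovers the prior; larger lam is more risk-averse."""
    weights = [p * math.exp(-lam * v) for p, v in zip(prior, values)]
    z = sum(weights)
    return [w / z for w in weights]

def risk_sensitive_value(prior, values, lam):
    """Entropic risk: -(1/lam) * log E_p[exp(-lam * V)], always <= E_p[V]."""
    return -math.log(sum(p * math.exp(-lam * v)
                         for p, v in zip(prior, values))) / lam

prior = [0.5, 0.5]
values = [1.0, 3.0]   # value of the current policy under each candidate model
q = adversarial_belief(prior, values, lam=1.0)
v_risk = risk_sensitive_value(prior, values, lam=1.0)
```

The adversary places more mass on the worse model, so the risk-sensitive value sits below the plain expected value of 2.0 — the "robustness" leg of the exploration/exploitation/robustness trade-off.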
3D Conditional Generative Adversarial Networks to enable large-scale seismic image enhancement
Title | 3D Conditional Generative Adversarial Networks to enable large-scale seismic image enhancement |
Authors | Praneet Dutta, Bruce Power, Adam Halpert, Carlos Ezequiel, Aravind Subramanian, Chanchal Chatterjee, Sindhu Hari, Kenton Prindle, Vishal Vaddina, Andrew Leach, Raj Domala, Laura Bandura, Massimo Mascaro |
Abstract | We propose GAN-based image enhancement models for frequency enhancement of 2D and 3D seismic images. Seismic imagery is used to understand and characterize the Earth’s subsurface for energy exploration. Because these images often suffer from resolution limitations and noise contamination, our proposed method performs large-scale seismic volume frequency enhancement and denoising. The enhanced images reduce uncertainty and improve decisions about issues, such as optimal well placement, that often rely on low signal-to-noise ratio (SNR) seismic volumes. We explored the impact of adding lithology class information to the models, resulting in improved performance on PSNR and SSIM metrics over a baseline model with no conditional information. |
Tasks | Denoising, Image Enhancement |
Published | 2019-11-16 |
URL | https://arxiv.org/abs/1911.06932v1 |
https://arxiv.org/pdf/1911.06932v1.pdf | |
PWC | https://paperswithcode.com/paper/3d-conditional-generative-adversarial |
Repo | |
Framework | |
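The abstract above reports gains on PSNR and SSIM. PSNR has a standard definition, sketched here for float-valued images; SSIM involves local windowed statistics and is omitted.

```python
import math

def psnr(reference, test, max_val=1.0):
    """Peak signal-to-noise ratio in dB between two equal-size images
    (given as lists of rows of floats in [0, max_val])."""
    flat_ref = [v for row in reference for v in row]
    flat_tst = [v for row in test for v in row]
    mse = sum((a - b) ** 2 for a, b in zip(flat_ref, flat_tst)) / len(flat_ref)
    if mse == 0:
        return float("inf")   # identical images
    return 10 * math.log10(max_val ** 2 / mse)

clean = [[0.0, 0.5], [1.0, 0.25]]
noisy = [[0.1, 0.5], [0.9, 0.25]]
```

A denoising model that raises PSNR against a clean reference volume is, by this definition, reducing mean squared error on a log scale.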
Multiple Hypothesis Tracking Algorithm for Multi-Target Multi-Camera Tracking with Disjoint Views
Title | Multiple Hypothesis Tracking Algorithm for Multi-Target Multi-Camera Tracking with Disjoint Views |
Authors | Kwangjin Yoon, Young-min Song, Moongu Jeon |
Abstract | In this study, a multiple hypothesis tracking (MHT) algorithm for multi-target multi-camera tracking (MCT) with disjoint views is proposed. Our method forms track-hypothesis trees, each branch of which represents a multi-camera track of a target that may move within a camera as well as across cameras. Furthermore, multi-target tracking within a camera is performed simultaneously with the tree formation by manipulating the status of each track hypothesis. Each status represents one of three stages of a multi-camera track: tracking, searching, and end-of-track. The tracking status means the target is tracked by a single-camera tracker. In the searching status, disappeared targets are examined to determine whether they reappear in other cameras. The end-of-track status indicates that the target has exited the camera network due to its lengthy invisibility. These three statuses assist MHT in forming the track-hypothesis trees for multi-camera tracking, and they also enable a gating technique for eliminating unlikely observation-to-track associations. In the experiments, we evaluate the proposed method on two datasets, DukeMTMC and NLPR-MCT, demonstrating that it outperforms the state-of-the-art method in terms of accuracy. In addition, we show that the proposed method can operate online and in real time. |
Tasks | |
Published | 2019-01-25 |
URL | http://arxiv.org/abs/1901.08787v1 |
http://arxiv.org/pdf/1901.08787v1.pdf | |
PWC | https://paperswithcode.com/paper/multiple-hypothesis-tracking-algorithm-for |
Repo | |
Framework | |
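The three-stage status of a track hypothesis described in the abstract above (tracking, searching, end-of-track) behaves like a small state machine. The sketch below is a minimal illustration; the invisibility threshold and transition rules are assumptions, not the paper's exact gating logic.

```python
TRACKING, SEARCHING, END_OF_TRACK = "tracking", "searching", "end-of-track"

class TrackHypothesis:
    """Toy status machine for one branch of a track-hypothesis tree."""

    def __init__(self, max_invisible_frames=5):
        self.status = TRACKING
        self.invisible = 0
        self.max_invisible = max_invisible_frames

    def update(self, observed_in_some_camera):
        if observed_in_some_camera:
            self.invisible = 0
            self.status = TRACKING        # (re)acquired by a single-camera tracker
        else:
            self.invisible += 1
            if self.invisible > self.max_invisible:
                self.status = END_OF_TRACK  # lengthy invisibility: assume target left
            else:
                self.status = SEARCHING     # target may reappear in another camera
        return self.status

track = TrackHypothesis(max_invisible_frames=2)
```

A target that disappears stays in the searching status for a bounded number of frames (during which cross-camera re-identification is attempted) before the hypothesis is closed out.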
Machine learning models show similar performance to Renewables.ninja for generation of long-term wind power time series even without location information
Title | Machine learning models show similar performance to Renewables.ninja for generation of long-term wind power time series even without location information |
Authors | Johann Baumgartner, Katharina Gruber, Sofia Simoes, Yves-Marie Saint-Drenan, Johannes Schmidt |
Abstract | Driven by climatic processes, wind power generation is inherently variable. Long-term simulated wind power time series are therefore an essential component for understanding the temporal availability of wind power and its integration into future renewable energy systems. In the recent past, mainly power-curve-based models such as Renewables.ninja (RN) have been used for deriving synthetic time series for wind power generation, despite their need for accurate location information and bias correction, and their insufficient replication of extreme events and short-term power ramps. We assess how time series generated by machine learning models (MLM) compare to RN in terms of their ability to replicate the characteristics of observed nationally aggregated wind power generation for Germany. To this end, we apply neural networks to one MERRA2 reanalysis wind speed input dataset with no location information and one with basic location information. The resulting time series and the RN time series are compared with actual generation. Both MLM time series feature equal or even better quality than RN, depending on the characteristics considered. We conclude that MLM can, even with reduced information on turbine locations and turbine types, produce time series of at least equal quality to RN. |
Tasks | Time Series |
Published | 2019-12-09 |
URL | https://arxiv.org/abs/1912.09426v1 |
https://arxiv.org/pdf/1912.09426v1.pdf | |
PWC | https://paperswithcode.com/paper/machine-learning-models-show-similar |
Repo | |
Framework | |
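The power-curve-based baseline contrasted in the abstract above maps wind speed to power output through a turbine power curve. A canonical idealised curve is sketched below; the cut-in, rated, and cut-out speeds are illustrative values, not those of any turbine in the study.

```python
def power_curve(wind_speed, cut_in=3.0, rated_speed=12.0, cut_out=25.0):
    """Fraction of rated power produced at a given wind speed (m/s):
    zero below cut-in and above cut-out, a cubic ramp in between,
    and full rated power between rated speed and cut-out."""
    if wind_speed < cut_in or wind_speed >= cut_out:
        return 0.0
    if wind_speed >= rated_speed:
        return 1.0
    # cubic ramp between cut-in and rated speed (power scales with v^3)
    return ((wind_speed ** 3 - cut_in ** 3)
            / (rated_speed ** 3 - cut_in ** 3))
```

A machine learning model, by contrast, learns the speed-to-power mapping (including aggregation effects across many turbines) directly from observed generation, which is why it can tolerate missing location and turbine-type information.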
Graph Attribute Aggregation Network with Progressive Margin Folding
Title | Graph Attribute Aggregation Network with Progressive Margin Folding |
Authors | Penghui Sun, Jingwei Qu, Xiaoqing Lyu, Haibin Ling, Zhi Tang |
Abstract | Graph convolutional neural networks (GCNNs) have been attracting increasing research attention due to their great potential for inference over graph structures. However, insufficient effort has been devoted to the aggregation methods between different graph convolution layers. In this paper, we introduce a graph attribute aggregation network (GAAN) architecture. Different from conventional pooling operations, a graph-transformation-based aggregation strategy, progressive margin folding (PMF), is proposed for integrating graph features. By distinguishing internal and margin elements, we provide an approach for applying the folding iteratively, and a mechanism is also devised for preserving local structures during progressive folding. In addition, a hypergraph-based representation is introduced for transferring the aggregated information between different layers. Our experiments on public molecule datasets demonstrate that the proposed GAAN outperforms existing GCNN models by a significant margin. |
Tasks | |
Published | 2019-05-14 |
URL | https://arxiv.org/abs/1905.05347v1 |
https://arxiv.org/pdf/1905.05347v1.pdf | |
PWC | https://paperswithcode.com/paper/graph-attribute-aggregation-network-with |
Repo | |
Framework | |
Backpropagation Algorithms and Reservoir Computing in Recurrent Neural Networks for the Forecasting of Complex Spatiotemporal Dynamics
Title | Backpropagation Algorithms and Reservoir Computing in Recurrent Neural Networks for the Forecasting of Complex Spatiotemporal Dynamics |
Authors | Pantelis R. Vlachas, Jaideep Pathak, Brian R. Hunt, Themistoklis P. Sapsis, Michelle Girvan, Edward Ott, Petros Koumoutsakos |
Abstract | We examine the efficiency of Recurrent Neural Networks in forecasting the spatiotemporal dynamics of high-dimensional and reduced-order complex systems using Reservoir Computing (RC) and Backpropagation Through Time (BPTT) for gated network architectures. We highlight the advantages and limitations of each method and discuss their implementation for parallel computing architectures. We quantify the relative prediction accuracy of these algorithms for the long-term forecasting of chaotic systems using as benchmarks the Lorenz-96 and the Kuramoto-Sivashinsky (KS) equations. We find that, when the full state dynamics are available for training, RC outperforms BPTT approaches in terms of predictive performance and in capturing the long-term statistics, while at the same time requiring much less training time. However, in the case of reduced-order data, large-scale RC models can be unstable and are more likely than the BPTT algorithms to diverge. In contrast, RNNs trained via BPTT show superior forecasting abilities and capture well the dynamics of reduced-order systems. Furthermore, the present study quantifies for the first time the Lyapunov spectrum of the KS equation with BPTT, achieving similar accuracy to RC. This study establishes that RNNs are a potent computational framework for the learning and forecasting of complex spatiotemporal systems. |
Tasks | |
Published | 2019-10-09 |
URL | https://arxiv.org/abs/1910.05266v2 |
https://arxiv.org/pdf/1910.05266v2.pdf | |
PWC | https://paperswithcode.com/paper/forecasting-of-spatio-temporal-chaotic |
Repo | |
Framework | |
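The RC building block compared in the abstract above is the echo state update h(t+1) = tanh(W_res h(t) + W_in u(t)), where only a linear readout (omitted here) is trained. The reservoir size, weight scaling, and input signal below are simplified assumptions for illustration.

```python
import math, random

random.seed(0)
N, scale = 16, 0.5   # small reservoir; modest scaling keeps dynamics stable

# Fixed (untrained) random recurrent and input weights
W_res = [[random.uniform(-scale / N ** 0.5, scale / N ** 0.5)
          for _ in range(N)] for _ in range(N)]
W_in = [random.uniform(-0.5, 0.5) for _ in range(N)]

def step(h, u):
    """One reservoir update for scalar input u: h' = tanh(W_res h + W_in u)."""
    return [math.tanh(sum(W_res[i][j] * h[j] for j in range(N)) + W_in[i] * u)
            for i in range(N)]

h = [0.0] * N
for t in range(50):              # drive the reservoir with a sinusoidal input
    h = step(h, math.sin(0.3 * t))
```

Because only the readout is trained, RC avoids backpropagating through time entirely, which is the source of the training-time advantage the study reports; the instability it observes for reduced-order data arises when this fixed recurrent dynamics is a poor match for the unobserved degrees of freedom.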
A Methodological Review of Visual Road Recognition Procedures for Autonomous Driving Applications
Title | A Methodological Review of Visual Road Recognition Procedures for Autonomous Driving Applications |
Authors | Kai Li Lim, Thomas Bräunl |
Abstract | The current research interest in autonomous driving is growing at a rapid pace, attracting great investments from both the academic and corporate sectors. For vehicles to be fully autonomous, it is imperative that the driver assistance system is adept at road and lane keeping. In this paper, we present a methodological review of techniques with a focus on visual road detection and recognition. We adopt a pragmatic outlook in presenting this review, whereby the procedures of road recognition are emphasised with respect to their practical implementations. The contribution of this review hence covers the topic in two parts: the first part describes the methodological approach to conventional road detection, covering the algorithms and approaches involved in classifying and segregating roads from non-road regions; the second part focuses on recent state-of-the-art machine learning techniques applied to visual road recognition, with an emphasis on methods that incorporate convolutional neural networks and semantic segmentation. A subsequent overview of recent implementations in the commercial sector is also presented, along with some recent research works pertaining to road detection. |
Tasks | Autonomous Driving, Semantic Segmentation |
Published | 2019-05-05 |
URL | https://arxiv.org/abs/1905.01635v1 |
https://arxiv.org/pdf/1905.01635v1.pdf | |
PWC | https://paperswithcode.com/paper/a-methodological-review-of-visual-road |
Repo | |
Framework | |
One-time learning in a biologically-inspired Salience-affected Artificial Neural Network (SANN)
Title | One-time learning in a biologically-inspired Salience-affected Artificial Neural Network (SANN) |
Authors | Leendert A Remmelzwaal, George F R Ellis, Jonathan Tapson |
Abstract | In this paper we introduce a novel Salience-Affected Artificial Neural Network (SANN) that models the way neuromodulators such as dopamine and noradrenaline affect neural dynamics in the human brain by being distributed diffusely through neocortical regions. This allows one-time learning to take place through strengthening entire patterns of activation at one go. We present a model that accepts a salience signal and returns a reverse salience signal. We demonstrate that we can tag an image with salience with only a single training iteration, and that the same image then produces the highest reverse salience signal during classification. We explore the effects of salience on learning via its effect on the activation functions of each node, as well as on the strength of weights in the network. We demonstrate that a salience signal improves classification accuracy of the specific image that was tagged with salience, as well as of all images in the same class, while penalizing images in other classes. Results are validated using 5-fold validation testing on the MNIST and Fashion-MNIST datasets. This research serves as a proof of concept and could be a first step towards introducing salience tagging into deep learning networks and robotics. |
Tasks | |
Published | 2019-08-09 |
URL | https://arxiv.org/abs/1908.03532v4 |
https://arxiv.org/pdf/1908.03532v4.pdf | |
PWC | https://paperswithcode.com/paper/one-time-learning-and-reverse-salience-signal |
Repo | |
Framework | |
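The one-shot tagging idea in the abstract above can be caricatured as storing a salience value against every currently active unit in a single pass, and reading it back as the overlap between the stored map and a new activation pattern. The real SANN modifies activation functions and weights; the toy overlap model below is an illustrative assumption only.

```python
def tag_salience(salience_map, activations, salience):
    """One-shot tagging: strengthen salience on all active units at one go."""
    return [s + salience * a for s, a in zip(salience_map, activations)]

def reverse_salience(salience_map, activations):
    """Reverse salience signal: overlap of stored salience with activity."""
    return sum(s * a for s, a in zip(salience_map, activations))

pattern_a = [1.0, 1.0, 0.0, 0.0]   # activation pattern of the tagged image
pattern_b = [0.0, 0.0, 1.0, 1.0]   # pattern from a different class

smap = [0.0] * 4
smap = tag_salience(smap, pattern_a, salience=1.0)  # single training iteration
```

After one tagging step, the tagged pattern (and, in the full model, similar patterns in its class) yields a higher reverse salience signal than patterns from other classes.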
On Scalable and Efficient Computation of Large Scale Optimal Transport
Title | On Scalable and Efficient Computation of Large Scale Optimal Transport |
Authors | Yujia Xie, Minshuo Chen, Haoming Jiang, Tuo Zhao, Hongyuan Zha |
Abstract | Optimal Transport (OT) naturally arises in many machine learning applications, yet the heavy computational burden limits its widespread use. To address the scalability issue, we propose an implicit generative learning-based framework called SPOT (Scalable Push-forward of Optimal Transport). Specifically, we approximate the optimal transport plan by a pushforward of a reference distribution and cast the optimal transport problem into a minimax problem. We can then solve OT problems efficiently using primal-dual stochastic gradient-type algorithms. We also show that we can recover the density of the optimal transport plan using neural ordinary differential equations. Numerical experiments on both synthetic and real datasets illustrate that SPOT is robust and has favorable convergence behavior. SPOT also allows us to efficiently sample from the optimal transport plan, which benefits downstream applications such as domain adaptation. |
Tasks | Domain Adaptation |
Published | 2019-05-01 |
URL | https://arxiv.org/abs/1905.00158v3 |
https://arxiv.org/pdf/1905.00158v3.pdf | |
PWC | https://paperswithcode.com/paper/on-scalable-and-efficient-computation-of |
Repo | |
Framework | |
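SPOT itself requires neural networks and minimax training, so it is not reproduced here. As a point of reference for the scalability problem the abstract above addresses, the classical entropic-regularisation baseline for small discrete OT problems (Sinkhorn iterations) is sketched below; this is a standard method, not the paper's algorithm.

```python
import math

def sinkhorn(a, b, cost, eps=0.1, iters=200):
    """Entropic OT between histograms a, b with cost matrix `cost`.
    Returns the transport plan P with P[i][j] = u[i] * K[i][j] * v[j]."""
    n, m = len(a), len(b)
    K = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(iters):   # alternating scaling to match both marginals
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]

a, b = [0.5, 0.5], [0.5, 0.5]
cost = [[0.0, 1.0], [1.0, 0.0]]
P = sinkhorn(a, b, cost)
```

Sinkhorn's cost grows with the number of support points of each distribution, which is exactly the regime where a parametric pushforward approach like SPOT becomes attractive.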
Fuzzy C-Means Clustering and Sonification of HRV Features
Title | Fuzzy C-Means Clustering and Sonification of HRV Features |
Authors | Debanjan Borthakur, Victoria Grace, Paul Batchelor, Harishchandra Dubey, Kunal Mankodiya |
Abstract | Linear and non-linear measures of heart rate variability (HRV) are widely investigated as non-invasive indicators of health. Stress has a profound impact on heart rate, and different meditation techniques have been found to modulate heartbeat rhythm. This paper aims to explore the process of identifying appropriate metrics from HRV analysis for sonification. Sonification is a type of auditory display involving the process of mapping data to acoustic parameters. This work explores the use of auditory display in aiding the analysis of HRV, leveraged by unsupervised machine learning techniques. Unsupervised clustering helps select the appropriate features to improve the interpretability of the sonification. Vocal synthesis sonification techniques are employed to increase the comprehension and learnability of the processed data displayed through sound. These analyses are early steps in building a real-time sound-based biofeedback training system. |
Tasks | Heart Rate Variability |
Published | 2019-08-19 |
URL | https://arxiv.org/abs/1908.07107v2 |
https://arxiv.org/pdf/1908.07107v2.pdf | |
PWC | https://paperswithcode.com/paper/fuzzy-c-means-clustering-and-sonification-of |
Repo | |
Framework | |
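The fuzzy c-means algorithm named in the title above is Bezdek's classical alternating scheme: update soft memberships, then recompute centers as membership-weighted means. The sketch below runs it on 1-D toy features; the paper's HRV feature extraction and sonification mapping are out of scope here.

```python
def fcm(points, centers, m=2.0, iters=50):
    """1-D fuzzy c-means: alternate membership and center updates."""
    for _ in range(iters):
        # membership: u_ik = 1 / sum_j (d_ik / d_jk)^(2 / (m - 1))
        U = []
        for x in points:
            dists = [max(abs(x - c), 1e-12) for c in centers]
            row = [1.0 / sum((d_i / d_j) ** (2 / (m - 1)) for d_j in dists)
                   for d_i in dists]
            U.append(row)
        # centers: mean of points weighted by u^m
        centers = [sum(U[k][i] ** m * points[k] for k in range(len(points)))
                   / sum(U[k][i] ** m for k in range(len(points)))
                   for i in range(len(centers))]
    return centers, U

points = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]   # two well-separated toy clusters
centers, U = fcm(points, centers=[0.5, 4.5])
```

Unlike hard k-means, every point keeps a graded membership in every cluster, which is what makes the memberships usable as continuous control signals for sonification.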
Raiders of the Lost Art
Title | Raiders of the Lost Art |
Authors | Anthony Bourached, George Cann |
Abstract | Neural style transfer, first proposed by Gatys et al. (2015), can be used to create novel artistic work by rendering a content image in the form of a style image. We present a novel method of reconstructing lost artwork by applying neural style transfer to x-radiographs of paintings that contain secondary interior artwork beneath a primary exterior. Finally, we reflect on AI art exhibitions and discuss the social, cultural, ethical, and philosophical impact of these technical innovations. |
Tasks | Style Transfer |
Published | 2019-09-10 |
URL | https://arxiv.org/abs/1909.05677v1 |
https://arxiv.org/pdf/1909.05677v1.pdf | |
PWC | https://paperswithcode.com/paper/raiders-of-the-lost-art |
Repo | |
Framework | |
AI in Pursuit of Happiness, Finding Only Sadness: Multi-Modal Facial Emotion Recognition Challenge
Title | AI in Pursuit of Happiness, Finding Only Sadness: Multi-Modal Facial Emotion Recognition Challenge |
Authors | Carl Norman |
Abstract | The importance of automated Facial Emotion Recognition (FER) grows the more common human-machine interactions become, which will only continue to increase dramatically with time. A common method to describe human sentiment or feeling is the categorical model of the '7 basic emotions', consisting of 'Angry', 'Disgust', 'Fear', 'Happiness', 'Sadness', 'Surprise' and 'Neutral'. The 'Emotion Recognition in the Wild' (EmotiW) competition is now in its 7th year and has become the standard benchmark for measuring FER performance. The focus of this paper is the EmotiW sub-challenge of classifying videos in the 'Acted Facial Expression in the Wild' (AFEW) dataset, consisting of both visual and audio modalities, into one of the above classes. Machine learning has exploded as a research topic in recent years, with advancements in 'Deep Learning' a key part of this. Although Deep Learning techniques have been widely applied to the FER task by entrants in previous years, this paper has two main contributions: (i) applying the latest state-of-the-art visual and temporal networks and (ii) exploring various methods of fusing features extracted from the visual and audio elements to enrich the information available to the final model making the prediction. There are a number of complex issues that arise when trying to classify emotions for 'in-the-wild' video sequences, which the above two approaches attempt to directly address. There are some positive findings when comparing the results of this paper to past submissions, indicating that further research into the proposed methods and fine-tuning of the models deployed could result in another step forward in the field of automated FER. |
Tasks | Emotion Recognition |
Published | 2019-10-24 |
URL | https://arxiv.org/abs/1911.05187v1 |
https://arxiv.org/pdf/1911.05187v1.pdf | |
PWC | https://paperswithcode.com/paper/ai-in-pursuit-of-happiness-finding-only |
Repo | |
Framework | |
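One common way to combine the visual and audio modalities discussed in the abstract above is late fusion: average per-class scores from each modality before taking the argmax. The weight and the toy score vectors below are illustrative assumptions; the paper explores several fusion variants, including feature-level fusion.

```python
EMOTIONS = ["Angry", "Disgust", "Fear", "Happiness",
            "Sadness", "Surprise", "Neutral"]

def late_fusion(visual_scores, audio_scores, visual_weight=0.6):
    """Weighted average of per-class scores from two modalities."""
    w = visual_weight
    return [w * v + (1 - w) * a for v, a in zip(visual_scores, audio_scores)]

def predict(scores):
    """Index of the highest-scoring class, mapped to its emotion label."""
    return EMOTIONS[max(range(len(scores)), key=lambda i: scores[i])]

visual = [0.1, 0.0, 0.1, 0.5, 0.1, 0.1, 0.1]   # visual model's class scores
audio  = [0.1, 0.1, 0.1, 0.2, 0.4, 0.0, 0.1]   # audio model's class scores
fused = late_fusion(visual, audio)
```

When the modalities disagree, the fused prediction follows whichever modality is both more confident and more heavily weighted, which is why tuning the fusion weight matters.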
Make Lead Bias in Your Favor: A Simple and Effective Method for News Summarization
Title | Make Lead Bias in Your Favor: A Simple and Effective Method for News Summarization |
Authors | Chenguang Zhu, Ziyi Yang, Robert Gmyr, Michael Zeng, Xuedong Huang |
Abstract | Lead bias is a common phenomenon in news summarization, where the early parts of an article often contain the most salient information. While many algorithms exploit this fact in summary generation, it has a detrimental effect on teaching the model to discriminate and extract important information. We propose that lead bias can be leveraged in a simple and effective way in our favor to pretrain abstractive news summarization models on a large-scale unlabeled corpus: predicting the leading sentences using the rest of an article. Via careful data cleaning and filtering, our transformer-based pretrained model achieves remarkable results on various news summarization tasks without any finetuning. With further finetuning, our model outperforms many competitive baseline models. Human evaluations further show the effectiveness of our method. |
Tasks | |
Published | 2019-12-25 |
URL | https://arxiv.org/abs/1912.11602v2 |
https://arxiv.org/pdf/1912.11602v2.pdf | |
PWC | https://paperswithcode.com/paper/make-lead-bias-in-your-favor-a-simple-and-2 |
Repo | |
Framework | |
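The pretraining objective in the abstract above — predict the leading sentences from the rest of the article — reduces to a simple pair construction over unlabeled articles. The sketch below uses a naive period split for sentence segmentation; the paper's actual data cleaning and filtering pipeline is more careful.

```python
def make_pretraining_pair(article, lead_k=3):
    """Split an article into (source = rest of article, target = lead),
    treating the first lead_k sentences as a pseudo-summary."""
    sentences = [s.strip() + "." for s in article.split(".") if s.strip()]
    target = " ".join(sentences[:lead_k])   # pseudo-summary: lead sentences
    source = " ".join(sentences[lead_k:])   # model input: the remainder
    return source, target

article = ("A fire broke out downtown. Two buildings were damaged. "
           "No injuries were reported. Officials said the cause is unknown. "
           "Crews remained on scene overnight.")
source, target = make_pretraining_pair(article, lead_k=3)
```

Because no human-written summaries are needed, this construction scales to arbitrarily large news corpora, which is what makes the pretraining "simple and effective".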
Intensity augmentation for domain transfer of whole breast segmentation in MRI
Title | Intensity augmentation for domain transfer of whole breast segmentation in MRI |
Authors | Linde S. Hesse, Grey Kuling, Mitko Veta, Anne L. Martel |
Abstract | The segmentation of the breast from the chest wall is an important first step in the analysis of breast magnetic resonance images. 3D U-nets have been shown to obtain high segmentation accuracy and appear to generalize well when trained on one scanner type and tested on another scanner, provided that a very similar T1-weighted MR protocol is used. There has, however, been little work addressing the problem of domain adaptation when image intensities or patient orientation differ markedly between the training set and an unseen test set. To overcome the domain shift we propose to apply extensive intensity augmentation in addition to geometric augmentation during training. We explored both style transfer and a novel intensity remapping approach as intensity augmentation strategies. For our experiments, we trained a 3D U-net on T1-weighted scans and tested on T2-weighted scans. By applying intensity augmentation we increased segmentation performance from a DSC of 0.71 to 0.90. This performance is very close to the baseline performance of training and testing on T2-weighted scans (0.92). Furthermore, we applied our network to an independent test set made up of publicly available scans acquired using a T1-weighted TWIST sequence and a different coil configuration. On this dataset we obtained a performance of 0.89, close to the inter-observer variability of the ground truth segmentations (0.92). Our results show that using intensity augmentation in addition to geometric augmentation is a suitable method to overcome the intensity domain shift and we expect it to be useful for a wide range of segmentation tasks. |
Tasks | Domain Adaptation, Style Transfer |
Published | 2019-09-05 |
URL | https://arxiv.org/abs/1909.02642v1 |
https://arxiv.org/pdf/1909.02642v1.pdf | |
PWC | https://paperswithcode.com/paper/intensity-augmentation-for-domain-transfer-of |
Repo | |
Framework | |
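The intensity remapping augmentation in the abstract above is described only at a high level; one plausible minimal version is to push pixel intensities through a random monotonic lookup table, which changes contrast while preserving the ordering of tissue intensities. The details below (table construction, 256 levels) are illustrative assumptions, not the paper's exact method.

```python
import random

def random_monotonic_lut(levels=256, rng=random):
    """Random non-decreasing lookup table mapping [0, levels) into [0, 1],
    built from normalised cumulative sums of random positive increments."""
    increments = [rng.random() for _ in range(levels)]
    total = sum(increments)
    lut, acc = [], 0.0
    for inc in increments:
        acc += inc
        lut.append(acc / total)
    return lut

def remap(image, lut):
    """Apply the lookup table to an integer-valued image (list of rows)."""
    return [[lut[p] for p in row] for row in image]

random.seed(42)
lut = random_monotonic_lut()
image = [[0, 64], [128, 255]]   # tiny toy image with increasing intensities
out = remap(image, lut)
```

Training with a fresh random table per sample exposes the network to many intensity distributions (T1-like, T2-like, and everything between), which is the intuition behind closing the intensity domain gap.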