January 25, 2020

Paper Group ANR 1750

Editing Text in the Wild


Title	Editing Text in the Wild
Authors	Liang Wu, Chengquan Zhang, Jiaming Liu, Junyu Han, Jingtuo Liu, Errui Ding, Xiang Bai
Abstract	In this paper, we are interested in editing text in natural images, which aims to replace or modify a word in the source image with another one while maintaining its realistic look. This task is challenging, as the styles of both background and text need to be preserved so that the edited image is visually indistinguishable from the source image. Specifically, we propose an end-to-end trainable style retention network (SRNet) that consists of three modules: text conversion module, background inpainting module and fusion module. The text conversion module changes the text content of the source image into the target text while keeping the original text style. The background inpainting module erases the original text, and fills the text region with appropriate texture. The fusion module combines the information from the two former modules, and generates the edited text images. To our knowledge, this work is the first attempt to edit text in natural images at the word level. Both visual effects and quantitative results on synthetic and real-world datasets (ICDAR 2013) fully confirm the importance and necessity of modular decomposition. We also conduct extensive experiments to validate the usefulness of our method in various real-world applications such as text image synthesis, augmented reality (AR) translation, information hiding, etc.
Tasks	Image Generation
Published	2019-08-08
URL	https://arxiv.org/abs/1908.03047v1
PDF	https://arxiv.org/pdf/1908.03047v1.pdf
PWC	https://paperswithcode.com/paper/editing-text-in-the-wild
Repo
Framework

Automatic Post-Editing for Machine Translation


Title	Automatic Post-Editing for Machine Translation
Authors	Rajen Chatterjee
Abstract	Automatic Post-Editing (APE) aims to correct systematic errors in a machine translated text. This is primarily useful when the machine translation (MT) system is not accessible for improvement, leaving APE as a viable option to improve translation quality as a downstream task - which is the focus of this thesis. This field has received less attention compared to MT due to several reasons, which include: the limited availability of data for sound research, contrasting views reported by different researchers about the effectiveness of APE, and limited attention from the industry to use APE in current production pipelines. In this thesis, we perform a thorough investigation of APE as a downstream task in order to: i) understand its potential to improve translation quality; ii) advance the core technology - starting from classical methods to recent deep-learning based solutions; iii) cope with limited and sparse data; iv) better leverage multiple input sources; v) mitigate the task-specific problem of over-correction; vi) enhance neural decoding to leverage external knowledge; and vii) establish an online learning framework to handle data diversity in real-time. All the above contributions are discussed across several chapters, and most of them are evaluated in the APE shared task organized each year at the Conference on Machine Translation. Our efforts in improving the technology resulted in the best system at the 2017 APE shared task, and our work on online learning received a distinguished paper award at the Italian Conference on Computational Linguistics. Overall, outcomes and findings of our work have boosted interest among researchers and attracted industries to examine this technology to solve real-world problems.
Tasks	Automatic Post-Editing, Machine Translation
Published	2019-10-18
URL	https://arxiv.org/abs/1910.08592v1
PDF	https://arxiv.org/pdf/1910.08592v1.pdf
PWC	https://paperswithcode.com/paper/automatic-post-editing-for-machine
Repo
Framework

Using Machine Learning to Assess Short Term Causal Dependence and Infer Network Links


Title	Using Machine Learning to Assess Short Term Causal Dependence and Infer Network Links
Authors	Amitava Banerjee, Jaideep Pathak, Rajarshi Roy, Juan G. Restrepo, Edward Ott
Abstract	We introduce and test a general machine-learning-based technique for the inference of short term causal dependence between state variables of an unknown dynamical system from time series measurements of its state variables. Our technique leverages the results of a machine learning process for short time prediction to achieve our goal. The basic idea is to use machine learning to estimate the elements of the Jacobian matrix of the dynamical flow along an orbit. The type of machine learning that we employ is reservoir computing. We present numerical tests on link inference of a network of interacting dynamical nodes. It is seen that dynamical noise can greatly enhance the effectiveness of our technique, while observational noise degrades the effectiveness. We believe that the competition between these two opposing types of noise will be the key factor determining the success of causal inference in many of the most important application situations.
Tasks	Causal Inference, Time Series
Published	2019-12-05
URL	https://arxiv.org/abs/1912.02721v1
PDF	https://arxiv.org/pdf/1912.02721v1.pdf
PWC	https://paperswithcode.com/paper/using-machine-learning-to-assess-short-term
Repo
Framework
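
The Jacobian idea in the abstract above can be illustrated with a minimal sketch. It assumes a short-term predictor is already available and estimates its Jacobian by central finite differences along an orbit, then reports a link j -> i when the averaged |J[i][j]| exceeds a threshold. The function `predictor` is a hypothetical hand-written stand-in for a trained model (the paper uses reservoir computing, which is not reproduced here), and the finite-difference scheme, the threshold, and all names are assumptions made for illustration.

```python
def predictor(state):
    # Hypothetical stand-in for a trained one-step predictor.
    # Node 0 drives node 1, and node 1 drives node 2.
    x0, x1, x2 = state
    return [0.9 * x0, 0.5 * x1 + 0.3 * x0 * x0, 0.7 * x2 + 0.2 * x1]


def jacobian(f, state, eps=1e-6):
    """Central finite-difference estimate of d f_i / d x_j at `state`."""
    n = len(state)
    jac = [[0.0] * n for _ in range(len(f(state)))]
    for j in range(n):
        plus, minus = list(state), list(state)
        plus[j] += eps
        minus[j] -= eps
        f_plus, f_minus = f(plus), f(minus)
        for i in range(len(jac)):
            jac[i][j] = (f_plus[i] - f_minus[i]) / (2 * eps)
    return jac


def infer_links(f, orbit, threshold=1e-3):
    """Return the set of (source, target) pairs whose orbit-averaged
    absolute Jacobian entry exceeds `threshold` (self-links excluded)."""
    n = len(orbit[0])
    score = [[0.0] * n for _ in range(n)]
    for state in orbit:
        jac = jacobian(f, state)
        for i in range(n):
            for j in range(n):
                score[i][j] += abs(jac[i][j]) / len(orbit)
    return {(j, i) for i in range(n) for j in range(n)
            if i != j and score[i][j] > threshold}


# Build a short orbit by iterating the stand-in predictor.
state = [1.0, 0.5, -0.3]
orbit = []
for _ in range(20):
    orbit.append(state)
    state = predictor(state)

print(sorted(infer_links(predictor, orbit)))
```

On this toy system the sketch should recover the two couplings built into `predictor`. With a learned model and noisy data the threshold would need to be chosen with care, which the sketch does not address.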

Two-sample Testing Using Deep Learning


Title	Two-sample Testing Using Deep Learning
Authors	Matthias Kirchler, Shahryar Khorasani, Marius Kloft, Christoph Lippert
Abstract	We propose a two-sample testing procedure based on learned deep neural network representations. To this end, we define two test statistics that perform an asymptotic location test on data samples mapped onto a hidden layer. The tests are consistent and asymptotically control the type-1 error rate. Their test statistics can be evaluated in linear time (in the sample size). Suitable data representations are obtained in a data-driven way, by solving a supervised or unsupervised transfer-learning task on an auxiliary (potentially distinct) data set. If no auxiliary data is available, we split the data into two chunks: one for learning representations and one for computing the test statistic. In experiments on audio samples, natural images and three-dimensional neuroimaging data our tests yield significant decreases in type-2 error rate (up to 35 percentage points) compared to state-of-the-art two-sample tests such as kernel-methods and classifier two-sample tests.
Tasks	Transfer Learning
Published	2019-10-14
URL	https://arxiv.org/abs/1910.06239v2
PDF	https://arxiv.org/pdf/1910.06239v2.pdf
PWC	https://paperswithcode.com/paper/two-sample-testing-using-deep-learning
Repo
Framework
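
As a rough illustration of a location test on mapped samples, the sketch below applies an asymptotic two-sided z-test to the means of a scalar representation. It is not either of the paper's two test statistics: `phi` is a hypothetical stand-in for a learned hidden-layer feature, the representation is one-dimensional for simplicity, and the normal approximation is the only asymptotic argument used. The statistic is computed in a single pass over each sample, so the cost is linear in the sample size.

```python
import math
import random


def phi(x):
    # Hypothetical stand-in for a learned hidden-layer representation.
    return math.tanh(x)


def location_test(xs, ys, feature=phi):
    """Two-sided asymptotic z-test for equal means of feature(x) and feature(y).

    Returns (z statistic, p-value under the normal approximation).
    """
    fx = [feature(x) for x in xs]
    fy = [feature(y) for y in ys]
    n, m = len(fx), len(fy)
    mean_x, mean_y = sum(fx) / n, sum(fy) / m
    var_x = sum((v - mean_x) ** 2 for v in fx) / (n - 1)
    var_y = sum((v - mean_y) ** 2 for v in fy) / (m - 1)
    z = (mean_x - mean_y) / math.sqrt(var_x / n + var_y / m)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return z, p_value


rng = random.Random(0)
baseline = [rng.gauss(0.0, 1.0) for _ in range(500)]
shifted = [rng.gauss(1.0, 1.0) for _ in range(500)]

z, p = location_test(baseline, shifted)
print(f"z = {z:.2f}, p = {p:.3g}")
```

When no auxiliary data is available, the abstract's data-splitting idea would correspond to fitting `phi` on one chunk and calling `location_test` on the other; the sketch leaves the representation fixed.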

Interpreting and Understanding Graph Convolutional Neural Network using Gradient-based Attribution Method


Title	Interpreting and Understanding Graph Convolutional Neural Network using Gradient-based Attribution Method
Authors	Shangsheng Xie, Mingming Lu
Abstract	To address the difficulty that convolutional neural networks (CNNs) have in processing non-grid relational data such as graphs, Kipf et al. proposed the graph convolutional neural network (GCN). The core idea of the GCN is to perform two-fold information fusion for each node in a given graph during each iteration: the fusion of graph structure information and the fusion of node feature dimensions. Because of its capacity for combinatorial generalization, the GCN has been widely used in scene semantic relationship analysis, natural language processing, few-shot learning, etc. However, because its two-fold information fusion involves mathematically irreversible calculations, it is hard to explain the reason behind the prediction for each node classification. Unfortunately, most existing attribution analysis methods concentrate on models such as CNNs, which process grid-like data, and it is difficult to apply those methods to the GCN directly. The reason is that, whereas CNN inputs are independent of one another, GCN inputs are correlated. As a result, existing attribution analysis methods can obtain only the partial model contribution from the central node features to the final decision of the GCN, and ignore the other model contribution from the central node features and its neighbor nodes' features to that decision. To this end, we propose a gradient attribution analysis method for the GCN called Node Attribution Method (NAM), which can get the model contribution from not only the central node but also its neighbor nodes to the GCN output. We also propose the Node Importance Visualization (NIV) method to visualize the central node and its neighbor nodes based on the value of the contribution…
Tasks	Few-Shot Learning, Node Classification
Published	2019-03-09
URL	http://arxiv.org/abs/1903.03768v2
PDF	http://arxiv.org/pdf/1903.03768v2.pdf
PWC	https://paperswithcode.com/paper/interpreting-and-understanding-graph
Repo
Framework

Context-Aware Monolingual Repair for Neural Machine Translation


Title	Context-Aware Monolingual Repair for Neural Machine Translation
Authors	Elena Voita, Rico Sennrich, Ivan Titov
Abstract	Modern sentence-level NMT systems often produce plausible translations of isolated sentences. However, when put in context, these translations may end up being inconsistent with each other. We propose a monolingual DocRepair model to correct inconsistencies between sentence-level translations. DocRepair performs automatic post-editing on a sequence of sentence-level translations, refining translations of sentences in the context of each other. For training, the DocRepair model requires only monolingual document-level data in the target language. It is trained as a monolingual sequence-to-sequence model that maps inconsistent groups of sentences into consistent ones. The consistent groups come from the original training data; the inconsistent groups are obtained by sampling round-trip translations for each isolated sentence. We show that this approach successfully imitates inconsistencies we aim to fix: using contrastive evaluation, we show large improvements in the translation of several contextual phenomena in an English-Russian translation task, as well as improvements in the BLEU score. We also conduct a human evaluation and show a strong preference of the annotators for corrected translations over the baseline ones. Moreover, we analyze which discourse phenomena are hard to capture using monolingual data only.
Tasks	Automatic Post-Editing, Machine Translation
Published	2019-09-03
URL	https://arxiv.org/abs/1909.01383v2
PDF	https://arxiv.org/pdf/1909.01383v2.pdf
PWC	https://paperswithcode.com/paper/context-aware-monolingual-repair-for-neural
Repo
Framework

A Consistent Independence Test for Multivariate Time-Series


Title	A Consistent Independence Test for Multivariate Time-Series
Authors	Ronak Mehta, Cencheng Shen, Ting Xu, Joshua T. Vogelstein
Abstract	A fundamental problem in statistical data analysis is testing whether two phenomena are related. When the phenomena in question are time series, many challenges emerge. The first is defining a dependence measure between time series at the population level, as well as a sample level test statistic. The second is computing or estimating the distribution of this test statistic under the null, as the permutation test procedure is invalid for most time series structures. This work aims to address these challenges by combining distance correlation and multiscale graph correlation (MGC) from independence testing literature and block permutation testing from time series analysis. Two hypothesis tests for testing the independence of time series are proposed. These procedures also characterize whether the dependence relationship between the series is linear or nonlinear, and the time lag at which this dependence is maximized. For strictly stationary auto-regressive moving average (ARMA) processes, the proposed independence tests are proven valid and consistent. Finally, neural connectivity in the brain is analyzed using fMRI data, revealing linear dependence of signals within the visual network and default mode network, and nonlinear relationships in other regions. This work opens up new theoretical and practical directions for many modern time series analysis problems.
Tasks	Time Series, Time Series Analysis
Published	2019-08-18
URL	https://arxiv.org/abs/1908.06486v2
PDF	https://arxiv.org/pdf/1908.06486v2.pdf
PWC	https://paperswithcode.com/paper/a-consistent-independence-test-for
Repo
Framework
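
The block permutation step mentioned in the abstract above can be sketched as follows. Ordinary permutation shuffles individual time points and destroys autocorrelation, so the null distribution is built instead by shuffling contiguous blocks of one series. This sketch deliberately swaps in the absolute Pearson correlation as the test statistic; it is not distance correlation or MGC, and the block length, repetition count, and function names are assumptions chosen for illustration.

```python
import random


def abs_corr(x, y):
    """Absolute Pearson correlation (a simple stand-in statistic)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy) / (sxx * syy) ** 0.5


def block_permute(series, block_len, rng):
    """Shuffle contiguous blocks, keeping the order inside each block."""
    blocks = [series[i:i + block_len] for i in range(0, len(series), block_len)]
    rng.shuffle(blocks)
    return [value for block in blocks for value in block]


def block_permutation_test(x, y, block_len=10, reps=200, seed=0):
    """Return (observed statistic, permutation p-value)."""
    rng = random.Random(seed)
    observed = abs_corr(x, y)
    exceed = sum(
        abs_corr(x, block_permute(y, block_len, rng)) >= observed
        for _ in range(reps)
    )
    return observed, (1 + exceed) / (1 + reps)


# Toy data: an AR(1) series and a second series that depends on it linearly.
rng = random.Random(1)
x = [0.0]
for _ in range(199):
    x.append(0.5 * x[-1] + rng.gauss(0.0, 1.0))
y = [2.0 * value + 1.0 for value in x]

stat, p_value = block_permutation_test(x, y)
print(f"statistic = {stat:.3f}, p-value = {p_value:.3f}")
```

A linear statistic like this one cannot detect purely nonlinear dependence or the optimal lag, which is where the statistics named in the abstract come in.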

Using a Pitch-Synchronous Residual Codebook for Hybrid HMM/Frame Selection Speech Synthesis


Title	Using a Pitch-Synchronous Residual Codebook for Hybrid HMM/Frame Selection Speech Synthesis
Authors	Thomas Drugman, Alexis Moinet, Thierry Dutoit, Geoffrey Wilfart
Abstract	This paper proposes a method to improve the quality delivered by statistical parametric speech synthesizers. For this, we use a codebook of pitch-synchronous residual frames, so as to construct a more realistic source signal. First a limited codebook of typical excitations is built from some training database. During the synthesis part, HMMs are used to generate filter and source coefficients. The latter coefficients contain both the pitch and a compact representation of target residual frames. The source signal is obtained by concatenating excitation frames picked up from the codebook, based on a selection criterion and taking target residual coefficients as input. Subjective results show a relevant improvement compared to the basic technique.
Tasks	Speech Synthesis
Published	2019-12-30
URL	https://arxiv.org/abs/1912.12887v1
PDF	https://arxiv.org/pdf/1912.12887v1.pdf
PWC	https://paperswithcode.com/paper/using-a-pitch-synchronous-residual-codebook
Repo
Framework
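
The frame selection step described in the abstract above might look roughly like the sketch below. The abstract does not state the selection criterion, so squared Euclidean distance between target coefficients and codebook coefficients is assumed here; the codebook contents, the two-coefficient representation, and the function names are all hypothetical. Pitch-dependent resampling of the selected frames is omitted.

```python
def select_frame(target, codebook):
    """Return the codebook entry whose coefficients are closest to `target`
    under squared Euclidean distance (an assumed criterion)."""
    return min(
        codebook,
        key=lambda entry: sum((a - b) ** 2 for a, b in zip(entry["coeffs"], target)),
    )


def build_excitation(targets, codebook):
    """Concatenate the selected residual frames into one source signal."""
    signal = []
    for target in targets:
        signal.extend(select_frame(target, codebook)["frame"])
    return signal


# Hypothetical codebook: compact coefficients paired with residual frames.
CODEBOOK = [
    {"coeffs": (0.0, 0.0), "frame": [0.0, 1.0, 0.0]},
    {"coeffs": (1.0, 0.5), "frame": [0.2, 0.8, -0.1]},
    {"coeffs": (-1.0, 0.3), "frame": [-0.3, 0.6, 0.1]},
]

# Target coefficients as they might be generated for two successive frames.
targets = [(0.9, 0.4), (0.1, -0.1)]
print(build_excitation(targets, CODEBOOK))
```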

Learning to Disentangle Robust and Vulnerable Features for Adversarial Detection


Title	Learning to Disentangle Robust and Vulnerable Features for Adversarial Detection
Authors	Byunggill Joe, Sung Ju Hwang, Insik Shin
Abstract	Although deep neural networks have shown promising performance on various tasks, even achieving human-level performance on some, they are shown to be susceptible to incorrect predictions even with imperceptibly small perturbations to an input. A large number of previous works have proposed to defend against such adversarial attacks either by robust inference or detection of adversarial inputs. Yet, most of them cannot effectively defend against whitebox attacks where an adversary has knowledge of the model and defense. More importantly, they do not provide a convincing reason why the generated adversarial inputs successfully fool the target models. To address these shortcomings of the existing approaches, we hypothesize that the adversarial inputs are tied to latent features that are susceptible to adversarial perturbation, which we call vulnerable features. Then based on this intuition, we propose a minimax game formulation to disentangle the latent features of each instance into robust and vulnerable ones, using variational autoencoders with two latent spaces. We thoroughly validate our model for both blackbox and whitebox attacks on MNIST, Fashion MNIST, and Cat & Dog datasets, whose results show that the adversarial inputs cannot bypass our detector without changing its semantics, in which case the attack has failed.
Tasks
Published	2019-09-10
URL	https://arxiv.org/abs/1909.04311v1
PDF	https://arxiv.org/pdf/1909.04311v1.pdf
PWC	https://paperswithcode.com/paper/learning-to-disentangle-robust-and-vulnerable
Repo
Framework

Optimal Immunization Policy Using Dynamic Programming


Title	Optimal Immunization Policy Using Dynamic Programming
Authors	Atiye Alaeddini, Daniel Klein
Abstract	Decisions in public health are almost always made in the context of uncertainty. Policy makers responsible for making important decisions are faced with the daunting task of choosing from many possible options. This task is called planning under uncertainty, and is particularly acute when addressing complex systems, such as issues of global health and development. Decision making under uncertainty is a challenging task, and all too often this uncertainty is averaged away to simplify results for policy makers. A popular way to approach this task is to formulate the problem at hand as a (partially observable) Markov decision process, (PO)MDP. This work aims to apply these AI efforts to challenging problems in health and development. In this paper, we develop a framework for optimal health policy design in a dynamic setting. We apply a stochastic dynamic programming approach to identify both the optimal time to change the health intervention policy and the optimal time to collect decision-relevant information.
Tasks	Decision Making, Decision Making Under Uncertainty
Published	2019-10-19
URL	https://arxiv.org/abs/1910.08677v1
PDF	https://arxiv.org/pdf/1910.08677v1.pdf
PWC	https://paperswithcode.com/paper/optimal-immunization-policy-using-dynamic
Repo
Framework
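
To make the stochastic dynamic programming idea in the abstract above concrete, here is a minimal backward-induction sketch for a toy, fully observed finite-horizon MDP. It covers only the question of when to switch an intervention; the information-collection decision and partial observability from the abstract are not modeled. The states, actions, transition probabilities, and costs are invented for illustration and carry no epidemiological meaning.

```python
STATES = ("low", "high")      # hypothetical disease prevalence levels
ACTIONS = ("keep", "switch")  # keep the current intervention or switch it

# Hypothetical transition probabilities P(next_state | state, action).
TRANSITION = {
    ("low", "keep"): {"low": 0.8, "high": 0.2},
    ("low", "switch"): {"low": 0.95, "high": 0.05},
    ("high", "keep"): {"low": 0.1, "high": 0.9},
    ("high", "switch"): {"low": 0.6, "high": 0.4},
}

# Hypothetical per-step costs (disease burden plus intervention cost).
COST = {
    ("low", "keep"): 1.0,
    ("low", "switch"): 4.0,
    ("high", "keep"): 10.0,
    ("high", "switch"): 13.0,
}


def backward_induction(horizon):
    """Return (value at time 0, policy), where policy[t][state] is the
    cost-minimizing action at time step t."""
    value = {s: 0.0 for s in STATES}  # terminal value is zero
    policy = []
    for _ in range(horizon):
        new_value, step_policy = {}, {}
        for s in STATES:
            best_action, best_cost = None, float("inf")
            for a in ACTIONS:
                expected = COST[(s, a)] + sum(
                    p * value[nxt] for nxt, p in TRANSITION[(s, a)].items()
                )
                if expected < best_cost:
                    best_action, best_cost = a, expected
            new_value[s], step_policy[s] = best_cost, best_action
        value = new_value
        policy.append(step_policy)
    policy.reverse()  # computed last-to-first, so put time 0 first
    return value, policy


value, policy = backward_induction(horizon=2)
for t, rule in enumerate(policy):
    print(t, rule)
```

With these made-up numbers, switching pays off in the high-prevalence state only while enough time remains for the benefit to accrue, which is the kind of timing question the abstract raises.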

The Transference Architecture for Automatic Post-Editing


Title	The Transference Architecture for Automatic Post-Editing
Authors	Santanu Pal, Hongfei Xu, Nico Herbig, Sudip Kumar Naskar, Antonio Krueger, Josef van Genabith
Abstract	In automatic post-editing (APE) it makes sense to condition post-editing (pe) decisions on both the source (src) and the machine translated text (mt) as input. This has led to multi-source encoder based APE approaches. A research challenge now is the search for architectures that best support the capture, preparation and provision of src and mt information and its integration with pe decisions. In this paper we present a new multi-source APE model, called transference. Unlike previous approaches, it (i) uses a transformer encoder block for src, (ii) followed by a decoder block, but without masking for self-attention on mt, which effectively acts as a second encoder combining src -> mt, and (iii) feeds this representation into a final decoder block generating pe. Our model outperforms the state-of-the-art by 1 BLEU point on the WMT 2016, 2017, and 2018 English–German APE shared tasks (PBSMT and NMT). We further investigate the importance of our newly introduced second encoder and find that too few layers hurts performance, while reducing the number of layers of the decoder does not matter much.
Tasks	Automatic Post-Editing
Published	2019-08-16
URL	https://arxiv.org/abs/1908.06151v2
PDF	https://arxiv.org/pdf/1908.06151v2.pdf
PWC	https://paperswithcode.com/paper/the-transference-architecture-for-automatic
Repo
Framework

Interpreting Predictive Process Monitoring Benchmarks


Title	Interpreting Predictive Process Monitoring Benchmarks
Authors	Renuka Sindhgatta, Chun Ouyang, Catarina Moreira, Yi Liao
Abstract	Predictive process analytics aims to predict the future behaviour of an ongoing business process instance such as the next event, the remaining time, or outcome of the running instance. Machine learning models can be trained on event log data recording historical process execution to build predictive models for predicting the future behaviour of process execution. Multiple techniques have been proposed so far which encode the information available in an event log and construct input features required to train a predictive model. While accuracy has been a dominant criterion in the choice of various techniques, we derive explanations using interpretable machine learning techniques to compare and contrast the suitability of multiple predictive models of high accuracy. The explanations allow us to gain an understanding of the underlying reasons for a prediction and highlight scenarios where accuracy alone may not be sufficient in assessing the suitability of techniques used to encode event log data to features used by a predictive model. Findings from this exploratory study motivate the need to incorporate interpretability in predictive process analytics.
Tasks	Interpretable Machine Learning
Published	2019-12-22
URL	https://arxiv.org/abs/1912.10558v2
PDF	https://arxiv.org/pdf/1912.10558v2.pdf
PWC	https://paperswithcode.com/paper/interpreting-predictive-process-monitoring
Repo
Framework

Machine Learning Based Channel Estimation: A Computational Approach for Universal Channel Conditions


Title	Machine Learning Based Channel Estimation: A Computational Approach for Universal Channel Conditions
Authors	Kai Mei, Jun Liu, Xiaochen Zhang, Jibo Wei
Abstract	Recently, machine learning has been introduced in communications to deal with channel estimation. Under non-linear system models, the superiority of machine learning based estimation has been demonstrated by simulation experiments, but the theoretical analysis is not sufficient, since the performance of machine learning, especially deep learning, is hard to analyze. This paper focuses on some theoretical problems in machine learning based channel estimation. As a data-driven method, a certain amount of training data is the prerequisite of a workable machine learning based estimation, and it is analyzed qualitatively from a statistical view in this paper. To deduce the exact sample size, we build a statistical model ignoring the exact structure of the learning module, and then the relationship between sample size and learning performance is derived. To verify our analysis, we employ machine learning based channel estimation in an OFDM system and apply two typical neural networks as the learning module: a single-layer (linear) structure and a three-layer structure. The simulation results show that the analyzed sample size is correct when the input dimension and the complexity of the learning module are low, but the true required sample size is larger than the analysis result otherwise, since the influence of these two factors is not considered in the analysis of sample size. Also, we simulate the performance of machine learning based channel estimation under quasi-stationary channel conditions, where the explicit form of MMSE estimation is hard to obtain, and the simulation results exhibit the effectiveness and convenience of machine learning based channel estimation under complex channel models.
Tasks
Published	2019-11-10
URL	https://arxiv.org/abs/1911.03886v1
PDF	https://arxiv.org/pdf/1911.03886v1.pdf
PWC	https://paperswithcode.com/paper/machine-learning-based-channel-estimation-a
Repo
Framework

Simultaneous Implementation Features Extraction and Recognition Using C3D Network for WiFi-based Human Activity Recognition


Title	Simultaneous Implementation Features Extraction and Recognition Using C3D Network for WiFi-based Human Activity Recognition
Authors	Liu Yafeng, Chen Tian, Liu Zhongyu, Zhang Lei, Hu Yanjun, Ding Enjie
Abstract	Human action recognition has attracted increasing attention. Many technologies have been developed to represent the features of human actions, such as image-based, skeleton-based, and channel state information (CSI) based approaches. Among them, CSI is easy to deploy and does not depend on lighting, so it has gained increasing attention in some special scenarios. However, the relationship between CSI signals and human actions is very complex, and some preliminary work must be done to make CSI features understandable to a computer. Currently, many works split the processing of CSI-based action features into two parts: one for feature extraction and dimension reduction, and the other for handling the time series. Some of them even omit one of the two parts. As a result, the accuracy of current recognition systems is far from satisfactory. In this paper, we propose a new deep learning based approach, i.e. a C3D network and a C3D network with an attention mechanism, for human action recognition using CSI signals. This kind of network performs feature extraction through spatial convolution and temporal convolution simultaneously, so the two parts of CSI-based human action recognition mentioned above can be realized at the same time, and the entire algorithm structure is simplified. The experimental results show that our proposed C3D network achieves the best recognition performance for all activities when compared with some benchmark approaches.
Tasks	Activity Recognition, Human Activity Recognition, Time Series
Published	2019-11-21
URL	https://arxiv.org/abs/1911.09325v1
PDF	https://arxiv.org/pdf/1911.09325v1.pdf
PWC	https://paperswithcode.com/paper/simultaneous-implementation-features
Repo
Framework

Opinion aspect extraction in Dutch childrens diary entries


Title	Opinion aspect extraction in Dutch childrens diary entries
Authors	Hella Haanstra, Maaike H. T. de Boer
Abstract	Aspect extraction can be used in dialogue systems to understand the topic of opinionated text. Expressing an empathetic reaction to an opinion can strengthen the bond between a human and, for example, a robot. The aim of this study is three-fold: (1) create a new annotated dataset for both aspect extraction and opinion words for Dutch children's language, (2) acquire aspect extraction results for this task, and (3) improve current results for aspect extraction in Dutch reviews. This was done by training a deep learning Gated Recurrent Unit (GRU) model, originally developed for an English review dataset, on Dutch restaurant review data to classify both opinion words and their respective aspects. We obtained state-of-the-art performance on the Dutch restaurant review dataset. Additionally, we acquired aspect extraction results for the Dutch children's dataset. Since the model was trained on standardised language, these results are quite promising.
Tasks	Aspect Extraction
Published	2019-10-21
URL	https://arxiv.org/abs/1910.10502v1
PDF	https://arxiv.org/pdf/1910.10502v1.pdf
PWC	https://paperswithcode.com/paper/opinion-aspect-extraction-in-dutch-childrens
Repo
Framework