February 1, 2020

Paper Group AWR 211

Side Window Filtering


Title	Side Window Filtering
Authors	Hui Yin, Yuanhao Gong, Guoping Qiu
Abstract	Local windows are routinely used in computer vision and almost without exception the center of the window is aligned with the pixels being processed. We show that this conventional wisdom is not universally applicable. When a pixel is on an edge, placing the center of the window on the pixel is one of the fundamental reasons that cause many filtering algorithms to blur the edges. Based on this insight, we propose a new Side Window Filtering (SWF) technique which aligns the window’s side or corner with the pixel being processed. The SWF technique is surprisingly simple yet theoretically rooted and very effective in practice. We show that many traditional linear and nonlinear filters can be easily implemented under the SWF framework. Extensive analysis and experiments show that implementing the SWF principle can significantly improve their edge preserving capabilities and achieve state of the art performances in applications such as image smoothing, denoising, enhancement, structure-preserving texture-removing, mutual-structure extraction, and HDR tone mapping. In addition to image filtering, we further show that the SWF principle can be extended to other applications involving the use of a local window. Using colorization by optimization as an example, we demonstrate that implementing the SWF principle can effectively prevent artifacts such as color leakage associated with the conventional implementation. Given the ubiquity of window based operations in computer vision, the new SWF technique is likely to benefit many more applications.
Tasks	Colorization, Denoising
Published	2019-05-17
URL	https://arxiv.org/abs/1905.07177v1
PDF	https://arxiv.org/pdf/1905.07177v1.pdf
PWC	https://paperswithcode.com/paper/side-window-filtering
Repo	https://github.com/wang-kangkang/SideWindowFilter-pytorch
Framework	pytorch
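
The abstract's core idea can be illustrated with a minimal one-dimensional sketch; this is not the authors' implementation. It assumes a box (mean) filter, uses only a left and a right side window in place of the 2D side and corner windows, and picks the candidate closest to the input value (an assumed selection rule). All function names are hypothetical.

```python
def box_mean(signal, lo, hi):
    """Mean of signal[lo..hi], with the window clipped to the signal bounds."""
    lo = max(lo, 0)
    hi = min(hi, len(signal) - 1)
    window = signal[lo:hi + 1]
    return sum(window) / len(window)


def center_box_filter(signal, radius):
    """Conventional filter: the window is centered on the sample."""
    return [box_mean(signal, i - radius, i + radius) for i in range(len(signal))]


def side_window_box_filter(signal, radius):
    """Side-window variant: the sample sits at the edge of each window.

    Assumption: among the side-window outputs, keep the one closest
    to the original sample value.
    """
    out = []
    for i, value in enumerate(signal):
        candidates = [
            box_mean(signal, i - radius, i),  # left side window
            box_mean(signal, i, i + radius),  # right side window
        ]
        out.append(min(candidates, key=lambda c: abs(c - value)))
    return out


# A clean step edge: the centered filter blurs it, the side-window one does not.
step = [0.0] * 5 + [10.0] * 5
centered = center_box_filter(step, 2)
side = side_window_box_filter(step, 2)
```

On this step signal the centered filter mixes values from both sides of the edge, while each sample in the side-window version can draw on a window that lies entirely on its own side.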

On Binscatter


Title	On Binscatter
Authors	Matias D. Cattaneo, Richard K. Crump, Max H. Farrell, Yingjie Feng
Abstract	Binscatter is very popular in applied microeconomics. It provides a flexible, yet parsimonious way of visualizing and summarizing large data sets in regression settings, and it is often used for informal evaluation of substantive hypotheses such as linearity or monotonicity of the regression function. This paper presents a foundational, thorough analysis of binscatter: we give an array of theoretical and practical results that aid both in understanding current practices (i.e., their validity or lack thereof) and in offering theory-based guidance for future applications. Our main results include principled number of bins selection, confidence intervals and bands, hypothesis tests for parametric and shape restrictions of the regression function, and several other new methods, applicable to canonical binscatter as well as higher-order polynomial, covariate-adjusted and smoothness-restricted extensions thereof. In particular, we highlight important methodological problems related to covariate adjustment methods used in current practice. We also discuss extensions to clustered data. Our results are illustrated with simulated and real data throughout. Companion general-purpose software packages for Stata and R are provided. Finally, from a technical perspective, new theoretical results for partitioning-based series estimation are obtained that may be of independent interest.
Tasks
Published	2019-02-25
URL	http://arxiv.org/abs/1902.09608v1
PDF	http://arxiv.org/pdf/1902.09608v1.pdf
PWC	https://paperswithcode.com/paper/on-binscatter
Repo	https://github.com/jiafengkevinchen/binscatterplot
Framework	none
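
For readers unfamiliar with the plot itself, the following is a minimal sketch of a canonical binscatter: sort the data by x, split it into equal-count bins, and report the mean of x and y within each bin. The number of bins is picked by hand here; the principled bin selection, confidence bands, and covariate adjustment discussed in the abstract are not implemented, and the function name is hypothetical.

```python
def binscatter_points(x, y, n_bins):
    """Return one (mean_x, mean_y) point per equal-count bin of x."""
    pairs = sorted(zip(x, y))  # order observations by x
    n = len(pairs)
    points = []
    for b in range(n_bins):
        lo = b * n // n_bins
        hi = (b + 1) * n // n_bins
        chunk = pairs[lo:hi]
        if not chunk:  # can happen when n_bins > n
            continue
        mean_x = sum(p[0] for p in chunk) / len(chunk)
        mean_y = sum(p[1] for p in chunk) / len(chunk)
        points.append((mean_x, mean_y))
    return points


# Toy data on an exact line y = 2x + 1, summarized with 4 bins.
x = list(range(12))
y = [2 * v + 1 for v in x]
points = binscatter_points(x, y, 4)
```

Because the toy data are exactly linear, the binned means fall on the same line, which is the kind of informal linearity check the abstract refers to.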

Rethinking Text Attribute Transfer: A Lexical Analysis


Title	Rethinking Text Attribute Transfer: A Lexical Analysis
Authors	Yao Fu, Hao Zhou, Jiaze Chen, Lei Li
Abstract	Text attribute transfer is modifying certain linguistic attributes (e.g. sentiment, style, authorship, etc.) of a sentence and transforming them from one type to another. In this paper, we aim to analyze and interpret what is changed during the transfer process. We start from the observation that in many existing models and datasets, certain words within a sentence play important roles in determining the sentence attribute class. These words are referred to as the Pivot Words. Based on these pivot words, we propose a lexical analysis framework, the Pivot Analysis, to quantitatively analyze the effects of these words in text attribute classification and transfer. We apply this framework to existing datasets and models and show that: (1) the pivot words are strong features for the classification of sentence attributes; (2) to change the attribute of a sentence, many datasets only require changing certain pivot words; (3) consequently, many transfer models only perform lexical-level modification, while leaving higher-level sentence structures unchanged. Our work provides an in-depth understanding of linguistic attribute transfer and further identifies the future requirements and challenges of this task. Our code can be found at https://github.com/FranxYao/pivot_analysis.
Tasks	Lexical Analysis, Text Attribute Transfer
Published	2019-09-26
URL	https://arxiv.org/abs/1909.12335v1
PDF	https://arxiv.org/pdf/1909.12335v1.pdf
PWC	https://paperswithcode.com/paper/rethinking-text-attribute-transfer-a-lexical
Repo	https://github.com/FranxYao/pivot_analysis
Framework	none
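
The notion of a pivot word can be illustrated with a small sketch. The scoring rule below is an assumption made for illustration, not the paper's definition: a word counts as a pivot if it appears in at least `min_count` sentences and at least `min_precision` of those sentences share one label. The function name, thresholds, and toy data are all hypothetical.

```python
from collections import Counter, defaultdict


def pivot_words(sentences, labels, min_count=2, min_precision=0.9):
    """Map each candidate pivot word to the label it most strongly indicates."""
    counts = defaultdict(Counter)
    for sentence, label in zip(sentences, labels):
        # Count each word once per sentence (document frequency per label).
        for word in set(sentence.lower().split()):
            counts[word][label] += 1

    pivots = {}
    for word, per_label in counts.items():
        total = sum(per_label.values())
        label, top = per_label.most_common(1)[0]
        if total >= min_count and top / total >= min_precision:
            pivots[word] = label
    return pivots


sentences = [
    "the food was great",
    "great service and great staff",
    "the food was terrible",
    "terrible service",
]
labels = ["pos", "pos", "neg", "neg"]
pivots = pivot_words(sentences, labels)
```

In this toy corpus only "great" and "terrible" pass both thresholds, which mirrors the abstract's observation that a few words can determine the attribute class.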

OpenSpiel: A Framework for Reinforcement Learning in Games


Title	OpenSpiel: A Framework for Reinforcement Learning in Games
Authors	Marc Lanctot, Edward Lockhart, Jean-Baptiste Lespiau, Vinicius Zambaldi, Satyaki Upadhyay, Julien Pérolat, Sriram Srinivasan, Finbarr Timbers, Karl Tuyls, Shayegan Omidshafiei, Daniel Hennes, Dustin Morrill, Paul Muller, Timo Ewalds, Ryan Faulkner, János Kramár, Bart De Vylder, Brennan Saeta, James Bradbury, David Ding, Sebastian Borgeaud, Matthew Lai, Julian Schrittwieser, Thomas Anthony, Edward Hughes, Ivo Danihelka, Jonah Ryan-Davis
Abstract	OpenSpiel is a collection of environments and algorithms for research in general reinforcement learning and search/planning in games. OpenSpiel supports n-player (single- and multi- agent) zero-sum, cooperative and general-sum, one-shot and sequential, strictly turn-taking and simultaneous-move, perfect and imperfect information games, as well as traditional multiagent environments such as (partially- and fully- observable) grid worlds and social dilemmas. OpenSpiel also includes tools to analyze learning dynamics and other common evaluation metrics. This document serves both as an overview of the code base and an introduction to the terminology, core concepts, and algorithms across the fields of reinforcement learning, computational game theory, and search.
Tasks
Published	2019-08-26
URL	https://arxiv.org/abs/1908.09453v5
PDF	https://arxiv.org/pdf/1908.09453v5.pdf
PWC	https://paperswithcode.com/paper/openspiel-a-framework-for-reinforcement
Repo	https://github.com/deepmind/open_spiel
Framework	none

Certifiable Robustness and Robust Training for Graph Convolutional Networks


Title	Certifiable Robustness and Robust Training for Graph Convolutional Networks
Authors	Daniel Zügner, Stephan Günnemann
Abstract	Recent works show that Graph Neural Networks (GNNs) are highly non-robust with respect to adversarial attacks on both the graph structure and the node attributes, making their outcomes unreliable. We propose the first method for certifiable (non-)robustness of graph convolutional networks with respect to perturbations of the node attributes. We consider the case of binary node attributes (e.g. bag-of-words) and perturbations that are L_0-bounded. If a node has been certified with our method, it is guaranteed to be robust under any possible perturbation given the attack model. Likewise, we can certify non-robustness. Finally, we propose a robust semi-supervised training procedure that treats the labeled and unlabeled nodes jointly. As shown in our experimental evaluation, our method significantly improves the robustness of the GNN with only minimal effect on the predictive accuracy.
Tasks	Node Classification
Published	2019-06-28
URL	https://arxiv.org/abs/1906.12269v1
PDF	https://arxiv.org/pdf/1906.12269v1.pdf
PWC	https://paperswithcode.com/paper/certifiable-robustness-and-robust-training
Repo	https://github.com/danielzuegner/robust-gcn
Framework	pytorch

Rethinking Irregular Scene Text Recognition


Title	Rethinking Irregular Scene Text Recognition
Authors	Shangbang Long, Yushuo Guan, Bingxuan Wang, Kaigui Bian, Cong Yao
Abstract	Reading text from natural images is challenging due to the great variety in text font, color, size, complex backgrounds, etc. The perspective distortion and non-linear spatial arrangement of characters make it even more difficult. While the rectification-based method is intuitively grounded and has pushed the envelope, its potential is far from being fully exploited. In this paper, we present a bag of tricks that prove to significantly improve the performance of the rectification-based method. On curved text datasets, our method achieves an accuracy of 89.6% on CUTE-80 and 76.3% on Total-Text, an improvement over previous state-of-the-art by 6.3% and 14.7% respectively. Furthermore, our combination of tricks helps us win the ICDAR 2019 Arbitrary-Shaped Text Challenge (Latin script), achieving an accuracy of 74.3% on the held-out test set. We release our code as well as data samples for further exploration at https://github.com/Jyouhou/ICDAR2019-ArT-Recognition-Alchemy
Tasks	Scene Text Detection
Published	2019-08-30
URL	https://arxiv.org/abs/1908.11834v2
PDF	https://arxiv.org/pdf/1908.11834v2.pdf
PWC	https://paperswithcode.com/paper/alchemy-techniques-for-rectification-based
Repo	https://github.com/Jyouhou/ICDAR2019-ArT-Recognition-Alchemy
Framework	pytorch

Patch augmentation: Towards efficient decision boundaries for neural networks


Title	Patch augmentation: Towards efficient decision boundaries for neural networks
Authors	Marcus D. Bloice, Peter M. Roth, Andreas Holzinger
Abstract	In this paper we propose a new augmentation technique, called patch augmentation, that, in our experiments, improves model accuracy and makes networks more robust to adversarial attacks. In brief, this data-independent approach creates new image data based on image/label pairs, where a patch from one of the two images in the pair is superimposed onto the other image, creating a new augmented sample. The new image’s label is a linear combination of the image pair’s corresponding labels. Initial experiments show a several percentage point increase in accuracy on CIFAR-10, from a baseline of approximately 81% to 89%. CIFAR-100 sees larger improvements still, from a baseline of 52% to 68% accuracy. Networks trained using patch augmentation are also more robust to adversarial attacks, which we demonstrate using the Fast Gradient Sign Method.
Tasks	Adversarial Attack
Published	2019-11-08
URL	https://arxiv.org/abs/1911.07922v2
PDF	https://arxiv.org/pdf/1911.07922v2.pdf
PWC	https://paperswithcode.com/paper/patch-augmentation-towards-efficient-decision
Repo	https://github.com/mdbloice/Patch-Augmentation
Framework	none
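
The augmentation step described in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: images are single-channel nested lists of equal size, the patch is copied from the same position in the second image, and the label weights are taken to be the area fractions of the two images in the result. The abstract only says the label is a linear combination, so the weighting is an assumption, and the function name is hypothetical.

```python
def patch_augment(image_a, label_a, image_b, label_b, top, left, height, width):
    """Paste a patch of image_b onto a copy of image_a and mix the labels.

    Assumptions: both images have the same shape, the patch lies fully
    inside the image, and labels are mixed by patch area fraction.
    """
    rows, cols = len(image_a), len(image_a[0])
    mixed = [row[:] for row in image_a]  # copy so image_a is not modified
    for r in range(top, top + height):
        for c in range(left, left + width):
            mixed[r][c] = image_b[r][c]

    frac = (height * width) / (rows * cols)  # share of pixels taken from image_b
    label = [(1 - frac) * a + frac * b for a, b in zip(label_a, label_b)]
    return mixed, label


# Two 4x4 toy images with one-hot labels; paste a 2x2 patch at row 1, column 1.
image_a = [[0] * 4 for _ in range(4)]
image_b = [[1] * 4 for _ in range(4)]
mixed, label = patch_augment(image_a, [1.0, 0.0], image_b, [0.0, 1.0], 1, 1, 2, 2)
```

With a 2x2 patch on a 4x4 image, a quarter of the pixels come from the second image, so the mixed label gives that class a weight of 0.25.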

Capsule Neural Networks for Graph Classification using Explicit Tensorial Graph Representations


Title	Capsule Neural Networks for Graph Classification using Explicit Tensorial Graph Representations
Authors	Marcelo Daniel Gutierrez Mallea, Peter Meltzer, Peter J Bentley
Abstract	Graph classification is a significant problem in many scientific domains. It addresses tasks such as the classification of proteins and chemical compounds into categories according to their functions, or chemical and structural properties. In a supervised setting, this problem can be framed as learning the structure, features and relationships between features within a set of labelled graphs and being able to correctly predict the labels or categories of unseen graphs. A significant difficulty in this task arises when attempting to apply established classification algorithms due to the requirement for fixed size matrix or tensor representations of the graphs which may vary greatly in their numbers of nodes and edges. Building on prior work combining explicit tensor representations with a standard image-based classifier, we propose a model to perform graph classification by extracting fixed size tensorial information from each graph in a given set, and using a Capsule Network to perform classification. The graphs we consider here are undirected and with categorical features on the nodes. Using standard benchmarking chemical and protein datasets, we demonstrate that our graph Capsule Network classification model using an explicit tensorial representation of the graphs is competitive with current state of the art graph kernels and graph neural network models despite only limited hyper-parameter searching.
Tasks	Graph Classification
Published	2019-02-22
URL	http://arxiv.org/abs/1902.08399v1
PDF	http://arxiv.org/pdf/1902.08399v1.pdf
PWC	https://paperswithcode.com/paper/capsule-neural-networks-for-graph
Repo	https://github.com/BraintreeLtd/PatchyCapsules
Framework	none

Efficient and Accurate Arbitrary-Shaped Text Detection with Pixel Aggregation Network


Title	Efficient and Accurate Arbitrary-Shaped Text Detection with Pixel Aggregation Network
Authors	Wenhai Wang, Enze Xie, Xiaoge Song, Yuhang Zang, Wenjia Wang, Tong Lu, Gang Yu, Chunhua Shen
Abstract	Scene text detection, an important step of scene text reading systems, has witnessed rapid development with convolutional neural networks. Nonetheless, two main challenges still exist and hamper its deployment to real-world applications. The first problem is the trade-off between speed and accuracy. The second one is to model the arbitrary-shaped text instance. Recently, some methods have been proposed to tackle arbitrary-shaped text detection, but they rarely take the speed of the entire pipeline into consideration, which may fall short in practical applications. In this paper, we propose an efficient and accurate arbitrary-shaped text detector, termed Pixel Aggregation Network (PAN), which is equipped with a low computational-cost segmentation head and a learnable post-processing. More specifically, the segmentation head is made up of Feature Pyramid Enhancement Module (FPEM) and Feature Fusion Module (FFM). FPEM is a cascadable U-shaped module, which can introduce multi-level information to guide better segmentation. FFM can gather the features given by the FPEMs of different depths into a final feature for segmentation. The learnable post-processing is implemented by Pixel Aggregation (PA), which can precisely aggregate text pixels by predicted similarity vectors. Experiments on several standard benchmarks validate the superiority of the proposed PAN. It is worth noting that our method can achieve a competitive F-measure of 79.9% at 84.2 FPS on CTW1500.
Tasks	Scene Text Detection
Published	2019-08-16
URL	https://arxiv.org/abs/1908.05900v1
PDF	https://arxiv.org/pdf/1908.05900v1.pdf
PWC	https://paperswithcode.com/paper/efficient-and-accurate-arbitrary-shaped-text
Repo	https://github.com/insightcs/PAN-Pytorch
Framework	pytorch

Square Attack: a query-efficient black-box adversarial attack via random search


Title	Square Attack: a query-efficient black-box adversarial attack via random search
Authors	Maksym Andriushchenko, Francesco Croce, Nicolas Flammarion, Matthias Hein
Abstract	We propose the Square Attack, a score-based black-box $l_2$- and $l_\infty$-adversarial attack that does not rely on local gradient information and thus is not affected by gradient masking. Square Attack is based on a randomized search scheme which selects localized square-shaped updates at random positions so that at each iteration the perturbation is situated approximately at the boundary of the feasible set. Our method is significantly more query efficient and achieves a higher success rate compared to the state-of-the-art methods, especially in the untargeted setting. In particular, on ImageNet we improve the average query efficiency in the untargeted setting for various deep networks by a factor of at least $1.8$ and up to $3$ compared to the recent state-of-the-art $l_\infty$-attack of Al-Dujaili & O’Reilly. Moreover, although our attack is black-box, it can also outperform gradient-based white-box attacks on the standard benchmarks achieving a new state-of-the-art in terms of the success rate. The code of our attack is available at https://github.com/max-andr/square-attack.
Tasks	Adversarial Attack
Published	2019-11-29
URL	https://arxiv.org/abs/1912.00049v2
PDF	https://arxiv.org/pdf/1912.00049v2.pdf
PWC	https://paperswithcode.com/paper/square-attack-a-query-efficient-black-box
Repo	https://github.com/max-andr/square-attack
Framework	tf
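
The random-search scheme in the abstract can be sketched for the l_inf case as below. This is a simplified illustration, not the released attack: it works on a single-channel image stored as nested lists, uses a fixed square size, initializes every pixel with a random sign of eps, and minimizes a user-supplied loss that stands in for the model's score. The function names and the toy loss are hypothetical.

```python
import random


def square_attack_linf(loss, image, eps, square_size, n_queries, seed=0):
    """Score-based random search with square-shaped +-eps updates.

    `loss` is a black-box function of the image that the attacker wants
    to minimize; only its value is queried, never its gradient.
    """
    rng = random.Random(seed)
    rows, cols = len(image), len(image[0])

    def perturbed(delta):
        # Add the perturbation and clip to the valid pixel range [0, 1].
        return [[min(1.0, max(0.0, image[r][c] + delta[r][c]))
                 for c in range(cols)] for r in range(rows)]

    # Start on the boundary of the l_inf ball: every entry is +eps or -eps.
    delta = [[rng.choice((-eps, eps)) for _ in range(cols)] for _ in range(rows)]
    best = loss(perturbed(delta))

    for _ in range(n_queries):
        top = rng.randrange(rows - square_size + 1)
        left = rng.randrange(cols - square_size + 1)
        value = rng.choice((-eps, eps))
        candidate = [row[:] for row in delta]
        for r in range(top, top + square_size):
            for c in range(left, left + square_size):
                candidate[r][c] = value
        candidate_loss = loss(perturbed(candidate))
        if candidate_loss < best:  # keep the square only if the loss improves
            delta, best = candidate, candidate_loss

    return perturbed(delta), best


def toy_loss(img):
    """Stand-in for a model score: the sum of all pixel values."""
    return sum(sum(row) for row in img)


image = [[0.5] * 6 for _ in range(6)]
adv, best = square_attack_linf(toy_loss, image, eps=0.1, square_size=2, n_queries=200)
```

Every accepted update keeps each perturbation entry at +eps or -eps, so the iterate stays at the boundary of the feasible set, as the abstract describes.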

Intermittent Learning: On-Device Machine Learning on Intermittently Powered System


Title	Intermittent Learning: On-Device Machine Learning on Intermittently Powered System
Authors	Seulki Lee, Bashima Islam, Yubo Luo, Shahriar Nirjon
Abstract	This paper introduces intermittent learning, the goal of which is to enable energy-harvesting computing platforms to execute certain classes of machine learning tasks effectively and efficiently. We identify unique challenges to intermittent learning relating to the data and application semantics of machine learning tasks, and to address these challenges, we devise 1) an algorithm that determines a sequence of actions to achieve the desired learning objective under tight energy constraints, and 2) three heuristics that help an intermittent learner decide whether to learn or discard training examples at run-time, which increases the energy efficiency of the system. We implement and evaluate three intermittent learning applications that learn 1) air quality, 2) human presence, and 3) vibration using solar, RF, and kinetic energy harvesters, respectively. We demonstrate that the proposed framework improves the energy efficiency of a learner by up to 100% and cuts down the number of learning examples by up to 50% when compared to state-of-the-art intermittent computing systems that do not implement the proposed intermittent learning framework.
Tasks
Published	2019-04-21
URL	https://arxiv.org/abs/1904.09644v2
PDF	https://arxiv.org/pdf/1904.09644v2.pdf
PWC	https://paperswithcode.com/paper/intermittent-learning-on-device-machine
Repo	https://github.com/learning1234embed/Intermittent-Learning
Framework	none

Facebook FAIR’s WMT19 News Translation Task Submission


Title	Facebook FAIR’s WMT19 News Translation Task Submission
Authors	Nathan Ng, Kyra Yee, Alexei Baevski, Myle Ott, Michael Auli, Sergey Edunov
Abstract	This paper describes Facebook FAIR’s submission to the WMT19 shared news translation task. We participate in two language pairs and four language directions, English <-> German and English <-> Russian. Following our submission from last year, our baseline systems are large BPE-based transformer models trained with the Fairseq sequence modeling toolkit which rely on sampled back-translations. This year we experiment with different bitext data filtering schemes, as well as with adding filtered back-translated data. We also ensemble and fine-tune our models on domain-specific data, then decode using noisy channel model reranking. Our submissions are ranked first in all four directions of the human evaluation campaign. On En->De, our system significantly outperforms other systems as well as human translations. This system improves upon our WMT’18 submission by 4.5 BLEU points.
Tasks	Machine Translation
Published	2019-07-15
URL	https://arxiv.org/abs/1907.06616v1
PDF	https://arxiv.org/pdf/1907.06616v1.pdf
PWC	https://paperswithcode.com/paper/facebook-fairs-wmt19-news-translation-task
Repo	https://github.com/pytorch/fairseq/tree/master/examples/wmt19
Framework	pytorch

Effective Cross-lingual Transfer of Neural Machine Translation Models without Shared Vocabularies


Title	Effective Cross-lingual Transfer of Neural Machine Translation Models without Shared Vocabularies
Authors	Yunsu Kim, Yingbo Gao, Hermann Ney
Abstract	Transfer learning or multilingual modeling is essential for low-resource neural machine translation (NMT), but its applicability is limited to cognate languages that share vocabularies. This paper shows effective techniques to transfer a pre-trained NMT model to a new, unrelated language without shared vocabularies. We relieve the vocabulary mismatch by using cross-lingual word embedding, train a more language-agnostic encoder by injecting artificial noises, and generate synthetic data easily from the pre-training data without back-translation. Our methods do not require restructuring the vocabulary or retraining the model. We improve plain NMT transfer by up to +5.1% BLEU in five low-resource translation tasks, outperforming multilingual joint training by a large margin. We also provide extensive ablation studies on pre-trained embedding, synthetic data, vocabulary size, and parameter freezing for a better understanding of NMT transfer.
Tasks	Cross-Lingual Transfer, Low-Resource Neural Machine Translation, Machine Translation, Transfer Learning
Published	2019-05-14
URL	https://arxiv.org/abs/1905.05475v2
PDF	https://arxiv.org/pdf/1905.05475v2.pdf
PWC	https://paperswithcode.com/paper/effective-cross-lingual-transfer-of-neural
Repo	https://github.com/yunsukim86/sockeye-transfer
Framework	mxnet

An Incremental Turn-Taking Model For Task-Oriented Dialog Systems


Title	An Incremental Turn-Taking Model For Task-Oriented Dialog Systems
Authors	Andrei C. Coman, Koichiro Yoshino, Yukitoshi Murase, Satoshi Nakamura, Giuseppe Riccardi
Abstract	In a human-machine dialog scenario, deciding the appropriate time for the machine to take the turn is an open research problem. In contrast, humans engaged in conversations are able to decide in a timely manner when to interrupt the speaker for competitive or non-competitive reasons. In state-of-the-art turn-by-turn dialog systems the decision on the next dialog action is taken at the end of the utterance. In this paper, we propose a token-by-token prediction of the dialog state from incremental transcriptions of the user utterance. To identify the point of maximal understanding in an ongoing utterance, we a) implement an incremental Dialog State Tracker which is updated on a token basis (iDST), b) re-label the Dialog State Tracking Challenge 2 (DSTC2) dataset, and c) adapt it to the incremental turn-taking experimental scenario. The re-labeling consists of assigning a binary value to each token in the user utterance, which makes it possible to identify the appropriate point for taking the turn. Finally, we implement an incremental Turn Taking Decider (iTTD) that is trained on these new labels for the turn-taking decision. We show that the proposed model can achieve a better performance compared to a deterministic handcrafted turn-taking algorithm.
Tasks
Published	2019-05-28
URL	https://arxiv.org/abs/1905.11806v3
PDF	https://arxiv.org/pdf/1905.11806v3.pdf
PWC	https://paperswithcode.com/paper/an-incremental-turn-taking-model-for-task
Repo	https://github.com/ahclab/iDST_iTTD
Framework	none

Recurrent MVSNet for High-resolution Multi-view Stereo Depth Inference


Title	Recurrent MVSNet for High-resolution Multi-view Stereo Depth Inference
Authors	Yao Yao, Zixin Luo, Shiwei Li, Tianwei Shen, Tian Fang, Long Quan
Abstract	Deep learning has recently demonstrated its excellent performance for multi-view stereo (MVS). However, one major limitation of current learned MVS approaches is the scalability: the memory-consuming cost volume regularization makes the learned MVS hard to apply to high-resolution scenes. In this paper, we introduce a scalable multi-view stereo framework based on the recurrent neural network. Instead of regularizing the entire 3D cost volume in one go, the proposed Recurrent Multi-view Stereo Network (R-MVSNet) sequentially regularizes the 2D cost maps along the depth direction via the gated recurrent unit (GRU). This dramatically reduces the memory consumption and makes high-resolution reconstruction feasible. We first show the state-of-the-art performance achieved by the proposed R-MVSNet on the recent MVS benchmarks. Then, we further demonstrate the scalability of the proposed method on several large-scale scenarios, where previous learned approaches often fail due to the memory constraint. Code is available at https://github.com/YoYo000/MVSNet.
Tasks
Published	2019-02-27
URL	http://arxiv.org/abs/1902.10556v1
PDF	http://arxiv.org/pdf/1902.10556v1.pdf
PWC	https://paperswithcode.com/paper/recurrent-mvsnet-for-high-resolution-multi
Repo	https://github.com/YoYo000/MVSNet
Framework	tf