February 1, 2020

3212 words 16 mins read

Paper Group AWR 211

Side Window Filtering. On Binscatter. Rethinking Text Attribute Transfer: A Lexical Analysis. OpenSpiel: A Framework for Reinforcement Learning in Games. Certifiable Robustness and Robust Training for Graph Convolutional Networks. Rethinking Irregular Scene Text Recognition. Patch augmentation: Towards efficient decision boundaries for neural networks …

Side Window Filtering

Title Side Window Filtering
Authors Hui Yin, Yuanhao Gong, Guoping Qiu
Abstract Local windows are routinely used in computer vision, and almost without exception the center of the window is aligned with the pixel being processed. We show that this conventional wisdom is not universally applicable. When a pixel is on an edge, placing the center of the window on the pixel is one of the fundamental causes of edge blurring in many filtering algorithms. Based on this insight, we propose a new Side Window Filtering (SWF) technique which aligns the window’s side or corner with the pixel being processed. The SWF technique is surprisingly simple yet theoretically rooted and very effective in practice. We show that many traditional linear and nonlinear filters can be easily implemented under the SWF framework. Extensive analysis and experiments show that implementing the SWF principle can significantly improve their edge-preserving capabilities and achieve state-of-the-art performance in applications such as image smoothing, denoising, enhancement, structure-preserving texture removal, mutual-structure extraction, and HDR tone mapping. In addition to image filtering, we further show that the SWF principle can be extended to other applications involving the use of a local window. Using colorization by optimization as an example, we demonstrate that implementing the SWF principle can effectively prevent artifacts such as color leakage associated with the conventional implementation. Given the ubiquity of window-based operations in computer vision, the new SWF technique is likely to benefit many more applications.
Tasks Colorization, Denoising
Published 2019-05-17
URL https://arxiv.org/abs/1905.07177v1
PDF https://arxiv.org/pdf/1905.07177v1.pdf
PWC https://paperswithcode.com/paper/side-window-filtering
Repo https://github.com/wang-kangkang/SideWindowFilter-pytorch
Framework pytorch
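
The SWF idea is simple enough to prototype: evaluate the filter over windows whose side or corner (rather than center) touches the pixel, then keep the candidate closest to the original intensity. Below is a hedged numpy/scipy sketch that approximates the paper’s half- and quarter-windows with shifted square windows; the function name and radius parameter are illustrative, not taken from the linked repo.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def side_window_box_filter(img, r=3):
    """Simplified side-window box filter: average over 8 shifted windows
    (pixel on a side/corner of each) and keep, per pixel, the candidate
    closest to the original value. `img` is a 2-D float array."""
    size = 2 * r + 1
    candidates = []
    for dy in (-r, 0, r):
        for dx in (-r, 0, r):
            if dy == 0 and dx == 0:
                continue  # skip the conventional centered window
            # origin shifts the window so the pixel sits on a side/corner
            candidates.append(uniform_filter(img, size=size,
                                             origin=(dy, dx), mode="reflect"))
    stack = np.stack(candidates)                   # (8, H, W)
    best = np.argmin(np.abs(stack - img), axis=0)  # per-pixel closest output
    return np.take_along_axis(stack, best[None], axis=0)[0]
```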

On Binscatter

Title On Binscatter
Authors Matias D. Cattaneo, Richard K. Crump, Max H. Farrell, Yingjie Feng
Abstract Binscatter is very popular in applied microeconomics. It provides a flexible, yet parsimonious way of visualizing and summarizing large data sets in regression settings, and it is often used for informal evaluation of substantive hypotheses such as linearity or monotonicity of the regression function. This paper presents a foundational, thorough analysis of binscatter: we give an array of theoretical and practical results that aid both in understanding current practices (i.e., their validity or lack thereof) and in offering theory-based guidance for future applications. Our main results include principled number of bins selection, confidence intervals and bands, hypothesis tests for parametric and shape restrictions of the regression function, and several other new methods, applicable to canonical binscatter as well as higher-order polynomial, covariate-adjusted and smoothness-restricted extensions thereof. In particular, we highlight important methodological problems related to covariate adjustment methods used in current practice. We also discuss extensions to clustered data. Our results are illustrated with simulated and real data throughout. Companion general-purpose software packages for Stata and R are provided. Finally, from a technical perspective, new theoretical results for partitioning-based series estimation are obtained that may be of independent interest.
Tasks
Published 2019-02-25
URL http://arxiv.org/abs/1902.09608v1
PDF http://arxiv.org/pdf/1902.09608v1.pdf
PWC https://paperswithcode.com/paper/on-binscatter
Repo https://github.com/jiafengkevinchen/binscatterplot
Framework none
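
The canonical binscatter is easy to state: partition the support of x into quantile bins and plot the within-bin means of y against the within-bin means of x. The authors ship companion Stata and R packages; the numpy sketch below only illustrates the canonical estimator, with illustrative names.

```python
import numpy as np

def binscatter(x, y, n_bins=20):
    """Canonical binscatter sketch: quantile-bin x and return the within-bin
    means of x and y (the dots a binscatter plot displays). Assumes x is
    continuous enough that no quantile bin comes out empty."""
    edges = np.quantile(x, np.linspace(0, 1, n_bins + 1))
    # digitize on interior edges so every point lands in one of n_bins bins
    bin_id = np.clip(np.digitize(x, edges[1:-1]), 0, n_bins - 1)
    x_means = np.array([x[bin_id == j].mean() for j in range(n_bins)])
    y_means = np.array([y[bin_id == j].mean() for j in range(n_bins)])
    return x_means, y_means
```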

Rethinking Text Attribute Transfer: A Lexical Analysis

Title Rethinking Text Attribute Transfer: A Lexical Analysis
Authors Yao Fu, Hao Zhou, Jiaze Chen, Lei Li
Abstract Text attribute transfer modifies certain linguistic attributes (e.g. sentiment, style, authorship) of a sentence, transforming them from one type to another. In this paper, we aim to analyze and interpret what is changed during the transfer process. We start from the observation that in many existing models and datasets, certain words within a sentence play important roles in determining the sentence attribute class. These words are referred to as the Pivot Words. Based on these pivot words, we propose a lexical analysis framework, the Pivot Analysis, to quantitatively analyze the effects of these words in text attribute classification and transfer. We apply this framework to existing datasets and models and show that: (1) the pivot words are strong features for the classification of sentence attributes; (2) to change the attribute of a sentence, many datasets only require changing certain pivot words; (3) consequently, many transfer models only perform lexical-level modification, while leaving higher-level sentence structures unchanged. Our work provides an in-depth understanding of linguistic attribute transfer and further identifies the future requirements and challenges of this task (our code can be found at https://github.com/FranxYao/pivot_analysis).
Tasks Lexical Analysis, Text Attribute Transfer
Published 2019-09-26
URL https://arxiv.org/abs/1909.12335v1
PDF https://arxiv.org/pdf/1909.12335v1.pdf
PWC https://paperswithcode.com/paper/rethinking-text-attribute-transfer-a-lexical
Repo https://github.com/FranxYao/pivot_analysis
Framework none
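
The pivot-word notion can be illustrated with a simple class-precision statistic: a word is a pivot for a class if most of its occurrences fall in that class. The paper’s exact statistic may differ; this is a hedged sketch with illustrative names and thresholds.

```python
from collections import Counter

def pivot_words(sentences, labels, target_label, threshold=0.9, min_count=5):
    """Illustrative pivot-word extraction: score each word by the fraction
    of sentences containing it that carry the target label; frequent,
    high-precision words act as strong lexical features for that attribute."""
    total, in_class = Counter(), Counter()
    for sent, lab in zip(sentences, labels):
        for w in set(sent.split()):
            total[w] += 1
            if lab == target_label:
                in_class[w] += 1
    return {w: in_class[w] / total[w] for w in total
            if total[w] >= min_count and in_class[w] / total[w] >= threshold}
```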

OpenSpiel: A Framework for Reinforcement Learning in Games

Title OpenSpiel: A Framework for Reinforcement Learning in Games
Authors Marc Lanctot, Edward Lockhart, Jean-Baptiste Lespiau, Vinicius Zambaldi, Satyaki Upadhyay, Julien Pérolat, Sriram Srinivasan, Finbarr Timbers, Karl Tuyls, Shayegan Omidshafiei, Daniel Hennes, Dustin Morrill, Paul Muller, Timo Ewalds, Ryan Faulkner, János Kramár, Bart De Vylder, Brennan Saeta, James Bradbury, David Ding, Sebastian Borgeaud, Matthew Lai, Julian Schrittwieser, Thomas Anthony, Edward Hughes, Ivo Danihelka, Jonah Ryan-Davis
Abstract OpenSpiel is a collection of environments and algorithms for research in general reinforcement learning and search/planning in games. OpenSpiel supports n-player (single- and multi-agent) zero-sum, cooperative and general-sum, one-shot and sequential, strictly turn-taking and simultaneous-move, perfect- and imperfect-information games, as well as traditional multiagent environments such as (partially and fully observable) grid worlds and social dilemmas. OpenSpiel also includes tools to analyze learning dynamics and other common evaluation metrics. This document serves both as an overview of the code base and an introduction to the terminology, core concepts, and algorithms across the fields of reinforcement learning, computational game theory, and search.
Tasks
Published 2019-08-26
URL https://arxiv.org/abs/1908.09453v5
PDF https://arxiv.org/pdf/1908.09453v5.pdf
PWC https://paperswithcode.com/paper/openspiel-a-framework-for-reinforcement
Repo https://github.com/deepmind/open_spiel
Framework none
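
For a flavor of the framework, here is a minimal random-rollout loop against OpenSpiel’s Python API (method names per the project’s documentation at the time of writing; treat this as a sketch rather than an official snippet).

```python
import random
import pyspiel  # OpenSpiel's Python entry point

# Play one uniformly random game of tic-tac-toe through the core state API.
game = pyspiel.load_game("tic_tac_toe")
state = game.new_initial_state()
while not state.is_terminal():
    # a real experiment would plug an RL agent in here
    state.apply_action(random.choice(state.legal_actions()))
print(state.returns())  # per-player returns, e.g. [1.0, -1.0]
```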

Certifiable Robustness and Robust Training for Graph Convolutional Networks

Title Certifiable Robustness and Robust Training for Graph Convolutional Networks
Authors Daniel Zügner, Stephan Günnemann
Abstract Recent works show that Graph Neural Networks (GNNs) are highly non-robust with respect to adversarial attacks on both the graph structure and the node attributes, making their outcomes unreliable. We propose the first method for certifiable (non-)robustness of graph convolutional networks with respect to perturbations of the node attributes. We consider the case of binary node attributes (e.g. bag-of-words) and perturbations that are L_0-bounded. If a node has been certified with our method, it is guaranteed to be robust under any possible perturbation given the attack model. Likewise, we can certify non-robustness. Finally, we propose a robust semi-supervised training procedure that treats the labeled and unlabeled nodes jointly. As shown in our experimental evaluation, our method significantly improves the robustness of the GNN with only minimal effect on the predictive accuracy.
Tasks Node Classification
Published 2019-06-28
URL https://arxiv.org/abs/1906.12269v1
PDF https://arxiv.org/pdf/1906.12269v1.pdf
PWC https://paperswithcode.com/paper/certifiable-robustness-and-robust-training
Repo https://github.com/danielzuegner/robust-gcn
Framework pytorch
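
The attack model (L_0-bounded flips of binary node attributes) can be made concrete with a brute-force check for a budget of one flip. This exhaustive loop only illustrates the perturbation model, not the paper’s efficient certificate, and the model(x, adj) signature is an assumption for the sketch.

```python
import torch

def robust_under_one_flip(model, x, adj, node, label):
    """Exhaustively check whether `node`'s prediction survives flipping any
    single binary attribute bit anywhere in the graph. A brute-force
    stand-in for the paper's certificate; cost is N*D forward passes."""
    with torch.no_grad():
        n, d = x.shape
        for i in range(n):
            for j in range(d):
                x_pert = x.clone()
                x_pert[i, j] = 1.0 - x_pert[i, j]  # flip one bit
                if model(x_pert, adj)[node].argmax().item() != label:
                    return False  # found a successful perturbation
    return True  # robust against every single-bit flip
```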

Rethinking Irregular Scene Text Recognition

Title Rethinking Irregular Scene Text Recognition
Authors Shangbang Long, Yushuo Guan, Bingxuan Wang, Kaigui Bian, Cong Yao
Abstract Reading text from natural images is challenging due to the great variety in text fonts, colors, sizes, and complex backgrounds; the perspective distortion and non-linear spatial arrangement of characters make it even more difficult. While rectification-based methods are intuitively grounded and have pushed the envelope by far, their potential is far from being well exploited. In this paper, we present a bag of tricks that prove to significantly improve the performance of rectification-based methods. On curved text datasets, our method achieves an accuracy of 89.6% on CUTE-80 and 76.3% on Total-Text, improving over the previous state of the art by 6.3% and 14.7% respectively. Furthermore, our combination of tricks helps us win the ICDAR 2019 Arbitrary-Shaped Text Challenge (Latin script), achieving an accuracy of 74.3% on the held-out test set. We release our code as well as data samples for further exploration at https://github.com/Jyouhou/ICDAR2019-ArT-Recognition-Alchemy
Tasks Scene Text Detection
Published 2019-08-30
URL https://arxiv.org/abs/1908.11834v2
PDF https://arxiv.org/pdf/1908.11834v2.pdf
PWC https://paperswithcode.com/paper/alchemy-techniques-for-rectification-based
Repo https://github.com/Jyouhou/ICDAR2019-ArT-Recognition-Alchemy
Framework pytorch

Patch augmentation: Towards efficient decision boundaries for neural networks

Title Patch augmentation: Towards efficient decision boundaries for neural networks
Authors Marcus D. Bloice, Peter M. Roth, Andreas Holzinger
Abstract In this paper we propose a new augmentation technique, called patch augmentation, that, in our experiments, improves model accuracy and makes networks more robust to adversarial attacks. In brief, this data-independent approach creates new image data based on image/label pairs, where a patch from one of the two images in the pair is superimposed onto the other image, creating a new augmented sample. The new image’s label is a linear combination of the image pair’s corresponding labels. Initial experiments show an increase in accuracy of several percentage points on CIFAR-10, from a baseline of approximately 81% to 89%. CIFAR-100 sees larger improvements still, from a baseline of 52% to 68% accuracy. Networks trained using patch augmentation are also more robust to adversarial attacks, which we demonstrate using the Fast Gradient Sign Method.
Tasks Adversarial Attack
Published 2019-11-08
URL https://arxiv.org/abs/1911.07922v2
PDF https://arxiv.org/pdf/1911.07922v2.pdf
PWC https://paperswithcode.com/paper/patch-augmentation-towards-efficient-decision
Repo https://github.com/mdbloice/Patch-Augmentation
Framework none
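
The mechanism fits in a few lines: paste a random rectangle from one image onto another and mix the one-hot labels in proportion to the patch area. A hedged numpy sketch follows; the paper’s exact patch-size policy may differ, and labels are assumed to be one-hot float vectors.

```python
import numpy as np

def patch_augment(img_a, label_a, img_b, label_b, rng=np.random):
    """Patch augmentation sketch: superimpose a random patch of img_b onto
    img_a; the new label is a linear combination of both labels, weighted
    by the fraction of the image the patch covers."""
    h, w = img_a.shape[:2]
    ph, pw = rng.randint(1, h), rng.randint(1, w)          # patch size
    y, x = rng.randint(0, h - ph + 1), rng.randint(0, w - pw + 1)
    out = img_a.copy()
    out[y:y + ph, x:x + pw] = img_b[y:y + ph, x:x + pw]
    lam = (ph * pw) / (h * w)                              # area fraction
    return out, (1 - lam) * label_a + lam * label_b
```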

Capsule Neural Networks for Graph Classification using Explicit Tensorial Graph Representations

Title Capsule Neural Networks for Graph Classification using Explicit Tensorial Graph Representations
Authors Marcelo Daniel Gutierrez Mallea, Peter Meltzer, Peter J Bentley
Abstract Graph classification is a significant problem in many scientific domains. It addresses tasks such as the classification of proteins and chemical compounds into categories according to their functions, or chemical and structural properties. In a supervised setting, this problem can be framed as learning the structure, features and relationships between features within a set of labelled graphs and being able to correctly predict the labels or categories of unseen graphs. A significant difficulty in this task arises when attempting to apply established classification algorithms due to the requirement for fixed size matrix or tensor representations of the graphs, which may vary greatly in their numbers of nodes and edges. Building on prior work combining explicit tensor representations with a standard image-based classifier, we propose a model to perform graph classification by extracting fixed size tensorial information from each graph in a given set, and using a Capsule Network to perform classification. The graphs we consider here are undirected, with categorical features on the nodes. Using standard benchmarking chemical and protein datasets, we demonstrate that our graph Capsule Network classification model using an explicit tensorial representation of the graphs is competitive with current state of the art graph kernels and graph neural network models despite only limited hyper-parameter searching.
Tasks Graph Classification
Published 2019-02-22
URL http://arxiv.org/abs/1902.08399v1
PDF http://arxiv.org/pdf/1902.08399v1.pdf
PWC https://paperswithcode.com/paper/capsule-neural-networks-for-graph
Repo https://github.com/BraintreeLtd/PatchyCapsules
Framework none

Efficient and Accurate Arbitrary-Shaped Text Detection with Pixel Aggregation Network

Title Efficient and Accurate Arbitrary-Shaped Text Detection with Pixel Aggregation Network
Authors Wenhai Wang, Enze Xie, Xiaoge Song, Yuhang Zang, Wenjia Wang, Tong Lu, Gang Yu, Chunhua Shen
Abstract Scene text detection, an important step of scene text reading systems, has witnessed rapid development with convolutional neural networks. Nonetheless, two main challenges still exist and hamper its deployment to real-world applications. The first problem is the trade-off between speed and accuracy. The second one is to model the arbitrary-shaped text instance. Recently, some methods have been proposed to tackle arbitrary-shaped text detection, but they rarely take the speed of the entire pipeline into consideration, which may fall short in practical applications. In this paper, we propose an efficient and accurate arbitrary-shaped text detector, termed Pixel Aggregation Network (PAN), which is equipped with a low computational-cost segmentation head and learnable post-processing. More specifically, the segmentation head is made up of a Feature Pyramid Enhancement Module (FPEM) and a Feature Fusion Module (FFM). FPEM is a cascadable U-shaped module which can introduce multi-level information to guide better segmentation. FFM can gather the features given by the FPEMs of different depths into a final feature for segmentation. The learnable post-processing is implemented by Pixel Aggregation (PA), which can precisely aggregate text pixels via predicted similarity vectors. Experiments on several standard benchmarks validate the superiority of the proposed PAN. It is worth noting that our method can achieve a competitive F-measure of 79.9% at 84.2 FPS on CTW1500.
Tasks Scene Text Detection
Published 2019-08-16
URL https://arxiv.org/abs/1908.05900v1
PDF https://arxiv.org/pdf/1908.05900v1.pdf
PWC https://paperswithcode.com/paper/efficient-and-accurate-arbitrary-shaped-text
Repo https://github.com/insightcs/PAN-Pytorch
Framework pytorch

Square Attack: a query-efficient black-box adversarial attack via random search

Title Square Attack: a query-efficient black-box adversarial attack via random search
Authors Maksym Andriushchenko, Francesco Croce, Nicolas Flammarion, Matthias Hein
Abstract We propose the Square Attack, a score-based black-box $l_2$- and $l_\infty$-adversarial attack that does not rely on local gradient information and thus is not affected by gradient masking. Square Attack is based on a randomized search scheme which selects localized square-shaped updates at random positions so that at each iteration the perturbation is situated approximately at the boundary of the feasible set. Our method is significantly more query efficient and achieves a higher success rate compared to the state-of-the-art methods, especially in the untargeted setting. In particular, on ImageNet we improve the average query efficiency in the untargeted setting for various deep networks by a factor of at least $1.8$ and up to $3$ compared to the recent state-of-the-art $l_\infty$-attack of Al-Dujaili & O’Reilly. Moreover, although our attack is black-box, it can also outperform gradient-based white-box attacks on the standard benchmarks achieving a new state-of-the-art in terms of the success rate. The code of our attack is available at https://github.com/max-andr/square-attack.
Tasks Adversarial Attack
Published 2019-11-29
URL https://arxiv.org/abs/1912.00049v2
PDF https://arxiv.org/pdf/1912.00049v2.pdf
PWC https://paperswithcode.com/paper/square-attack-a-query-efficient-black-box
Repo https://github.com/max-andr/square-attack
Framework tf
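
The core loop is plain random search: propose a square patch of ±eps values at a random position and keep it only if the attack objective improves. A simplified l_inf sketch follows; `loss_fn` (a callable mapping an image to the objective being maximized) is an assumption, and the released code anneals the square-size fraction p, which is fixed here.

```python
import numpy as np

def square_attack_linf(loss_fn, x, eps, n_iters=1000, p=0.05, rng=np.random):
    """Minimal l_inf Square Attack sketch on an HxWxC image in [0, 1]:
    random search over square-shaped +/-eps updates, accepting only
    proposals that increase the loss."""
    h, w, c = x.shape
    # initialize at the boundary of the feasible set (vertical +/-eps stripes)
    delta = rng.choice([-eps, eps], size=(1, w, c)) * np.ones((h, 1, 1))
    best = loss_fn(np.clip(x + delta, 0, 1))
    for _ in range(n_iters):
        s = max(1, int(round(np.sqrt(p * h * w))))         # square side
        r, q = rng.randint(0, h - s + 1), rng.randint(0, w - s + 1)
        cand = delta.copy()
        cand[r:r + s, q:q + s, :] = rng.choice([-eps, eps], size=(1, 1, c))
        val = loss_fn(np.clip(x + cand, 0, 1))
        if val > best:                                     # greedy acceptance
            best, delta = val, cand
    return np.clip(x + delta, 0, 1)
```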

Intermittent Learning: On-Device Machine Learning on Intermittently Powered System

Title Intermittent Learning: On-Device Machine Learning on Intermittently Powered System
Authors Seulki Lee, Bashima Islam, Yubo Luo, Shahriar Nirjon
Abstract This paper introduces intermittent learning, the goal of which is to enable energy-harvesting computing platforms to execute certain classes of machine learning tasks effectively and efficiently. We identify unique challenges to intermittent learning relating to the data and application semantics of machine learning tasks, and to address these challenges, we 1) devise an algorithm that determines a sequence of actions to achieve the desired learning objective under tight energy constraints, and 2) propose three heuristics that help an intermittent learner decide whether to learn or discard training examples at run-time, which increases the energy efficiency of the system. We implement and evaluate three intermittent learning applications that learn 1) air quality, 2) human presence, and 3) vibration, using solar, RF, and kinetic energy harvesters, respectively. We demonstrate that the proposed framework improves the energy efficiency of a learner by up to 100% and cuts down the number of learning examples by up to 50% when compared to state-of-the-art intermittent computing systems that do not implement the proposed intermittent learning framework.
Tasks
Published 2019-04-21
URL https://arxiv.org/abs/1904.09644v2
PDF https://arxiv.org/pdf/1904.09644v2.pdf
PWC https://paperswithcode.com/paper/intermittent-learning-on-device-machine
Repo https://github.com/learning1234embed/Intermittent-Learning
Framework none

Facebook FAIR’s WMT19 News Translation Task Submission

Title Facebook FAIR’s WMT19 News Translation Task Submission
Authors Nathan Ng, Kyra Yee, Alexei Baevski, Myle Ott, Michael Auli, Sergey Edunov
Abstract This paper describes Facebook FAIR’s submission to the WMT19 shared news translation task. We participate in two language pairs and four language directions, English <-> German and English <-> Russian. Following our submission from last year, our baseline systems are large BPE-based transformer models trained with the Fairseq sequence modeling toolkit which rely on sampled back-translations. This year we experiment with different bitext data filtering schemes, as well as with adding filtered back-translated data. We also ensemble and fine-tune our models on domain-specific data, then decode using noisy channel model reranking. Our submissions are ranked first in all four directions of the human evaluation campaign. On En->De, our system significantly outperforms other systems as well as human translations. This system improves upon our WMT’18 submission by 4.5 BLEU points.
Tasks Machine Translation
Published 2019-07-15
URL https://arxiv.org/abs/1907.06616v1
PDF https://arxiv.org/pdf/1907.06616v1.pdf
PWC https://paperswithcode.com/paper/facebook-fairs-wmt19-news-translation-task
Repo https://github.com/pytorch/fairseq/tree/master/examples/wmt19
Framework pytorch
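
The released models can be loaded through torch.hub, following the usage documented in the fairseq WMT19 example README; the checkpoint and tokenizer names below are taken from that README and may change between releases (the ensemble download is several gigabytes).

```python
import torch

# Load the released en-de ensemble and translate a sentence (names as
# documented in the fairseq WMT19 example; may shift across versions).
en2de = torch.hub.load(
    'pytorch/fairseq', 'transformer.wmt19.en-de',
    checkpoint_file='model1.pt:model2.pt:model3.pt:model4.pt',
    tokenizer='moses', bpe='fastbpe')
print(en2de.translate('Machine learning is great!'))
```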

Effective Cross-lingual Transfer of Neural Machine Translation Models without Shared Vocabularies

Title Effective Cross-lingual Transfer of Neural Machine Translation Models without Shared Vocabularies
Authors Yunsu Kim, Yingbo Gao, Hermann Ney
Abstract Transfer learning or a multilingual model is essential for low-resource neural machine translation (NMT), but their applicability has been limited to cognate languages that share vocabularies. This paper shows effective techniques to transfer a pre-trained NMT model to a new, unrelated language without shared vocabularies. We relieve the vocabulary mismatch by using cross-lingual word embeddings, train a more language-agnostic encoder by injecting artificial noise, and generate synthetic data easily from the pre-training data without back-translation. Our methods do not require restructuring the vocabulary or retraining the model. We improve plain NMT transfer by up to +5.1% BLEU in five low-resource translation tasks, outperforming multilingual joint training by a large margin. We also provide extensive ablation studies on pre-trained embeddings, synthetic data, vocabulary size, and parameter freezing for a better understanding of NMT transfer.
Tasks Cross-Lingual Transfer, Low-Resource Neural Machine Translation, Machine Translation, Transfer Learning
Published 2019-05-14
URL https://arxiv.org/abs/1905.05475v2
PDF https://arxiv.org/pdf/1905.05475v2.pdf
PWC https://paperswithcode.com/paper/effective-cross-lingual-transfer-of-neural
Repo https://github.com/yunsukim86/sockeye-transfer
Framework mxnet
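
A standard way to obtain cross-lingual word embeddings for the vocabulary-mismatch step is orthogonal Procrustes over a seed dictionary of aligned word pairs; the sketch below shows that mapping, hedged as one common recipe rather than the paper’s exact pipeline.

```python
import numpy as np

def procrustes_map(src_emb, tgt_emb):
    """Orthogonal Procrustes: find the orthogonal W minimizing
    ||src_emb @ W - tgt_emb||_F, where the rows of src_emb/tgt_emb are
    embeddings of dictionary-aligned word pairs."""
    u, _, vt = np.linalg.svd(src_emb.T @ tgt_emb)
    return u @ vt  # apply as src_emb @ W
```

The resulting W would then be applied to all new-language embeddings before plugging them into the parent model’s embedding table.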

An Incremental Turn-Taking Model For Task-Oriented Dialog Systems

Title An Incremental Turn-Taking Model For Task-Oriented Dialog Systems
Authors Andrei C. Coman, Koichiro Yoshino, Yukitoshi Murase, Satoshi Nakamura, Giuseppe Riccardi
Abstract In a human-machine dialog scenario, deciding the appropriate time for the machine to take the turn is an open research problem. In contrast, humans engaged in conversations are able to timely decide when to interrupt the speaker for competitive or non-competitive reasons. In state-of-the-art turn-by-turn dialog systems, the decision on the next dialog action is taken at the end of the utterance. In this paper, we propose a token-by-token prediction of the dialog state from incremental transcriptions of the user utterance. To identify the point of maximal understanding in an ongoing utterance, we a) implement an incremental Dialog State Tracker which is updated on a token basis (iDST), b) re-label the Dialog State Tracking Challenge 2 (DSTC2) dataset, and c) adapt it to the incremental turn-taking experimental scenario. The re-labeling consists of assigning a binary value to each token in the user utterance that allows identifying the appropriate point for taking the turn. Finally, we implement an incremental Turn Taking Decider (iTTD) that is trained on these new labels for the turn-taking decision. We show that the proposed model can achieve a better performance compared to a deterministic handcrafted turn-taking algorithm.
Tasks
Published 2019-05-28
URL https://arxiv.org/abs/1905.11806v3
PDF https://arxiv.org/pdf/1905.11806v3.pdf
PWC https://paperswithcode.com/paper/an-incremental-turn-taking-model-for-task
Repo https://github.com/ahclab/iDST_iTTD
Framework none

Recurrent MVSNet for High-resolution Multi-view Stereo Depth Inference

Title Recurrent MVSNet for High-resolution Multi-view Stereo Depth Inference
Authors Yao Yao, Zixin Luo, Shiwei Li, Tianwei Shen, Tian Fang, Long Quan
Abstract Deep learning has recently demonstrated its excellent performance for multi-view stereo (MVS). However, one major limitation of current learned MVS approaches is scalability: the memory-consuming cost volume regularization makes learned MVS hard to apply to high-resolution scenes. In this paper, we introduce a scalable multi-view stereo framework based on the recurrent neural network. Instead of regularizing the entire 3D cost volume in one go, the proposed Recurrent Multi-view Stereo Network (R-MVSNet) sequentially regularizes the 2D cost maps along the depth direction via the gated recurrent unit (GRU). This dramatically reduces memory consumption and makes high-resolution reconstruction feasible. We first show the state-of-the-art performance achieved by the proposed R-MVSNet on the recent MVS benchmarks. Then, we further demonstrate the scalability of the proposed method on several large-scale scenarios, where previous learned approaches often fail due to the memory constraint. Code is available at https://github.com/YoYo000/MVSNet.
Tasks
Published 2019-02-27
URL http://arxiv.org/abs/1902.10556v1
PDF http://arxiv.org/pdf/1902.10556v1.pdf
PWC https://paperswithcode.com/paper/recurrent-mvsnet-for-high-resolution-multi
Repo https://github.com/YoYo000/MVSNet
Framework tf
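
The scalability trick can be caricatured in a few lines: sweep a recurrent cell over 2D cost maps along the depth axis instead of convolving the full 3D volume, so peak memory scales with one depth slice rather than the whole volume. The sketch below substitutes a plain GRUCell over flattened maps for the paper’s convolutional GRU, so it is illustrative only.

```python
import torch
import torch.nn as nn

class RecurrentRegularizer(nn.Module):
    """Sketch of R-MVSNet-style regularization: a GRU processes 2D cost
    maps one depth slice at a time (a plain GRUCell stands in for the
    paper's convolutional GRU)."""
    def __init__(self, h, w, hidden=64):
        super().__init__()
        self.cell = nn.GRUCell(h * w, hidden)
        self.out = nn.Linear(hidden, h * w)

    def forward(self, cost_maps):              # cost_maps: (D, H, W)
        d, h, w = cost_maps.shape
        state = torch.zeros(1, self.cell.hidden_size)
        regularized = []
        for i in range(d):                     # sequential sweep along depth
            state = self.cell(cost_maps[i].reshape(1, -1), state)
            regularized.append(self.out(state).reshape(h, w))
        return torch.stack(regularized)        # (D, H, W)
```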