Paper Group AWR 211
Side Window Filtering
Title | Side Window Filtering |
Authors | Hui Yin, Yuanhao Gong, Guoping Qiu |
Abstract | Local windows are routinely used in computer vision and almost without exception the center of the window is aligned with the pixels being processed. We show that this conventional wisdom is not universally applicable. When a pixel is on an edge, placing the center of the window on the pixel is one of the fundamental reasons that cause many filtering algorithms to blur the edges. Based on this insight, we propose a new Side Window Filtering (SWF) technique which aligns the window’s side or corner with the pixel being processed. The SWF technique is surprisingly simple yet theoretically rooted and very effective in practice. We show that many traditional linear and nonlinear filters can be easily implemented under the SWF framework. Extensive analysis and experiments show that implementing the SWF principle can significantly improve their edge preserving capabilities and achieve state of the art performances in applications such as image smoothing, denoising, enhancement, structure-preserving texture-removing, mutual-structure extraction, and HDR tone mapping. In addition to image filtering, we further show that the SWF principle can be extended to other applications involving the use of a local window. Using colorization by optimization as an example, we demonstrate that implementing the SWF principle can effectively prevent artifacts such as color leakage associated with the conventional implementation. Given the ubiquity of window based operations in computer vision, the new SWF technique is likely to benefit many more applications. |
Tasks | Colorization, Denoising |
Published | 2019-05-17 |
URL | https://arxiv.org/abs/1905.07177v1 |
https://arxiv.org/pdf/1905.07177v1.pdf | |
PWC | https://paperswithcode.com/paper/side-window-filtering |
Repo | https://github.com/wang-kangkang/SideWindowFilter-pytorch |
Framework | pytorch |
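The side-window idea in the abstract above can be illustrated with a box filter. This is a minimal NumPy sketch, not the authors' implementation or the code in the linked repository. It assumes a single-channel image, eight candidate windows (left, right, up, down and the four corners), and a selection rule that keeps the window whose mean is closest to the original pixel value; the function name and the `radius` parameter are hypothetical.

```python
import numpy as np

def side_window_box_filter(img, radius=1):
    """Box filter where each pixel uses the best of 8 side/corner windows.

    Assumption: "best" means the window mean closest to the input pixel.
    """
    img = np.asarray(img, dtype=np.float64)
    r = radius
    h, w = img.shape
    padded = np.pad(img, r, mode="edge")
    # Inclusive (row_min, row_max, col_min, col_max) offsets from the pixel.
    windows = [
        (-r, r, -r, 0), (-r, r, 0, r),   # left, right
        (-r, 0, -r, r), (0, r, -r, r),   # up, down
        (-r, 0, -r, 0), (-r, 0, 0, r),   # north-west, north-east
        (0, r, -r, 0), (0, r, 0, r),     # south-west, south-east
    ]
    best = np.zeros_like(img)
    best_err = np.full(img.shape, np.inf)
    for r0, r1, c0, c1 in windows:
        # Sum shifted copies of the image to get the window sum at every pixel.
        acc = np.zeros_like(img)
        for dr in range(r0, r1 + 1):
            for dc in range(c0, c1 + 1):
                acc += padded[r + dr:r + dr + h, r + dc:r + dc + w]
        mean = acc / ((r1 - r0 + 1) * (c1 - c0 + 1))
        err = np.abs(mean - img)
        take = err < best_err
        best[take] = mean[take]
        best_err[take] = err[take]
    return best
```

On an ideal step edge, every pixel has at least one side window that lies entirely on its own side of the edge, so this sketch returns the edge unchanged, whereas a centered box filter would average across it.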
On Binscatter
Title | On Binscatter |
Authors | Matias D. Cattaneo, Richard K. Crump, Max H. Farrell, Yingjie Feng |
Abstract | Binscatter is very popular in applied microeconomics. It provides a flexible, yet parsimonious way of visualizing and summarizing large data sets in regression settings, and it is often used for informal evaluation of substantive hypotheses such as linearity or monotonicity of the regression function. This paper presents a foundational, thorough analysis of binscatter: we give an array of theoretical and practical results that aid both in understanding current practices (i.e., their validity or lack thereof) and in offering theory-based guidance for future applications. Our main results include principled number of bins selection, confidence intervals and bands, hypothesis tests for parametric and shape restrictions of the regression function, and several other new methods, applicable to canonical binscatter as well as higher-order polynomial, covariate-adjusted and smoothness-restricted extensions thereof. In particular, we highlight important methodological problems related to covariate adjustment methods used in current practice. We also discuss extensions to clustered data. Our results are illustrated with simulated and real data throughout. Companion general-purpose software packages for Stata and R are provided. Finally, from a technical perspective, new theoretical results for partitioning-based series estimation are obtained that may be of independent interest. |
Tasks | |
Published | 2019-02-25 |
URL | http://arxiv.org/abs/1902.09608v1 |
http://arxiv.org/pdf/1902.09608v1.pdf | |
PWC | https://paperswithcode.com/paper/on-binscatter |
Repo | https://github.com/jiafengkevinchen/binscatterplot |
Framework | none |
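For readers unfamiliar with the plot, a canonical binscatter can be sketched as follows: split the regressor into quantile-spaced bins and plot the mean outcome in each bin. This is a minimal NumPy illustration under stated assumptions, not the companion Stata/R software or the linked repository's API. The function name is hypothetical, `n_bins` is chosen by hand rather than by the principled selection rule the paper develops, and `x` is assumed to be continuous enough that no bin ends up empty.

```python
import numpy as np

def binscatter_points(x, y, n_bins=10):
    """Return per-bin means of x and y over quantile-spaced bins of x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Bin edges at equally spaced quantiles of x.
    edges = np.quantile(x, np.linspace(0.0, 1.0, n_bins + 1))
    # Map each observation to a bin index in 0..n_bins-1 using interior edges.
    idx = np.searchsorted(edges[1:-1], x, side="right")
    x_means = np.array([x[idx == j].mean() for j in range(n_bins)])
    y_means = np.array([y[idx == j].mean() for j in range(n_bins)])
    return x_means, y_means
```

Plotting `y_means` against `x_means` gives the familiar dots; the confidence bands, covariate adjustment and hypothesis tests discussed in the abstract are not covered by this sketch.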
Rethinking Text Attribute Transfer: A Lexical Analysis
Title | Rethinking Text Attribute Transfer: A Lexical Analysis |
Authors | Yao Fu, Hao Zhou, Jiaze Chen, Lei Li |
Abstract | Text attribute transfer is modifying certain linguistic attributes (e.g. sentiment, style, authorship, etc.) of a sentence and transforming them from one type to another. In this paper, we aim to analyze and interpret what is changed during the transfer process. We start from the observation that in many existing models and datasets, certain words within a sentence play important roles in determining the sentence attribute class. These words are referred to as the Pivot Words. Based on these pivot words, we propose a lexical analysis framework, the Pivot Analysis, to quantitatively analyze the effects of these words in text attribute classification and transfer. We apply this framework to existing datasets and models and show that: (1) the pivot words are strong features for the classification of sentence attributes; (2) to change the attribute of a sentence, many datasets only require changing certain pivot words; (3) consequently, many transfer models only perform lexical-level modification, while leaving higher-level sentence structures unchanged. Our work provides an in-depth understanding of linguistic attribute transfer and further identifies the future requirements and challenges of this task. Our code can be found at https://github.com/FranxYao/pivot_analysis. |
Tasks | Lexical Analysis, Text Attribute Transfer |
Published | 2019-09-26 |
URL | https://arxiv.org/abs/1909.12335v1 |
https://arxiv.org/pdf/1909.12335v1.pdf | |
PWC | https://paperswithcode.com/paper/rethinking-text-attribute-transfer-a-lexical |
Repo | https://github.com/FranxYao/pivot_analysis |
Framework | none |
OpenSpiel: A Framework for Reinforcement Learning in Games
Title | OpenSpiel: A Framework for Reinforcement Learning in Games |
Authors | Marc Lanctot, Edward Lockhart, Jean-Baptiste Lespiau, Vinicius Zambaldi, Satyaki Upadhyay, Julien Pérolat, Sriram Srinivasan, Finbarr Timbers, Karl Tuyls, Shayegan Omidshafiei, Daniel Hennes, Dustin Morrill, Paul Muller, Timo Ewalds, Ryan Faulkner, János Kramár, Bart De Vylder, Brennan Saeta, James Bradbury, David Ding, Sebastian Borgeaud, Matthew Lai, Julian Schrittwieser, Thomas Anthony, Edward Hughes, Ivo Danihelka, Jonah Ryan-Davis |
Abstract | OpenSpiel is a collection of environments and algorithms for research in general reinforcement learning and search/planning in games. OpenSpiel supports n-player (single- and multi- agent) zero-sum, cooperative and general-sum, one-shot and sequential, strictly turn-taking and simultaneous-move, perfect and imperfect information games, as well as traditional multiagent environments such as (partially- and fully- observable) grid worlds and social dilemmas. OpenSpiel also includes tools to analyze learning dynamics and other common evaluation metrics. This document serves both as an overview of the code base and an introduction to the terminology, core concepts, and algorithms across the fields of reinforcement learning, computational game theory, and search. |
Tasks | |
Published | 2019-08-26 |
URL | https://arxiv.org/abs/1908.09453v5 |
https://arxiv.org/pdf/1908.09453v5.pdf | |
PWC | https://paperswithcode.com/paper/openspiel-a-framework-for-reinforcement |
Repo | https://github.com/deepmind/open_spiel |
Framework | none |
Certifiable Robustness and Robust Training for Graph Convolutional Networks
Title | Certifiable Robustness and Robust Training for Graph Convolutional Networks |
Authors | Daniel Zügner, Stephan Günnemann |
Abstract | Recent works show that Graph Neural Networks (GNNs) are highly non-robust with respect to adversarial attacks on both the graph structure and the node attributes, making their outcomes unreliable. We propose the first method for certifiable (non-)robustness of graph convolutional networks with respect to perturbations of the node attributes. We consider the case of binary node attributes (e.g. bag-of-words) and perturbations that are L_0-bounded. If a node has been certified with our method, it is guaranteed to be robust under any possible perturbation given the attack model. Likewise, we can certify non-robustness. Finally, we propose a robust semi-supervised training procedure that treats the labeled and unlabeled nodes jointly. As shown in our experimental evaluation, our method significantly improves the robustness of the GNN with only minimal effect on the predictive accuracy. |
Tasks | Node Classification |
Published | 2019-06-28 |
URL | https://arxiv.org/abs/1906.12269v1 |
https://arxiv.org/pdf/1906.12269v1.pdf | |
PWC | https://paperswithcode.com/paper/certifiable-robustness-and-robust-training |
Repo | https://github.com/danielzuegner/robust-gcn |
Framework | pytorch |
Rethinking Irregular Scene Text Recognition
Title | Rethinking Irregular Scene Text Recognition |
Authors | Shangbang Long, Yushuo Guan, Bingxuan Wang, Kaigui Bian, Cong Yao |
Abstract | Reading text from natural images is challenging due to the great variety in text font, color, and size, complex backgrounds, and so on. Perspective distortion and the non-linear spatial arrangement of characters make it even more difficult. While the rectification-based method is intuitively grounded and has pushed the envelope by far, its potential is far from being well exploited. In this paper, we present a bag of tricks that prove to significantly improve the performance of the rectification-based method. On curved text datasets, our method achieves an accuracy of 89.6% on CUTE-80 and 76.3% on Total-Text, an improvement over the previous state of the art by 6.3% and 14.7% respectively. Furthermore, our combination of tricks helps us win the ICDAR 2019 Arbitrary-Shaped Text Challenge (Latin script), achieving an accuracy of 74.3% on the held-out test set. We release our code as well as data samples for further exploration at https://github.com/Jyouhou/ICDAR2019-ArT-Recognition-Alchemy |
Tasks | Scene Text Detection |
Published | 2019-08-30 |
URL | https://arxiv.org/abs/1908.11834v2 |
https://arxiv.org/pdf/1908.11834v2.pdf | |
PWC | https://paperswithcode.com/paper/alchemy-techniques-for-rectification-based |
Repo | https://github.com/Jyouhou/ICDAR2019-ArT-Recognition-Alchemy |
Framework | pytorch |
Patch augmentation: Towards efficient decision boundaries for neural networks
Title | Patch augmentation: Towards efficient decision boundaries for neural networks |
Authors | Marcus D. Bloice, Peter M. Roth, Andreas Holzinger |
Abstract | In this paper we propose a new augmentation technique, called patch augmentation, that, in our experiments, improves model accuracy and makes networks more robust to adversarial attacks. In brief, this data-independent approach creates new image data based on image/label pairs, where a patch from one of the two images in the pair is superimposed on to the other image, creating a new augmented sample. The new image’s label is a linear combination of the image pair’s corresponding labels. Initial experiments show a several percentage point increase in accuracy on CIFAR-10, from a baseline of approximately 81% to 89%. CIFAR-100 sees larger improvements still, from a baseline of 52% to 68% accuracy. Networks trained using patch augmentation are also more robust to adversarial attacks, which we demonstrate using the Fast Gradient Sign Method. |
Tasks | Adversarial Attack |
Published | 2019-11-08 |
URL | https://arxiv.org/abs/1911.07922v2 |
https://arxiv.org/pdf/1911.07922v2.pdf | |
PWC | https://paperswithcode.com/paper/patch-augmentation-towards-efficient-decision |
Repo | https://github.com/mdbloice/Patch-Augmentation |
Framework | none |
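The patch augmentation step described in the abstract above can be sketched in a few lines. This is an illustration only, not the code in the linked repository. The abstract says the new label is a linear combination of the two labels but does not give the weights, so weighting by the patch's share of the image area is an assumption here, as are the function name, the `patch_frac` parameter, and the choice to copy the patch from the same location in the second image.

```python
import numpy as np

def patch_augment(img_a, label_a, img_b, label_b, patch_frac=0.25, rng=None):
    """Paste a random rectangular patch of img_b onto img_a and mix the labels.

    Assumptions: both images share a shape (H, W, ...) and labels are
    one-hot (or soft) vectors. The label weight equals the patch area fraction.
    """
    rng = np.random.default_rng() if rng is None else rng
    label_a = np.asarray(label_a, dtype=float)
    label_b = np.asarray(label_b, dtype=float)
    h, w = img_a.shape[:2]
    # Patch side lengths so that the patch covers roughly patch_frac of the area.
    ph = min(h, max(1, int(round(h * np.sqrt(patch_frac)))))
    pw = min(w, max(1, int(round(w * np.sqrt(patch_frac)))))
    top = int(rng.integers(0, h - ph + 1))
    left = int(rng.integers(0, w - pw + 1))
    out = img_a.copy()
    out[top:top + ph, left:left + pw] = img_b[top:top + ph, left:left + pw]
    lam = (ph * pw) / (h * w)  # assumed mixing weight
    return out, (1.0 - lam) * label_a + lam * label_b
```

With `patch_frac=0.25` on an 8x8 image, the patch is 4x4 and the mixed label is 0.75 of the first label plus 0.25 of the second.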
Capsule Neural Networks for Graph Classification using Explicit Tensorial Graph Representations
Title | Capsule Neural Networks for Graph Classification using Explicit Tensorial Graph Representations |
Authors | Marcelo Daniel Gutierrez Mallea, Peter Meltzer, Peter J Bentley |
Abstract | Graph classification is a significant problem in many scientific domains. It addresses tasks such as the classification of proteins and chemical compounds into categories according to their functions, or chemical and structural properties. In a supervised setting, this problem can be framed as learning the structure, features and relationships between features within a set of labelled graphs and being able to correctly predict the labels or categories of unseen graphs. A significant difficulty in this task arises when attempting to apply established classification algorithms due to the requirement for fixed size matrix or tensor representations of the graphs which may vary greatly in their numbers of nodes and edges. Building on prior work combining explicit tensor representations with a standard image-based classifier, we propose a model to perform graph classification by extracting fixed size tensorial information from each graph in a given set, and using a Capsule Network to perform classification. The graphs we consider here are undirected and with categorical features on the nodes. Using standard benchmarking chemical and protein datasets, we demonstrate that our graph Capsule Network classification model using an explicit tensorial representation of the graphs is competitive with current state of the art graph kernels and graph neural network models despite only limited hyper-parameter searching. |
Tasks | Graph Classification |
Published | 2019-02-22 |
URL | http://arxiv.org/abs/1902.08399v1 |
http://arxiv.org/pdf/1902.08399v1.pdf | |
PWC | https://paperswithcode.com/paper/capsule-neural-networks-for-graph |
Repo | https://github.com/BraintreeLtd/PatchyCapsules |
Framework | none |
Efficient and Accurate Arbitrary-Shaped Text Detection with Pixel Aggregation Network
Title | Efficient and Accurate Arbitrary-Shaped Text Detection with Pixel Aggregation Network |
Authors | Wenhai Wang, Enze Xie, Xiaoge Song, Yuhang Zang, Wenjia Wang, Tong Lu, Gang Yu, Chunhua Shen |
Abstract | Scene text detection, an important step of scene text reading systems, has witnessed rapid development with convolutional neural networks. Nonetheless, two main challenges still exist and hamper its deployment to real-world applications. The first problem is the trade-off between speed and accuracy. The second one is to model the arbitrary-shaped text instance. Recently, some methods have been proposed to tackle arbitrary-shaped text detection, but they rarely take the speed of the entire pipeline into consideration, which may fall short in practical applications. In this paper, we propose an efficient and accurate arbitrary-shaped text detector, termed Pixel Aggregation Network (PAN), which is equipped with a low computational-cost segmentation head and a learnable post-processing. More specifically, the segmentation head is made up of Feature Pyramid Enhancement Module (FPEM) and Feature Fusion Module (FFM). FPEM is a cascadable U-shaped module, which can introduce multi-level information to guide better segmentation. FFM can gather the features given by the FPEMs of different depths into a final feature for segmentation. The learnable post-processing is implemented by Pixel Aggregation (PA), which can precisely aggregate text pixels by predicted similarity vectors. Experiments on several standard benchmarks validate the superiority of the proposed PAN. It is worth noting that our method can achieve a competitive F-measure of 79.9% at 84.2 FPS on CTW1500. |
Tasks | Scene Text Detection |
Published | 2019-08-16 |
URL | https://arxiv.org/abs/1908.05900v1 |
https://arxiv.org/pdf/1908.05900v1.pdf | |
PWC | https://paperswithcode.com/paper/efficient-and-accurate-arbitrary-shaped-text |
Repo | https://github.com/insightcs/PAN-Pytorch |
Framework | pytorch |
Square Attack: a query-efficient black-box adversarial attack via random search
Title | Square Attack: a query-efficient black-box adversarial attack via random search |
Authors | Maksym Andriushchenko, Francesco Croce, Nicolas Flammarion, Matthias Hein |
Abstract | We propose the Square Attack, a score-based black-box $l_2$- and $l_\infty$-adversarial attack that does not rely on local gradient information and thus is not affected by gradient masking. Square Attack is based on a randomized search scheme which selects localized square-shaped updates at random positions so that at each iteration the perturbation is situated approximately at the boundary of the feasible set. Our method is significantly more query efficient and achieves a higher success rate compared to the state-of-the-art methods, especially in the untargeted setting. In particular, on ImageNet we improve the average query efficiency in the untargeted setting for various deep networks by a factor of at least $1.8$ and up to $3$ compared to the recent state-of-the-art $l_\infty$-attack of Al-Dujaili & O’Reilly. Moreover, although our attack is black-box, it can also outperform gradient-based white-box attacks on the standard benchmarks achieving a new state-of-the-art in terms of the success rate. The code of our attack is available at https://github.com/max-andr/square-attack. |
Tasks | Adversarial Attack |
Published | 2019-11-29 |
URL | https://arxiv.org/abs/1912.00049v2 |
https://arxiv.org/pdf/1912.00049v2.pdf | |
PWC | https://paperswithcode.com/paper/square-attack-a-query-efficient-black-box |
Repo | https://github.com/max-andr/square-attack |
Framework | tf |
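The random search scheme in the abstract above can be sketched for the $l_\infty$ case. This is a simplified illustration, not the code at the linked repository. It assumes images in [0, 1] with shape (H, W, C), a fixed square size (no size schedule), a stripe-style initialization, and a hypothetical score function `loss_fn` that the attacker wants to minimize and that needs no gradients; all names and default values are assumptions.

```python
import numpy as np

def square_attack_linf(x, loss_fn, eps=0.05, n_iters=200, square=2, rng=None):
    """Score-based random search with square-shaped updates of magnitude eps.

    loss_fn(image) -> float is a hypothetical black-box score (lower is
    better for the attacker), e.g. the margin of the correct class.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w, c = x.shape
    # Start on the boundary of the eps-ball: every entry is x - eps or x + eps.
    stripes = rng.choice([-eps, eps], size=(1, w, c))
    x_adv = np.clip(x + stripes, 0.0, 1.0)
    best = loss_fn(x_adv)
    for _ in range(n_iters):
        top = int(rng.integers(0, h - square + 1))
        left = int(rng.integers(0, w - square + 1))
        cand = x_adv.copy()
        # Resample the perturbation inside one square, one sign per channel.
        signs = rng.choice([-eps, eps], size=(1, 1, c))
        window = x[top:top + square, left:left + square]
        cand[top:top + square, left:left + square] = np.clip(window + signs, 0.0, 1.0)
        cand_loss = loss_fn(cand)
        if cand_loss < best:  # keep the update only if the score improves
            x_adv, best = cand, cand_loss
    return x_adv
```

Because every accepted update sets entries to exactly plus or minus `eps` (before clipping to the valid pixel range), the perturbation stays at the boundary of the feasible set, which is the property the abstract highlights.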
Intermittent Learning: On-Device Machine Learning on Intermittently Powered System
Title | Intermittent Learning: On-Device Machine Learning on Intermittently Powered System |
Authors | Seulki Lee, Bashima Islam, Yubo Luo, Shahriar Nirjon |
Abstract | This paper introduces intermittent learning, the goal of which is to enable energy-harvesting computing platforms to execute certain classes of machine learning tasks effectively and efficiently. We identify unique challenges to intermittent learning relating to the data and application semantics of machine learning tasks, and to address these challenges, we 1) devise an algorithm that determines a sequence of actions to achieve the desired learning objective under tight energy constraints, and 2) propose three heuristics that help an intermittent learner decide whether to learn or discard training examples at run-time, which increases the energy efficiency of the system. We implement and evaluate three intermittent learning applications that learn 1) air quality, 2) human presence, and 3) vibration using solar, RF, and kinetic energy harvesters, respectively. We demonstrate that the proposed framework improves the energy efficiency of a learner by up to 100% and cuts down the number of learning examples by up to 50% when compared to state-of-the-art intermittent computing systems that do not implement the proposed intermittent learning framework. |
Tasks | |
Published | 2019-04-21 |
URL | https://arxiv.org/abs/1904.09644v2 |
https://arxiv.org/pdf/1904.09644v2.pdf | |
PWC | https://paperswithcode.com/paper/intermittent-learning-on-device-machine |
Repo | https://github.com/learning1234embed/Intermittent-Learning |
Framework | none |
Facebook FAIR’s WMT19 News Translation Task Submission
Title | Facebook FAIR’s WMT19 News Translation Task Submission |
Authors | Nathan Ng, Kyra Yee, Alexei Baevski, Myle Ott, Michael Auli, Sergey Edunov |
Abstract | This paper describes Facebook FAIR’s submission to the WMT19 shared news translation task. We participate in two language pairs and four language directions, English <-> German and English <-> Russian. Following our submission from last year, our baseline systems are large BPE-based transformer models trained with the Fairseq sequence modeling toolkit which rely on sampled back-translations. This year we experiment with different bitext data filtering schemes, as well as with adding filtered back-translated data. We also ensemble and fine-tune our models on domain-specific data, then decode using noisy channel model reranking. Our submissions are ranked first in all four directions of the human evaluation campaign. On En->De, our system significantly outperforms other systems as well as human translations. This system improves upon our WMT’18 submission by 4.5 BLEU points. |
Tasks | Machine Translation |
Published | 2019-07-15 |
URL | https://arxiv.org/abs/1907.06616v1 |
https://arxiv.org/pdf/1907.06616v1.pdf | |
PWC | https://paperswithcode.com/paper/facebook-fairs-wmt19-news-translation-task |
Repo | https://github.com/pytorch/fairseq/tree/master/examples/wmt19 |
Framework | pytorch |
Effective Cross-lingual Transfer of Neural Machine Translation Models without Shared Vocabularies
Title | Effective Cross-lingual Transfer of Neural Machine Translation Models without Shared Vocabularies |
Authors | Yunsu Kim, Yingbo Gao, Hermann Ney |
Abstract | A transfer learning or multilingual model is essential for low-resource neural machine translation (NMT), but its applicability is limited to cognate languages that share their vocabularies. This paper shows effective techniques to transfer a pre-trained NMT model to a new, unrelated language without shared vocabularies. We relieve the vocabulary mismatch by using cross-lingual word embedding, train a more language-agnostic encoder by injecting artificial noise, and generate synthetic data easily from the pre-training data without back-translation. Our methods do not require restructuring the vocabulary or retraining the model. We improve plain NMT transfer by up to +5.1% BLEU in five low-resource translation tasks, outperforming multilingual joint training by a large margin. We also provide extensive ablation studies on pre-trained embedding, synthetic data, vocabulary size, and parameter freezing for a better understanding of NMT transfer. |
Tasks | Cross-Lingual Transfer, Low-Resource Neural Machine Translation, Machine Translation, Transfer Learning |
Published | 2019-05-14 |
URL | https://arxiv.org/abs/1905.05475v2 |
https://arxiv.org/pdf/1905.05475v2.pdf | |
PWC | https://paperswithcode.com/paper/effective-cross-lingual-transfer-of-neural |
Repo | https://github.com/yunsukim86/sockeye-transfer |
Framework | mxnet |
An Incremental Turn-Taking Model For Task-Oriented Dialog Systems
Title | An Incremental Turn-Taking Model For Task-Oriented Dialog Systems |
Authors | Andrei C. Coman, Koichiro Yoshino, Yukitoshi Murase, Satoshi Nakamura, Giuseppe Riccardi |
Abstract | In a human-machine dialog scenario, deciding the appropriate time for the machine to take the turn is an open research problem. In contrast, humans engaged in conversations are able to decide in a timely manner when to interrupt the speaker for competitive or non-competitive reasons. In state-of-the-art turn-by-turn dialog systems the decision on the next dialog action is taken at the end of the utterance. In this paper, we propose a token-by-token prediction of the dialog state from incremental transcriptions of the user utterance. To identify the point of maximal understanding in an ongoing utterance, we a) implement an incremental Dialog State Tracker which is updated on a token basis (iDST), b) re-label the Dialog State Tracking Challenge 2 (DSTC2) dataset, and c) adapt it to the incremental turn-taking experimental scenario. The re-labeling consists of assigning a binary value to each token in the user utterance that makes it possible to identify the appropriate point for taking the turn. Finally, we implement an incremental Turn Taking Decider (iTTD) that is trained on these new labels for the turn-taking decision. We show that the proposed model can achieve a better performance compared to a deterministic handcrafted turn-taking algorithm. |
Tasks | |
Published | 2019-05-28 |
URL | https://arxiv.org/abs/1905.11806v3 |
https://arxiv.org/pdf/1905.11806v3.pdf | |
PWC | https://paperswithcode.com/paper/an-incremental-turn-taking-model-for-task |
Repo | https://github.com/ahclab/iDST_iTTD |
Framework | none |
Recurrent MVSNet for High-resolution Multi-view Stereo Depth Inference
Title | Recurrent MVSNet for High-resolution Multi-view Stereo Depth Inference |
Authors | Yao Yao, Zixin Luo, Shiwei Li, Tianwei Shen, Tian Fang, Long Quan |
Abstract | Deep learning has recently demonstrated its excellent performance for multi-view stereo (MVS). However, one major limitation of current learned MVS approaches is scalability: the memory-consuming cost volume regularization makes learned MVS hard to apply to high-resolution scenes. In this paper, we introduce a scalable multi-view stereo framework based on the recurrent neural network. Instead of regularizing the entire 3D cost volume in one go, the proposed Recurrent Multi-view Stereo Network (R-MVSNet) sequentially regularizes the 2D cost maps along the depth direction via the gated recurrent unit (GRU). This dramatically reduces the memory consumption and makes high-resolution reconstruction feasible. We first show the state-of-the-art performance achieved by the proposed R-MVSNet on the recent MVS benchmarks. Then, we further demonstrate the scalability of the proposed method on several large-scale scenarios, where previous learned approaches often fail due to the memory constraint. Code is available at https://github.com/YoYo000/MVSNet. |
Tasks | |
Published | 2019-02-27 |
URL | http://arxiv.org/abs/1902.10556v1 |
http://arxiv.org/pdf/1902.10556v1.pdf | |
PWC | https://paperswithcode.com/paper/recurrent-mvsnet-for-high-resolution-multi |
Repo | https://github.com/YoYo000/MVSNet |
Framework | tf |