February 1, 2020

3032 words 15 mins read

Paper Group AWR 342

Investigating Self-Attention Network for Chinese Word Segmentation. A Graph-based Model for Joint Chinese Word Segmentation and Dependency Parsing. SuperTML: Two-Dimensional Word Embedding for the Precognition on Structured Tabular Data. Weighted Linear Bandits for Non-Stationary Environments. ReMASC: Realistic Replay Attack Corpus for Voice Contro …

Investigating Self-Attention Network for Chinese Word Segmentation

Title Investigating Self-Attention Network for Chinese Word Segmentation
Authors Leilei Gan, Yue Zhang
Abstract Neural networks have become the dominant method for Chinese word segmentation. Most existing models cast the task as sequence labeling, using BiLSTM-CRF to represent the input and make output predictions. Recently, attention-based sequence models have emerged as a highly competitive alternative to LSTMs, allowing better running speed through parallelized computation. We investigate self-attention networks (SAN) for Chinese word segmentation, making comparisons with BiLSTM-CRF models. In addition, the influence of contextualized character embeddings is investigated using BERT, and a method is proposed for integrating word information into SAN segmentation. Results show that SAN gives highly competitive results compared with BiLSTMs, with BERT and word information further improving both in-domain and cross-domain segmentation. Our final models give the best results on six heterogeneous domain benchmarks.
Tasks Chinese Word Segmentation
Published 2019-07-26
URL https://arxiv.org/abs/1907.11512v1
PDF https://arxiv.org/pdf/1907.11512v1.pdf
PWC https://paperswithcode.com/paper/investigating-self-attention-network-for
Repo https://github.com/gump88/SAN-CWS
Framework pytorch
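
The abstract above frames segmentation as character-level sequence labeling with a self-attention encoder. Below is a minimal, hedged sketch of that setup in PyTorch: not the paper's SAN-CWS model, just a generic Transformer encoder over character embeddings with a linear layer producing BMES tag scores (the vocabulary size, dimensions, and tag set are illustrative assumptions).

```python
import torch
import torch.nn as nn

class SelfAttentionSegmenter(nn.Module):
    """Toy character-level tagger: Transformer encoder + BMES projection."""
    def __init__(self, vocab_size=5000, d_model=128, nhead=4, num_layers=2, num_tags=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        self.tagger = nn.Linear(d_model, num_tags)  # B/M/E/S word-boundary tags

    def forward(self, char_ids):
        # char_ids: (batch, seq_len) integer character indices
        h = self.encoder(self.embed(char_ids))
        return self.tagger(h)  # (batch, seq_len, num_tags)

# Example: score a batch of two 10-character sentences.
model = SelfAttentionSegmenter()
scores = model(torch.randint(0, 5000, (2, 10)))
print(scores.shape)  # torch.Size([2, 10, 4])
```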

A Graph-based Model for Joint Chinese Word Segmentation and Dependency Parsing

Title A Graph-based Model for Joint Chinese Word Segmentation and Dependency Parsing
Authors Hang Yan, Xipeng Qiu, Xuanjing Huang
Abstract Chinese word segmentation and dependency parsing are two fundamental tasks for Chinese natural language processing. Dependency parsing is defined at the word level, so word segmentation is a precondition for dependency parsing; this makes dependency parsing suffer from error propagation and prevents it from directly using character-level pre-trained language models (such as BERT). In this paper, we propose a graph-based model that integrates Chinese word segmentation and dependency parsing. Unlike previous transition-based joint models, our proposed model is more concise, requiring less feature-engineering effort. Our graph-based joint model achieves better performance than previous joint models and state-of-the-art results in both Chinese word segmentation and dependency parsing. Moreover, when combined with BERT, our model can substantially reduce the performance gap in dependency parsing between joint models and gold-segmented word-based models. Our code is publicly available at https://github.com/fastnlp/JointCwsParser.
Tasks Chinese Word Segmentation, Dependency Parsing, Feature Engineering, Language Modelling
Published 2019-04-09
URL https://arxiv.org/abs/1904.04697v2
PDF https://arxiv.org/pdf/1904.04697v2.pdf
PWC https://paperswithcode.com/paper/a-unified-model-for-joint-chinese-word
Repo https://github.com/fastnlp/JointCwsParser
Framework pytorch
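
Graph-based dependency parsing of the kind described above typically scores every head-dependent pair, most commonly with a biaffine function over contextual representations. The sketch below is not the authors' JointCwsParser model; it only illustrates a generic biaffine arc scorer, with all dimensions chosen arbitrarily.

```python
import torch
import torch.nn as nn

class BiaffineArcScorer(nn.Module):
    """Score every (dependent i, head j) pair: s_ij = dep_i^T W head_j + head_j . b"""
    def __init__(self, hidden=256, arc_dim=128):
        super().__init__()
        self.dep_mlp = nn.Linear(hidden, arc_dim)
        self.head_mlp = nn.Linear(hidden, arc_dim)
        self.W = nn.Parameter(torch.zeros(arc_dim, arc_dim))
        self.b = nn.Parameter(torch.zeros(arc_dim))

    def forward(self, h):
        # h: (batch, seq_len, hidden) contextual token representations
        dep = self.dep_mlp(h)                              # (B, L, A)
        head = self.head_mlp(h)                            # (B, L, A)
        bilinear = dep @ self.W @ head.transpose(1, 2)     # (B, L, L)
        bias = (head @ self.b).unsqueeze(1)                # (B, 1, L), broadcast over dependents
        return bilinear + bias                             # score of head j for dependent i

scores = BiaffineArcScorer()(torch.randn(2, 7, 256))
print(scores.shape)  # torch.Size([2, 7, 7])
```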

SuperTML: Two-Dimensional Word Embedding for the Precognition on Structured Tabular Data

Title SuperTML: Two-Dimensional Word Embedding for the Precognition on Structured Tabular Data
Authors Baohua Sun, Lin Yang, Wenhan Zhang, Michael Lin, Patrick Dong, Charles Young, Jason Dong
Abstract Tabular data is the most commonly used form of data in industry. Gradient Boosting Trees, Support Vector Machines, Random Forests, and Logistic Regression are typically used for classification tasks on tabular data. DNN models using categorical embeddings have also been applied to this task, but all attempts thus far have used one-dimensional embeddings. The recent Super Characters method, which uses two-dimensional word embeddings, achieved state-of-the-art results on text classification tasks, showcasing the promise of this new approach. In this paper, we propose the SuperTML method, which borrows the idea of the Super Characters method and two-dimensional embeddings to address classification on tabular data. For each tabular input, the features are first projected into a two-dimensional embedding like an image, and this image is then fed into fine-tuned two-dimensional CNN models for classification. Experimental results show that the proposed SuperTML method achieves state-of-the-art results on both large and small datasets.
Tasks Text Classification, Transfer Learning, Word Embeddings
Published 2019-02-26
URL https://arxiv.org/abs/1903.06246v3
PDF https://arxiv.org/pdf/1903.06246v3.pdf
PWC https://paperswithcode.com/paper/supertml-two-dimensional-word-embedding-and
Repo https://github.com/SSinyu/p
Framework tf
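
The core idea above — drawing each tabular feature's value as text at a fixed position in a blank image, then classifying that image with a CNN — can be sketched as follows. This is an illustrative approximation, not the authors' code; the image size, text layout, and the choice of ResNet-18 are assumptions.

```python
import torch
from PIL import Image, ImageDraw
from torchvision import models, transforms

def features_to_image(values, size=224):
    """Draw each feature value as text in its own cell of a blank image."""
    img = Image.new("RGB", (size, size), color="black")
    draw = ImageDraw.Draw(img)
    cell = size // len(values)
    for i, v in enumerate(values):
        draw.text((10, i * cell + cell // 4), f"{v:.3f}", fill="white")
    return img

# Example: one Iris-like row with four numeric features.
img = features_to_image([5.1, 3.5, 1.4, 0.2])
x = transforms.ToTensor()(img).unsqueeze(0)          # (1, 3, 224, 224)
cnn = models.resnet18(weights=None, num_classes=3)   # would be fine-tuned on the tabular task
print(cnn(x).shape)                                  # torch.Size([1, 3])
```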

Weighted Linear Bandits for Non-Stationary Environments

Title Weighted Linear Bandits for Non-Stationary Environments
Authors Yoan Russac, Claire Vernade, Olivier Cappé
Abstract We consider a stochastic linear bandit model in which the available actions correspond to arbitrary context vectors whose associated rewards follow a non-stationary linear regression model. In this setting, the unknown regression parameter is allowed to vary in time. To address this problem, we propose D-LinUCB, a novel optimistic algorithm based on discounted linear regression, where exponential weights are used to smoothly forget the past. This involves studying the deviations of the sequential weighted least-squares estimator under generic assumptions. As a by-product, we obtain novel deviation results that can be used beyond non-stationary environments. We provide theoretical guarantees on the behavior of D-LinUCB in both slowly-varying and abruptly-changing environments. We obtain an upper bound on the dynamic regret that is of order d^{2/3} B_T^{1/3}T^{2/3}, where B_T is a measure of non-stationarity (d and T being, respectively, dimension and horizon). This rate is known to be optimal. We also illustrate the empirical performance of D-LinUCB and compare it with recently proposed alternatives in simulated environments.
Tasks
Published 2019-09-19
URL https://arxiv.org/abs/1909.09146v2
PDF https://arxiv.org/pdf/1909.09146v2.pdf
PWC https://paperswithcode.com/paper/weighted-linear-bandits-for-non-stationary
Repo https://github.com/YRussac/WeightedLinearBandits
Framework none
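
A minimal NumPy sketch of the exponentially weighted least-squares update at the heart of the D-LinUCB idea described above. The discount factor, regularization, and confidence width are illustrative assumptions, not the exact bonus or constants from the paper.

```python
import numpy as np

class DiscountedLinUCB:
    """Weighted ridge regression with exponential forgetting of old data."""
    def __init__(self, dim, gamma=0.99, lam=1.0, alpha=1.0):
        self.gamma, self.lam, self.alpha = gamma, lam, alpha
        self.A = np.zeros((dim, dim))   # discounted sum of x x^T
        self.b = np.zeros(dim)          # discounted sum of reward * x

    def update(self, x, reward):
        # Exponentially down-weight all past observations, then add the new one.
        self.A = self.gamma * self.A + np.outer(x, x)
        self.b = self.gamma * self.b + reward * x

    def choose(self, contexts):
        V = self.A + self.lam * np.eye(len(self.b))
        theta = np.linalg.solve(V, self.b)              # weighted least-squares estimate
        V_inv = np.linalg.inv(V)
        ucb = [x @ theta + self.alpha * np.sqrt(x @ V_inv @ x) for x in contexts]
        return int(np.argmax(ucb))

# Example: pick among three 2-d context vectors, then observe a reward.
bandit = DiscountedLinUCB(dim=2)
arms = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([0.7, 0.7])]
a = bandit.choose(arms)
bandit.update(arms[a], reward=0.5)
```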

ReMASC: Realistic Replay Attack Corpus for Voice Controlled Systems

Title ReMASC: Realistic Replay Attack Corpus for Voice Controlled Systems
Authors Yuan Gong, Jian Yang, Jacob Huber, Mitchell MacKnight, Christian Poellabauer
Abstract This paper introduces a new database of voice recordings with the goal of supporting research on vulnerabilities and protection of voice-controlled systems (VCSs). In contrast to prior efforts, the proposed database contains both genuine voice commands and replayed recordings of such commands, collected in realistic VCS usage scenarios and using modern voice assistant development kits. Specifically, the database contains recordings from four systems (each with a different microphone array) in a variety of environmental conditions, with different forms of background noise and different relative positions between speaker and device. To the best of our knowledge, this is the first publicly available database specifically designed for protecting state-of-the-art voice-controlled systems against various replay attacks in various conditions and environments.
Tasks
Published 2019-04-06
URL https://arxiv.org/abs/1904.03365v2
PDF https://arxiv.org/pdf/1904.03365v2.pdf
PWC https://paperswithcode.com/paper/remasc-realistic-replay-attack-corpus-for
Repo https://github.com/YuanGongND/ReMASC
Framework none

SC-FEGAN: Face Editing Generative Adversarial Network with User’s Sketch and Color

Title SC-FEGAN: Face Editing Generative Adversarial Network with User’s Sketch and Color
Authors Youngjoo Jo, Jongyoul Park
Abstract We present a novel image editing system that generates images as the user provides a free-form mask, sketch, and color as input. Our system consists of an end-to-end trainable convolutional network. Contrary to existing methods, our system fully utilizes free-form user input for color and shape. This allows the system to respond to the user's sketch and color input, using it as a guideline to generate an image. In this work, we trained the network with an additional style loss, which made it possible to generate realistic results even when large portions of the image are removed. Our proposed network architecture, SC-FEGAN, is well suited to generating high-quality synthetic images from intuitive user inputs.
Tasks Facial Inpainting, Image Inpainting
Published 2019-02-18
URL http://arxiv.org/abs/1902.06838v1
PDF http://arxiv.org/pdf/1902.06838v1.pdf
PWC https://paperswithcode.com/paper/sc-fegan-face-editing-generative-adversarial
Repo https://github.com/run-youngjoo/SC-FEGAN
Framework tf
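
The system described above conditions generation on a free-form mask, sketch, and color map. Below is a rough sketch of how such inputs might be assembled into a single generator input and passed through a gated convolution block (a common building block for free-form inpainting); the channel layout and the use of gated convolutions here are assumptions for illustration, not a reproduction of SC-FEGAN.

```python
import torch
import torch.nn as nn

class GatedConv(nn.Module):
    """Convolution whose output is modulated by a learned soft gate."""
    def __init__(self, in_ch, out_ch, k=3):
        super().__init__()
        self.feat = nn.Conv2d(in_ch, out_ch, k, padding=k // 2)
        self.gate = nn.Conv2d(in_ch, out_ch, k, padding=k // 2)

    def forward(self, x):
        return torch.tanh(self.feat(x)) * torch.sigmoid(self.gate(x))

# Assemble user inputs into one conditioning tensor (channel counts assumed).
B, H, W = 1, 256, 256
image  = torch.rand(B, 3, H, W)           # original photo
mask   = torch.rand(B, 1, H, W).round()   # 1 = region to edit
sketch = torch.rand(B, 1, H, W)           # user-drawn edges
color  = torch.rand(B, 3, H, W)           # user-provided color strokes
noise  = torch.rand(B, 1, H, W)
masked = image * (1 - mask)               # erase the editable region
x = torch.cat([masked, mask, sketch, color, noise], dim=1)  # (B, 9, H, W)
print(GatedConv(9, 64)(x).shape)          # torch.Size([1, 64, 256, 256])
```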

Non-Stationary Markov Decision Processes, a Worst-Case Approach using Model-Based Reinforcement Learning, Extended version

Title Non-Stationary Markov Decision Processes, a Worst-Case Approach using Model-Based Reinforcement Learning, Extended version
Authors Erwan Lecarpentier, Emmanuel Rachelson
Abstract This work tackles the problem of robust zero-shot planning in non-stationary stochastic environments. We study Markov Decision Processes (MDPs) evolving over time and consider model-based reinforcement learning algorithms in this setting. We make two hypotheses: 1) the environment evolves continuously with a bounded evolution rate; 2) a current model is known at each decision epoch, but not its evolution. Our contribution can be presented in four points: 1) we define a specific class of MDPs that we call Non-Stationary MDPs (NSMDPs) and introduce the notion of regular evolution by assuming Lipschitz continuity of the transition and reward functions with respect to time; 2) we consider a planning agent using the current model of the environment but unaware of its future evolution, which leads us to a worst-case method where the environment is treated as an adversarial agent; 3) following this approach, we propose the Risk-Averse Tree-Search (RATS) algorithm, a zero-shot model-based method similar to minimax search; 4) we illustrate the benefits brought by RATS empirically and compare its performance with reference model-based algorithms.
Tasks
Published 2019-04-22
URL https://arxiv.org/abs/1904.10090v4
PDF https://arxiv.org/pdf/1904.10090v4.pdf
PWC https://paperswithcode.com/paper/non-stationary-markov-decision-processes-a
Repo https://github.com/SuReLI/rats-experiments
Framework none
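
The worst-case idea in point 2) above — treat the environment's unknown evolution as an adversary and plan against the least favourable model — can be illustrated with a tiny minimax recursion. Everything below (the 2-state MDP, the dict-based model format, the depth limit) is a hypothetical toy, not the RATS algorithm from the repo.

```python
# A toy sketch of worst-case (minimax) planning over a set of candidate
# models. The MDP is hypothetical: states and actions are small integers,
# and each candidate model maps (state, action) -> (next_state, reward)
# deterministically.

def worst_case_value(state, depth, actions, models, gamma=0.95):
    """Max over actions, min over the models in the uncertainty set."""
    if depth == 0:
        return 0.0
    best = float("-inf")
    for a in actions:
        # Adversarial environment: assume the least favourable snapshot model.
        worst = min(
            r + gamma * worst_case_value(s2, depth - 1, actions, models, gamma)
            for (s2, r) in (m[(state, a)] for m in models)
        )
        best = max(best, worst)
    return best

# Two hypothetical snapshot models of a 2-state, 2-action MDP.
m0 = {(0, 0): (0, 1.0), (0, 1): (1, 0.0), (1, 0): (0, 0.5), (1, 1): (1, 0.2)}
m1 = {(0, 0): (1, 0.1), (0, 1): (0, 0.8), (1, 0): (1, 0.0), (1, 1): (0, 0.6)}
print(worst_case_value(state=0, depth=3, actions=[0, 1], models=[m0, m1]))
```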

A Topological Loss Function for Deep-Learning based Image Segmentation using Persistent Homology

Title A Topological Loss Function for Deep-Learning based Image Segmentation using Persistent Homology
Authors James R. Clough, Ilkay Oksuz, Nicholas Byrne, Veronika A. Zimmer, Julia A. Schnabel, Andrew P. King
Abstract We introduce a method for training neural networks to perform image or volume segmentation in which prior knowledge about the topology of the segmented object can be explicitly provided and then incorporated into the training process. By using the differentiable properties of persistent homology, a concept used in topological data analysis, we can specify the desired topology of segmented objects in terms of their Betti numbers and then drive the proposed segmentations to contain the specified topological features. Importantly this process does not require any ground-truth labels, just prior knowledge of the topology of the structure being segmented. We demonstrate our approach in three experiments. Firstly we create a synthetic task in which handwritten MNIST digits are de-noised, and show that using this kind of topological prior knowledge in the training of the network significantly improves the quality of the de-noised digits. Secondly we perform an experiment in which the task is segmenting the myocardium of the left ventricle from cardiac magnetic resonance images. We show that the incorporation of the prior knowledge of the topology of this anatomy improves the resulting segmentations in terms of both the topological accuracy and the Dice coefficient. Thirdly, we extend the method to 3D volumes and demonstrate its performance on the task of segmenting the placenta from ultrasound data, again showing that incorporating topological priors improves performance on this challenging task. We find that embedding explicit prior knowledge in neural network segmentation tasks is most beneficial when the segmentation task is especially challenging and that it can be used in either a semi-supervised or post-processing context to extract a useful training gradient from images without pixelwise labels.
Tasks Semantic Segmentation, Topological Data Analysis
Published 2019-10-04
URL https://arxiv.org/abs/1910.01877v1
PDF https://arxiv.org/pdf/1910.01877v1.pdf
PWC https://paperswithcode.com/paper/a-topological-loss-function-for-deep-learning
Repo https://github.com/yuki3-18/Topological-DVAE
Framework pytorch
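
The loss described above drives the predicted segmentation's persistence barcode toward the desired Betti numbers. The snippet below only sketches that idea: `persistence_pairs` is a hypothetical stand-in for a differentiable cubical persistent-homology layer (so this code does not run end-to-end as written), and the "keep the k longest bars, shrink the rest" objective is a simplification of the paper's formulation.

```python
import torch

def persistence_pairs(probs, dim):
    """HYPOTHETICAL: differentiable (birth, death) pairs of `dim`-dimensional
    features in the cubical complex of a soft segmentation map. A real
    implementation would come from a persistent-homology layer."""
    raise NotImplementedError

def topological_loss(probs, desired_betti={0: 1, 1: 1}):
    """Encourage exactly `desired_betti[d]` persistent features per homology dimension."""
    loss = probs.new_zeros(())
    for d, k in desired_betti.items():
        pairs = persistence_pairs(probs, dim=d)            # (n_features, 2)
        lifetimes = (pairs[:, 1] - pairs[:, 0]).abs()
        order = torch.argsort(lifetimes, descending=True)
        keep, spurious = order[:k], order[k:]
        # Push the k wanted features toward maximal persistence,
        # and all remaining (spurious) features toward zero persistence.
        loss = loss + ((1 - lifetimes[keep]) ** 2).sum() + (lifetimes[spurious] ** 2).sum()
    return loss
```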

Graph Neural Ordinary Differential Equations

Title Graph Neural Ordinary Differential Equations
Authors Michael Poli, Stefano Massaroli, Junyoung Park, Atsushi Yamashita, Hajime Asama, Jinkyoo Park
Abstract We introduce the framework of continuous-depth graph neural networks (GNNs). Graph neural ordinary differential equations (GDEs) are formalized as the counterpart to GNNs where the input-output relationship is determined by a continuum of GNN layers, blending discrete topological structures and differential equations. The proposed framework is shown to be compatible with various static and autoregressive GNN models. Results demonstrate the general effectiveness of GDEs: in static settings they offer computational advantages by incorporating numerical methods into their forward pass; in dynamic settings, they are shown to improve performance by exploiting the geometry of the underlying dynamics.
Tasks
Published 2019-11-18
URL https://arxiv.org/abs/1911.07532v2
PDF https://arxiv.org/pdf/1911.07532v2.pdf
PWC https://paperswithcode.com/paper/graph-neural-ordinary-differential-equations
Repo https://github.com/Zymrael/gde
Framework pytorch
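
A graph neural ODE of the sort described above can be sketched by handing a GNN layer to an ODE solver as the dynamics function. The snippet below uses the torchdiffeq `odeint` interface with a hand-rolled graph convolution over a fixed adjacency matrix; the dimensions and normalization are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
from torchdiffeq import odeint  # pip install torchdiffeq

class GCNDynamics(nn.Module):
    """dX/dt = tanh(A_hat X W): a single graph-convolution vector field."""
    def __init__(self, adj, dim):
        super().__init__()
        deg = adj.sum(1)
        self.register_buffer("A_hat", adj / deg.clamp(min=1).unsqueeze(1))  # row-normalized adjacency
        self.lin = nn.Linear(dim, dim)

    def forward(self, t, x):
        # x: (num_nodes, dim) node features at "time" t
        return torch.tanh(self.lin(self.A_hat @ x))

# Toy 4-node graph, 8-dim features, integrate node states from t=0 to t=1.
adj = torch.tensor([[0, 1, 1, 0], [1, 0, 1, 0], [1, 1, 0, 1], [0, 0, 1, 0]], dtype=torch.float)
func = GCNDynamics(adj + torch.eye(4), dim=8)   # add self-loops
x0 = torch.randn(4, 8)
t = torch.linspace(0.0, 1.0, steps=5)
states = odeint(func, x0, t)                    # (5, 4, 8) trajectory of node features
print(states.shape)
```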

Multigrid Predictive Filter Flow for Unsupervised Learning on Videos

Title Multigrid Predictive Filter Flow for Unsupervised Learning on Videos
Authors Shu Kong, Charless Fowlkes
Abstract We introduce multigrid Predictive Filter Flow (mgPFF), a framework for unsupervised learning on videos. mgPFF takes as input a pair of frames and outputs per-pixel filters to warp one frame to the other. Compared to optical flow used for warping frames, mgPFF is more powerful at modeling sub-pixel movement and dealing with corruption (e.g., motion blur). We develop a multigrid coarse-to-fine modeling strategy that avoids the need to learn large filters for capturing large displacement. This allows us to train an extremely compact model (4.6MB) which operates in a progressive way over multiple resolutions with shared weights. We train mgPFF on unsupervised, free-form videos and show that mgPFF is able not only to estimate long-range flow for frame reconstruction and detect video shot transitions, but is also readily amenable to video object segmentation and pose tracking, where it substantially outperforms the published state of the art without bells and whistles. Moreover, owing to mgPFF's per-pixel filter prediction, we have the unique opportunity to visualize how each pixel evolves while solving these tasks, thus gaining better interpretability.
Tasks Optical Flow Estimation, Pose Tracking, Semantic Segmentation, Skeleton Based Action Recognition, Video Object Segmentation, Video Semantic Segmentation
Published 2019-04-02
URL http://arxiv.org/abs/1904.01693v1
PDF http://arxiv.org/pdf/1904.01693v1.pdf
PWC https://paperswithcode.com/paper/multigrid-predictive-filter-flow-for
Repo https://github.com/bestaar/predictiveFilterFlow
Framework none
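
The basic per-pixel filter warping operation described above — each output pixel is a weighted combination of the k×k neighbourhood around the same location in the source frame — can be sketched with `torch.nn.functional.unfold`. This is only the warping step under assumed shapes, not the multigrid model itself.

```python
import torch
import torch.nn.functional as F

def apply_filter_flow(frame, filters, k=5):
    """Warp `frame` with per-pixel k*k filters.
    frame:   (B, C, H, W) source frame
    filters: (B, k*k, H, W) predicted filter weights per output pixel
    """
    B, C, H, W = frame.shape
    patches = F.unfold(frame, kernel_size=k, padding=k // 2)   # (B, C*k*k, H*W)
    patches = patches.view(B, C, k * k, H * W)
    weights = filters.view(B, 1, k * k, H * W)
    warped = (patches * weights).sum(dim=2)                    # weighted sum over the k*k window
    return warped.view(B, C, H, W)

# Example: identity-like warp where each filter puts all mass on the centre tap.
frame = torch.rand(1, 3, 32, 32)
filters = torch.zeros(1, 25, 32, 32)
filters[:, 12] = 1.0                                           # centre of a 5x5 window
print(torch.allclose(apply_filter_flow(frame, filters), frame))  # True
```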

Regularizing Neural Networks by Stochastically Training Layer Ensembles

Title Regularizing Neural Networks by Stochastically Training Layer Ensembles
Authors Alex Labach, Shahrokh Valaee
Abstract Dropout and similar stochastic neural network regularization methods are often interpreted as implicitly averaging over a large ensemble of models. We propose STE (stochastically trained ensemble) layers, which enhance the averaging properties of such methods by training an ensemble of weight matrices with stochastic regularization while explicitly averaging outputs. This provides stronger regularization with no additional computational cost at test time. We show consistent improvement on various image classification tasks using standard network topologies.
Tasks Image Classification
Published 2019-11-21
URL https://arxiv.org/abs/1911.09669v1
PDF https://arxiv.org/pdf/1911.09669v1.pdf
PWC https://paperswithcode.com/paper/regularizing-neural-networks-by
Repo https://github.com/j201/keras-ste-layers
Framework none
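
One plausible reading of the layer described above is: hold several weight matrices, train them with stochastic regularization (dropout) while explicitly averaging their outputs, and collapse the ensemble into a single averaged weight matrix at test time so inference costs nothing extra. The PyTorch sketch below implements that reading; it is an assumption-level illustration, not the authors' Keras implementation in the repo above, and the ensemble size and dropout rate are arbitrary.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class STELinear(nn.Module):
    """Ensemble of linear weight matrices; outputs averaged explicitly during training."""
    def __init__(self, in_features, out_features, ensemble_size=4, p=0.5):
        super().__init__()
        self.members = nn.ModuleList(
            nn.Linear(in_features, out_features) for _ in range(ensemble_size)
        )
        self.drop = nn.Dropout(p)

    def forward(self, x):
        if self.training:
            # Train: each member sees its own dropout mask; outputs are averaged.
            return torch.stack([m(self.drop(x)) for m in self.members]).mean(dim=0)
        # Test: averaging the outputs of linear members equals applying the
        # averaged weights once, so inference costs the same as a single layer.
        W = torch.stack([m.weight for m in self.members]).mean(dim=0)
        b = torch.stack([m.bias for m in self.members]).mean(dim=0)
        return F.linear(x, W, b)

layer = STELinear(64, 10)
print(layer(torch.randn(8, 64)).shape)   # torch.Size([8, 10])
```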

Weakly Aligned Cross-Modal Learning for Multispectral Pedestrian Detection

Title Weakly Aligned Cross-Modal Learning for Multispectral Pedestrian Detection
Authors Lu Zhang, Xiangyu Zhu, Xiangyu Chen, Xu Yang, Zhen Lei, Zhiyong Liu
Abstract Multispectral pedestrian detection has shown great advantages under poor illumination conditions, since the thermal modality provides complementary information to the color image. However, real multispectral data suffer from the position shift problem, i.e., the color-thermal image pairs are not strictly aligned, so one object can have different positions in different modalities. In deep learning based methods, this problem makes it difficult to fuse the feature maps from both modalities and confuses CNN training. In this paper, we propose a novel Aligned Region CNN (AR-CNN) to handle weakly aligned multispectral data in an end-to-end way. Firstly, we design a Region Feature Alignment (RFA) module to capture the position shift and adaptively align the region features of the two modalities. Secondly, we present a new multimodal fusion method, which performs feature re-weighting to select more reliable features and suppress the useless ones. In addition, we propose a novel RoI jitter strategy to improve robustness to unexpected shift patterns of different devices and system settings. Finally, since our method depends on a new kind of labelling, bounding boxes that match each modality, we manually relabel the KAIST dataset by locating bounding boxes in both modalities and building their relationships, providing a new KAIST-Paired annotation. Extensive experimental validations on existing datasets are performed, demonstrating the effectiveness and robustness of the proposed method. Code and data are available at https://github.com/luzhang16/AR-CNN.
Tasks Pedestrian Detection
Published 2019-01-09
URL https://arxiv.org/abs/1901.02645v2
PDF https://arxiv.org/pdf/1901.02645v2.pdf
PWC https://paperswithcode.com/paper/the-cross-modality-disparity-problem-in
Repo https://github.com/luzhang16/AR-CNN
Framework none

Scalable Topological Data Analysis and Visualization for Evaluating Data-Driven Models in Scientific Applications

Title Scalable Topological Data Analysis and Visualization for Evaluating Data-Driven Models in Scientific Applications
Authors Shusen Liu, Di Wang, Dan Maljovec, Rushil Anirudh, Jayaraman J. Thiagarajan, Sam Ade Jacobs, Brian C. Van Essen, David Hysom, Jae-Seung Yeom, Jim Gaffney, Luc Peterson, Peter B. Robinson, Harsh Bhatia, Valerio Pascucci, Brian K. Spears, Peer-Timo Bremer
Abstract With the rapid adoption of machine learning techniques for large-scale applications in science and engineering comes the convergence of two grand challenges in visualization. First, the utilization of black box models (e.g., deep neural networks) calls for advanced techniques in exploring and interpreting model behaviors. Second, the rapid growth in computing has produced enormous datasets that require techniques that can handle millions or more samples. Although some solutions to these interpretability challenges have been proposed, they typically do not scale beyond thousands of samples, nor do they provide the high-level intuition scientists are looking for. Here, we present the first scalable solution to explore and analyze high-dimensional functions often encountered in the scientific data analysis pipeline. By combining a new streaming neighborhood graph construction, the corresponding topology computation, and a novel data aggregation scheme, namely topology aware datacubes, we enable interactive exploration of both the topological and the geometric aspect of high-dimensional data. Following two use cases from high-energy-density (HED) physics and computational biology, we demonstrate how these capabilities have led to crucial new insights in both applications.
Tasks graph construction, Topological Data Analysis
Published 2019-07-19
URL https://arxiv.org/abs/1907.08325v1
PDF https://arxiv.org/pdf/1907.08325v1.pdf
PWC https://paperswithcode.com/paper/scalable-topological-data-analysis-and
Repo https://github.com/rushilanirudh/macc
Framework tf

P3O: Policy-on Policy-off Policy Optimization

Title P3O: Policy-on Policy-off Policy Optimization
Authors Rasool Fakoor, Pratik Chaudhari, Alexander J. Smola
Abstract On-policy reinforcement learning (RL) algorithms have high sample complexity, while off-policy algorithms are difficult to tune. Merging the two holds the promise of developing efficient algorithms that generalize across diverse environments. In practice, however, it is challenging to find suitable hyper-parameters that govern this trade-off. This paper develops a simple algorithm named P3O that interleaves off-policy updates with on-policy updates. P3O uses the effective sample size between the behavior policy and the target policy to control how far they can be from each other, and does not introduce any additional hyper-parameters. Extensive experiments on the Atari-2600 and MuJoCo benchmark suites show that this simple technique is effective in reducing the sample complexity of state-of-the-art algorithms. Code to reproduce the experiments in this paper is at https://github.com/rasoolfa/P3O.
Tasks
Published 2019-05-05
URL https://arxiv.org/abs/1905.01756v2
PDF https://arxiv.org/pdf/1905.01756v2.pdf
PWC https://paperswithcode.com/paper/p3o-policy-on-policy-off-policy-optimization
Repo https://github.com/rasoolfa/P3O
Framework mxnet
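
The abstract above says P3O scales its off-policy correction using the effective sample size (ESS) between the behavior and target policies. The NumPy snippet below shows the standard normalized-ESS computation from importance weights; treating it as the control coefficient is a simplified reading, not the exact rule from the paper.

```python
import numpy as np

def effective_sample_size(log_pi_target, log_pi_behavior):
    """Normalized ESS in (0, 1] from per-sample importance ratios pi/beta."""
    w = np.exp(log_pi_target - log_pi_behavior)      # importance ratios
    return (w.sum() ** 2) / (len(w) * (w ** 2).sum())

# The closer the two policies are, the closer ESS is to 1; a small ESS can be
# used to down-weight (or skip) the off-policy update for the current batch.
log_pi = np.log(np.array([0.20, 0.50, 0.30]))
log_beta = np.log(np.array([0.25, 0.45, 0.30]))
print(effective_sample_size(log_pi, log_beta))       # close to 1 for similar policies
```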

The Omniglot challenge: a 3-year progress report

Title The Omniglot challenge: a 3-year progress report
Authors Brenden M. Lake, Ruslan Salakhutdinov, Joshua B. Tenenbaum
Abstract Three years ago, we released the Omniglot dataset for one-shot learning, along with five challenge tasks and a computational model that addresses these tasks. The model was not meant to be the final word on Omniglot; we hoped that the community would build on our work and develop new approaches. In the time since, we have been pleased to see wide adoption of the dataset. There has been notable progress on one-shot classification, but researchers have adopted new splits and procedures that make the task easier. There has been less progress on the other four tasks. We conclude that recent approaches are still far from human-like concept learning on Omniglot, a challenge that requires performing many tasks with a single model.
Tasks Omniglot, One-Shot Learning
Published 2019-02-09
URL https://arxiv.org/abs/1902.03477v2
PDF https://arxiv.org/pdf/1902.03477v2.pdf
PWC https://paperswithcode.com/paper/the-omniglot-challenge-a-3-year-progress
Repo https://github.com/schatty/matching-networks-tf
Framework tf