Paper Group AWR 342
Investigating Self-Attention Network for Chinese Word Segmentation. A Graph-based Model for Joint Chinese Word Segmentation and Dependency Parsing. SuperTML: Two-Dimensional Word Embedding for the Precognition on Structured Tabular Data. Weighted Linear Bandits for Non-Stationary Environments. ReMASC: Realistic Replay Attack Corpus for Voice Controlled Systems …
Investigating Self-Attention Network for Chinese Word Segmentation
Title | Investigating Self-Attention Network for Chinese Word Segmentation |
Authors | Leilei Gan, Yue Zhang |
Abstract | Neural networks have become the dominant method for Chinese word segmentation. Most existing models cast the task as sequence labeling, using a BiLSTM-CRF to represent the input and make output predictions. Recently, attention-based sequence models have emerged as a highly competitive alternative to LSTMs, allowing faster running speed through parallelized computation. We investigate self-attention networks (SANs) for Chinese word segmentation, making comparisons with BiLSTM-CRF models. In addition, the influence of contextualized character embeddings is investigated using BERT, and a method is proposed for integrating word information into SAN segmentation. Results show that SANs give highly competitive results compared with BiLSTMs, with BERT and word information further improving both in-domain and cross-domain segmentation. Our final models give the best results across 6 heterogeneous domain benchmarks. |
Tasks | Chinese Word Segmentation |
Published | 2019-07-26 |
URL | https://arxiv.org/abs/1907.11512v1 |
PDF | https://arxiv.org/pdf/1907.11512v1.pdf |
PWC | https://paperswithcode.com/paper/investigating-self-attention-network-for |
Repo | https://github.com/gump88/SAN-CWS |
Framework | pytorch |
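To make the SAN-vs-BiLSTM comparison above concrete, here is a minimal PyTorch sketch of a self-attention character tagger for CWS. It is not the authors' model from the SAN-CWS repo: it omits the CRF decoding layer, positional encodings, BERT embeddings and the word-information integration, and the BMES tag set and layer sizes are assumptions.

```python
import torch
import torch.nn as nn

class SANSegmenter(nn.Module):
    """Toy self-attention character tagger for CWS (BMES tags).
    Illustrative only: positional encodings, CRF decoding, BERT
    embeddings and word information from the paper are omitted."""
    def __init__(self, vocab_size, d_model=128, nhead=4, num_layers=4, num_tags=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=nhead,
                                           dim_feedforward=4 * d_model,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        self.tagger = nn.Linear(d_model, num_tags)   # B/M/E/S score per character

    def forward(self, char_ids, pad_mask=None):
        x = self.embed(char_ids)                     # (batch, seq, d_model)
        h = self.encoder(x, src_key_padding_mask=pad_mask)
        return self.tagger(h)                        # (batch, seq, num_tags)

# Usage: greedy per-character tags (a CRF would decode the sequence jointly).
model = SANSegmenter(vocab_size=5000)
chars = torch.randint(0, 5000, (2, 10))
tags = model(chars).argmax(-1)
```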
A Graph-based Model for Joint Chinese Word Segmentation and Dependency Parsing
Title | A Graph-based Model for Joint Chinese Word Segmentation and Dependency Parsing |
Authors | Hang Yan, Xipeng Qiu, Xuanjing Huang |
Abstract | Chinese word segmentation and dependency parsing are two fundamental tasks for Chinese natural language processing. Dependency parsing is defined at the word level, so word segmentation is a precondition for dependency parsing; this makes dependency parsing suffer from error propagation and unable to directly use character-level pre-trained language models (such as BERT). In this paper, we propose a graph-based model to integrate Chinese word segmentation and dependency parsing. Different from previous transition-based joint models, our proposed model is more concise, which results in less feature-engineering effort. Our graph-based joint model achieves better performance than previous joint models and state-of-the-art results in both Chinese word segmentation and dependency parsing. Moreover, when combined with BERT, our model can substantially reduce the performance gap in dependency parsing between joint models and gold-segmented word-based models. Our code is publicly available at https://github.com/fastnlp/JointCwsParser. |
Tasks | Chinese Word Segmentation, Dependency Parsing, Feature Engineering, Language Modelling |
Published | 2019-04-09 |
URL | https://arxiv.org/abs/1904.04697v2 |
PDF | https://arxiv.org/pdf/1904.04697v2.pdf |
PWC | https://paperswithcode.com/paper/a-unified-model-for-joint-chinese-word |
Repo | https://github.com/fastnlp/JointCwsParser |
Framework | pytorch |
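The graph-based model above scores head-dependent arcs directly rather than building them transition by transition. As a hedged illustration of that scoring step only (not the paper's full character-level joint model), a generic biaffine arc scorer in PyTorch could look like this; the MLP sizes and the greedy decoding in the usage lines are assumptions.

```python
import torch
import torch.nn as nn

class BiaffineArcScorer(nn.Module):
    """Score every (dependent i, head j) pair with a biaffine form:
    s_ij = dep_i^T U head_j + w^T head_j.
    Generic graph-based scoring only; the joint model in the paper works
    at character level and also labels intra-word arcs."""
    def __init__(self, hidden, arc_dim=128):
        super().__init__()
        self.dep_mlp = nn.Sequential(nn.Linear(hidden, arc_dim), nn.ReLU())
        self.head_mlp = nn.Sequential(nn.Linear(hidden, arc_dim), nn.ReLU())
        self.U = nn.Parameter(torch.randn(arc_dim, arc_dim) * 0.01)
        self.w = nn.Parameter(torch.zeros(arc_dim))

    def forward(self, h):                          # h: (batch, seq, hidden)
        d = self.dep_mlp(h)                        # dependent representations
        e = self.head_mlp(h)                       # head representations
        arc = torch.einsum('bia,ac,bjc->bij', d, self.U, e)
        bias = torch.einsum('bjc,c->bj', e, self.w).unsqueeze(1)
        return arc + bias                          # (batch, dep, head) scores

# Usage: greedy head per token (a real parser would use MST/Eisner decoding).
scorer = BiaffineArcScorer(hidden=256)
h = torch.randn(2, 7, 256)
heads = scorer(h).argmax(-1)
```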
SuperTML: Two-Dimensional Word Embedding for the Precognition on Structured Tabular Data
Title | SuperTML: Two-Dimensional Word Embedding for the Precognition on Structured Tabular Data |
Authors | Baohua Sun, Lin Yang, Wenhan Zhang, Michael Lin, Patrick Dong, Charles Young, Jason Dong |
Abstract | Tabular data is the most commonly used form of data in industry. Gradient Boosting Trees, Support Vector Machines, Random Forests, and Logistic Regression are typically used for classification tasks on tabular data. DNN models using categorical embeddings are also applied to this task, but all attempts thus far have used one-dimensional embeddings. The recent Super Characters method, which uses two-dimensional word embeddings, achieved state-of-the-art results on text classification tasks, showcasing the promise of this new approach. In this paper, we propose the SuperTML method, which borrows the idea of the Super Characters method and two-dimensional embeddings to address classification on tabular data. For each tabular input, the features are first projected into two-dimensional embeddings like an image, and then this image is fed into fine-tuned two-dimensional CNN models for classification. Experimental results show that the proposed SuperTML method achieves state-of-the-art results on both large and small datasets. |
Tasks | Text Classification, Transfer Learning, Word Embeddings |
Published | 2019-02-26 |
URL | https://arxiv.org/abs/1903.06246v3 |
PDF | https://arxiv.org/pdf/1903.06246v3.pdf |
PWC | https://paperswithcode.com/paper/supertml-two-dimensional-word-embedding-and |
Repo | https://github.com/SSinyu/p |
Framework | tf |
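A minimal sketch of the SuperTML idea as described in the abstract: render each tabular row's feature values as text on a blank image, which a fine-tuned 2D CNN (not shown here) then classifies. The grid layout, image size and PIL's default font are illustrative assumptions, not the paper's exact recipe.

```python
import numpy as np
from PIL import Image, ImageDraw

def render_row_as_image(row, size=224, cols=2):
    """Draw each tabular feature value as text in its own cell of a
    blank square image, SuperTML-style. Layout choices are ad hoc."""
    img = Image.new('L', (size, size), color=0)
    draw = ImageDraw.Draw(img)
    rows = int(np.ceil(len(row) / cols))
    cell_w, cell_h = size // cols, size // rows
    for k, value in enumerate(row):
        x = (k % cols) * cell_w + 4
        y = (k // cols) * cell_h + 4
        draw.text((x, y), str(value), fill=255)    # default PIL bitmap font
    return img

# Example: one Iris-like row becomes an image a pretrained CNN could classify.
img = render_row_as_image([5.1, 3.5, 1.4, 0.2])
arr = np.asarray(img)      # (224, 224) uint8, ready for a CNN input pipeline
```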
Weighted Linear Bandits for Non-Stationary Environments
Title | Weighted Linear Bandits for Non-Stationary Environments |
Authors | Yoan Russac, Claire Vernade, Olivier Cappé |
Abstract | We consider a stochastic linear bandit model in which the available actions correspond to arbitrary context vectors whose associated rewards follow a non-stationary linear regression model. In this setting, the unknown regression parameter is allowed to vary in time. To address this problem, we propose D-LinUCB, a novel optimistic algorithm based on discounted linear regression, where exponential weights are used to smoothly forget the past. This involves studying the deviations of the sequential weighted least-squares estimator under generic assumptions. As a by-product, we obtain novel deviation results that can be used beyond non-stationary environments. We provide theoretical guarantees on the behavior of D-LinUCB in both slowly-varying and abruptly-changing environments. We obtain an upper bound on the dynamic regret that is of order d^{2/3} B_T^{1/3}T^{2/3}, where B_T is a measure of non-stationarity (d and T being, respectively, dimension and horizon). This rate is known to be optimal. We also illustrate the empirical performance of D-LinUCB and compare it with recently proposed alternatives in simulated environments. |
Tasks | |
Published | 2019-09-19 |
URL | https://arxiv.org/abs/1909.09146v2 |
PDF | https://arxiv.org/pdf/1909.09146v2.pdf |
PWC | https://paperswithcode.com/paper/weighted-linear-bandits-for-non-stationary |
Repo | https://github.com/YRussac/WeightedLinearBandits |
Framework | none |
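A minimal numpy sketch of the discounted least-squares estimator behind D-LinUCB: past statistics are down-weighted by a factor gamma so the estimate can track a drifting parameter, and actions are chosen optimistically. The confidence width below is a generic ridge-style simplification; the paper's exact radius uses a second design matrix with squared weights, which this sketch omits.

```python
import numpy as np

class DLinUCB:
    """Simplified discounted linear UCB: exponential weights forget old
    observations so theta can be tracked in non-stationary environments."""
    def __init__(self, d, gamma=0.99, lam=1.0, alpha=1.0):
        self.d, self.gamma, self.lam, self.alpha = d, gamma, lam, alpha
        self.A = np.zeros((d, d))       # discounted sum of x x^T
        self.b = np.zeros(d)            # discounted sum of r * x

    def select(self, contexts):         # contexts: (n_actions, d)
        V = self.A + self.lam * np.eye(self.d)   # ridge re-added at solve time
        V_inv = np.linalg.inv(V)
        theta = V_inv @ self.b
        width = np.sqrt(np.einsum('ad,de,ae->a', contexts, V_inv, contexts))
        return int(np.argmax(contexts @ theta + self.alpha * width))

    def update(self, x, reward):
        # Smoothly forget the past with exponential weights.
        self.A = self.gamma * self.A + np.outer(x, x)
        self.b = self.gamma * self.b + reward * x

# Toy usage: 3 candidate actions with 5-dimensional context vectors.
bandit = DLinUCB(d=5)
contexts = np.random.randn(3, 5)
a = bandit.select(contexts)
bandit.update(contexts[a], reward=1.0)
```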
ReMASC: Realistic Replay Attack Corpus for Voice Controlled Systems
Title | ReMASC: Realistic Replay Attack Corpus for Voice Controlled Systems |
Authors | Yuan Gong, Jian Yang, Jacob Huber, Mitchell MacKnight, Christian Poellabauer |
Abstract | This paper introduces a new database of voice recordings with the goal of supporting research on vulnerabilities and protection of voice-controlled systems (VCSs). In contrast to prior efforts, the proposed database contains both genuine voice commands and replayed recordings of such commands, collected in realistic VCS usage scenarios and using modern voice assistant development kits. Specifically, the database contains recordings from four systems (each with a different microphone array) in a variety of environmental conditions with different forms of background noise and different relative positions between speaker and device. To the best of our knowledge, this is the first publicly available database that has been specifically designed for the protection of state-of-the-art voice-controlled systems against various replay attacks in various conditions and environments. |
Tasks | |
Published | 2019-04-06 |
URL | https://arxiv.org/abs/1904.03365v2 |
PDF | https://arxiv.org/pdf/1904.03365v2.pdf |
PWC | https://paperswithcode.com/paper/remasc-realistic-replay-attack-corpus-for |
Repo | https://github.com/YuanGongND/ReMASC |
Framework | none |
SC-FEGAN: Face Editing Generative Adversarial Network with User’s Sketch and Color
Title | SC-FEGAN: Face Editing Generative Adversarial Network with User’s Sketch and Color |
Authors | Youngjoo Jo, Jongyoul Park |
Abstract | We present a novel image editing system that generates images as the user provides a free-form mask, sketch, and color as input. Our system consists of an end-to-end trainable convolutional network. Contrary to existing methods, our system wholly utilizes free-form user input with color and shape. This allows the system to respond to the user's sketch and color input, using it as a guideline to generate an image. In our particular work, we trained the network with an additional style loss, which made it possible to generate realistic results despite large portions of the image being removed. Our proposed network architecture, SC-FEGAN, is well suited to generating high-quality synthetic images from intuitive user inputs. |
Tasks | Facial Inpainting, Image Inpainting |
Published | 2019-02-18 |
URL | http://arxiv.org/abs/1902.06838v1 |
PDF | http://arxiv.org/pdf/1902.06838v1.pdf |
PWC | https://paperswithcode.com/paper/sc-fegan-face-editing-generative-adversarial |
Repo | https://github.com/run-youngjoo/SC-FEGAN |
Framework | tf |
Non-Stationary Markov Decision Processes, a Worst-Case Approach using Model-Based Reinforcement Learning, Extended version
Title | Non-Stationary Markov Decision Processes, a Worst-Case Approach using Model-Based Reinforcement Learning, Extended version |
Authors | Erwan Lecarpentier, Emmanuel Rachelson |
Abstract | This work tackles the problem of robust zero-shot planning in non-stationary stochastic environments. We study Markov Decision Processes (MDPs) evolving over time and consider Model-Based Reinforcement Learning algorithms in this setting. We make two hypotheses: 1) the environment evolves continuously with a bounded evolution rate; 2) a current model is known at each decision epoch but not its evolution. Our contribution can be presented in four points. 1) We define a specific class of MDPs that we call Non-Stationary MDPs (NSMDPs), and introduce the notion of regular evolution by making a hypothesis of Lipschitz continuity on the transition and reward functions w.r.t. time; 2) we consider a planning agent using the current model of the environment but unaware of its future evolution, which leads us to a worst-case method where the environment is seen as an adversarial agent; 3) following this approach, we propose the Risk-Averse Tree-Search (RATS) algorithm, a zero-shot model-based method similar to minimax search; 4) we illustrate the benefits brought by RATS empirically and compare its performance with reference model-based algorithms. |
Tasks | |
Published | 2019-04-22 |
URL | https://arxiv.org/abs/1904.10090v4 |
PDF | https://arxiv.org/pdf/1904.10090v4.pdf |
PWC | https://paperswithcode.com/paper/non-stationary-markov-decision-processes-a |
Repo | https://github.com/SuReLI/rats-experiments |
Framework | none |
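A toy numpy sketch of the worst-case idea in RATS: maximize over actions while an adversary picks the least favorable model among a set of candidate future MDPs. The real algorithm searches a tree over a Lipschitz-constrained set of model evolutions; the two-state MDP and the finite candidate set here are illustrative assumptions.

```python
import numpy as np

def worst_case_value(models, rewards, state, depth, gamma=0.95):
    """Max over actions, min over candidate models (the 'adversary'),
    expectation over next states. models: list of arrays P[a, s, s'];
    rewards: array R[a, s]. A toy stand-in for risk-averse tree search."""
    if depth == 0:
        return 0.0
    n_actions, n_states = rewards.shape
    best = -np.inf
    for a in range(n_actions):
        worst = np.inf
        for P in models:     # adversary picks the least favorable evolution
            future = sum(P[a, state, s2] *
                         worst_case_value(models, rewards, s2, depth - 1, gamma)
                         for s2 in range(n_states))
            worst = min(worst, rewards[a, state] + gamma * future)
        best = max(best, worst)
    return best

# Toy NSMDP snapshot: 2 actions, 2 states, two candidate future models.
R = np.array([[1.0, 0.0], [0.5, 0.5]])                 # R[a, s]
P1 = np.array([[[0.9, 0.1], [0.1, 0.9]],
               [[0.5, 0.5], [0.5, 0.5]]])              # P[a, s, s']
P2 = np.array([[[0.1, 0.9], [0.9, 0.1]],
               [[0.5, 0.5], [0.5, 0.5]]])
print(worst_case_value([P1, P2], R, state=0, depth=3))
```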
A Topological Loss Function for Deep-Learning based Image Segmentation using Persistent Homology
Title | A Topological Loss Function for Deep-Learning based Image Segmentation using Persistent Homology |
Authors | James R. Clough, Ilkay Oksuz, Nicholas Byrne, Veronika A. Zimmer, Julia A. Schnabel, Andrew P. King |
Abstract | We introduce a method for training neural networks to perform image or volume segmentation in which prior knowledge about the topology of the segmented object can be explicitly provided and then incorporated into the training process. By using the differentiable properties of persistent homology, a concept used in topological data analysis, we can specify the desired topology of segmented objects in terms of their Betti numbers and then drive the proposed segmentations to contain the specified topological features. Importantly this process does not require any ground-truth labels, just prior knowledge of the topology of the structure being segmented. We demonstrate our approach in three experiments. Firstly we create a synthetic task in which handwritten MNIST digits are de-noised, and show that using this kind of topological prior knowledge in the training of the network significantly improves the quality of the de-noised digits. Secondly we perform an experiment in which the task is segmenting the myocardium of the left ventricle from cardiac magnetic resonance images. We show that the incorporation of the prior knowledge of the topology of this anatomy improves the resulting segmentations in terms of both the topological accuracy and the Dice coefficient. Thirdly, we extend the method to 3D volumes and demonstrate its performance on the task of segmenting the placenta from ultrasound data, again showing that incorporating topological priors improves performance on this challenging task. We find that embedding explicit prior knowledge in neural network segmentation tasks is most beneficial when the segmentation task is especially challenging and that it can be used in either a semi-supervised or post-processing context to extract a useful training gradient from images without pixelwise labels. |
Tasks | Semantic Segmentation, Topological Data Analysis |
Published | 2019-10-04 |
URL | https://arxiv.org/abs/1910.01877v1 |
PDF | https://arxiv.org/pdf/1910.01877v1.pdf |
PWC | https://paperswithcode.com/paper/a-topological-loss-function-for-deep-learning |
Repo | https://github.com/yuki3-18/Topological-DVAE |
Framework | pytorch |
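The topological prior in the abstract is expressed through Betti numbers. The paper's loss relies on differentiable persistent homology; as a much simpler, non-differentiable illustration of what the Betti-0 part of such a prior checks, one can count connected components of a thresholded segmentation with scipy:

```python
import numpy as np
from scipy import ndimage

def betti0_violation(prob_map, expected_components=1, threshold=0.5):
    """Count connected components (Betti number b0) of a thresholded
    segmentation and compare with the topological prior.
    NOTE: a sanity check only; the paper's loss is built from
    differentiable persistent homology so the topology error can be
    backpropagated through the network."""
    binary = prob_map > threshold
    _, num_components = ndimage.label(binary)
    return abs(num_components - expected_components)

# Example: a prediction with two blobs violates a single-component prior.
pred = np.zeros((64, 64))
pred[10:20, 10:20] = 0.9
pred[40:50, 40:50] = 0.9
print(betti0_violation(pred, expected_components=1))   # -> 1
```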
Graph Neural Ordinary Differential Equations
Title | Graph Neural Ordinary Differential Equations |
Authors | Michael Poli, Stefano Massaroli, Junyoung Park, Atsushi Yamashita, Hajime Asama, Jinkyoo Park |
Abstract | We introduce the framework of continuous-depth graph neural networks (GNNs). Graph neural ordinary differential equations (GDEs) are formalized as the counterpart to GNNs where the input-output relationship is determined by a continuum of GNN layers, blending discrete topological structures and differential equations. The proposed framework is shown to be compatible with various static and autoregressive GNN models. Results prove the general effectiveness of GDEs: in static settings they offer computational advantages by incorporating numerical methods in their forward pass; in dynamic settings, on the other hand, they are shown to improve performance by exploiting the geometry of the underlying dynamics. |
Tasks | |
Published | 2019-11-18 |
URL | https://arxiv.org/abs/1911.07532v2 |
PDF | https://arxiv.org/pdf/1911.07532v2.pdf |
PWC | https://paperswithcode.com/paper/graph-neural-ordinary-differential-equations |
Repo | https://github.com/Zymrael/gde |
Framework | pytorch |
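A toy PyTorch sketch of the continuous-depth idea: treat a single graph convolution as the vector field dX/dt = f(X) and integrate it over a depth interval, here with fixed-step Euler rather than an adaptive ODE solver. The row normalization, tanh dynamics and ring graph are illustrative assumptions, not the paper's formulation.

```python
import torch
import torch.nn as nn

class GraphODEFunc(nn.Module):
    """Vector field dX/dt = tanh(A_norm @ X @ W): one GNN 'layer' reused
    as continuous dynamics instead of a stack of discrete layers."""
    def __init__(self, adj, dim):
        super().__init__()
        deg = adj.sum(1, keepdim=True).clamp(min=1)
        self.register_buffer('A', adj / deg)       # row-normalized adjacency
        self.lin = nn.Linear(dim, dim)

    def forward(self, x):
        return torch.tanh(self.A @ self.lin(x))

def euler_integrate(func, x0, t1=1.0, steps=10):
    """Fixed-step Euler solve of dX/dt = func(X) on [0, t1]."""
    x, dt = x0, t1 / steps
    for _ in range(steps):
        x = x + dt * func(x)
    return x

# Toy graph: 4 nodes on a ring, 8-dimensional node features.
adj = torch.tensor([[0, 1, 0, 1], [1, 0, 1, 0],
                    [0, 1, 0, 1], [1, 0, 1, 0]], dtype=torch.float)
func = GraphODEFunc(adj, dim=8)
x1 = euler_integrate(func, torch.randn(4, 8))   # node embeddings at "depth" t = 1
```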
Multigrid Predictive Filter Flow for Unsupervised Learning on Videos
Title | Multigrid Predictive Filter Flow for Unsupervised Learning on Videos |
Authors | Shu Kong, Charless Fowlkes |
Abstract | We introduce multigrid Predictive Filter Flow (mgPFF), a framework for unsupervised learning on videos. The mgPFF takes as input a pair of frames and outputs per-pixel filters to warp one frame to the other. Compared to optical flow used for warping frames, mgPFF is more powerful in modeling sub-pixel movement and dealing with corruption (e.g., motion blur). We develop a multigrid coarse-to-fine modeling strategy that avoids the requirement of learning large filters to capture large displacement. This allows us to train an extremely compact model (4.6MB) which operates in a progressive way over multiple resolutions with shared weights. We train mgPFF on unsupervised, free-form videos and show that mgPFF is not only able to estimate long-range flow for frame reconstruction and detect video shot transitions, but is also readily amenable to video object segmentation and pose tracking, where it substantially outperforms the published state-of-the-art without bells and whistles. Moreover, owing to mgPFF's per-pixel filter prediction, we have the unique opportunity to visualize how each pixel evolves during these tasks, thus gaining better interpretability. |
Tasks | Optical Flow Estimation, Pose Tracking, Semantic Segmentation, Skeleton Based Action Recognition, Video Object Segmentation, Video Semantic Segmentation |
Published | 2019-04-02 |
URL | http://arxiv.org/abs/1904.01693v1 |
PDF | http://arxiv.org/pdf/1904.01693v1.pdf |
PWC | https://paperswithcode.com/paper/multigrid-predictive-filter-flow-for |
Repo | https://github.com/bestaar/predictiveFilterFlow |
Framework | none |
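A sketch of the core warping operation implied by "per-pixel filters to warp one frame to the other": each output pixel is a filter-weighted combination of the k x k neighborhood in the source frame. The filter-predicting network and the multigrid coarse-to-fine scheme are not shown, and the shapes are assumptions.

```python
import torch
import torch.nn.functional as F

def apply_per_pixel_filters(frame, filters, k=5):
    """frame:   (B, C, H, W) source frame
    filters: (B, k*k, H, W) one k x k filter per output pixel
             (e.g. softmax-normalized along dim=1 by the predictor).
    Returns the warped frame: each output pixel is the filter-weighted
    sum of its k x k neighborhood in the source frame."""
    B, C, H, W = frame.shape
    patches = F.unfold(frame, kernel_size=k, padding=k // 2)   # (B, C*k*k, H*W)
    patches = patches.view(B, C, k * k, H, W)
    return (patches * filters.unsqueeze(1)).sum(dim=2)         # (B, C, H, W)

# Toy check: uniform averaging filters behave like a box blur.
frame = torch.rand(1, 3, 32, 32)
filters = torch.full((1, 25, 32, 32), 1.0 / 25)
warped = apply_per_pixel_filters(frame, filters)
```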
Regularizing Neural Networks by Stochastically Training Layer Ensembles
Title | Regularizing Neural Networks by Stochastically Training Layer Ensembles |
Authors | Alex Labach, Shahrokh Valaee |
Abstract | Dropout and similar stochastic neural network regularization methods are often interpreted as implicitly averaging over a large ensemble of models. We propose STE (stochastically trained ensemble) layers, which enhance the averaging properties of such methods by training an ensemble of weight matrices with stochastic regularization while explicitly averaging outputs. This provides stronger regularization with no additional computational cost at test time. We show consistent improvement on various image classification tasks using standard network topologies. |
Tasks | Image Classification |
Published | 2019-11-21 |
URL | https://arxiv.org/abs/1911.09669v1 |
PDF | https://arxiv.org/pdf/1911.09669v1.pdf |
PWC | https://paperswithcode.com/paper/regularizing-neural-networks-by |
Repo | https://github.com/j201/keras-ste-layers |
Framework | none |
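A rough PyTorch reading of the abstract: an ensemble of linear weight matrices, each trained on an independently dropped-out copy of the input, with the member outputs explicitly averaged. This is one plausible interpretation for illustration, not the authors' layer; at test time dropout is the identity, so the averaged linear members can be folded into a single weight matrix, consistent with the "no additional cost at test time" claim.

```python
import torch
import torch.nn as nn

class STELinear(nn.Module):
    """Stochastically trained ensemble of linear weight matrices.
    Training: each member sees its own dropped-out copy of the input and
    the member outputs are averaged.
    Test: dropout is identity, so the layer equals a single linear layer
    with the mean weights (foldable, no extra inference cost).
    Illustrative reading of the abstract, not the paper's code."""
    def __init__(self, in_dim, out_dim, n_members=4, p=0.5):
        super().__init__()
        self.members = nn.ModuleList(
            [nn.Linear(in_dim, out_dim) for _ in range(n_members)])
        self.drop = nn.Dropout(p)

    def forward(self, x):
        outs = [m(self.drop(x)) for m in self.members]   # independent dropout masks
        return torch.stack(outs, dim=0).mean(dim=0)

layer = STELinear(128, 64)
y = layer(torch.randn(32, 128))    # (32, 64)
```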
Weakly Aligned Cross-Modal Learning for Multispectral Pedestrian Detection
Title | Weakly Aligned Cross-Modal Learning for Multispectral Pedestrian Detection |
Authors | Lu Zhang, Xiangyu Zhu, Xiangyu Chen, Xu Yang, Zhen Lei, Zhiyong Liu |
Abstract | Multispectral pedestrian detection has shown great advantages under poor illumination conditions, since the thermal modality provides complementary information to the color image. However, real multispectral data suffers from the position shift problem, i.e., the color-thermal image pairs are not strictly aligned, so that one object has different positions in different modalities. In deep learning based methods, this problem makes it difficult to fuse the feature maps from both modalities and confuses the CNN training. In this paper, we propose a novel Aligned Region CNN (AR-CNN) to handle weakly aligned multispectral data in an end-to-end way. Firstly, we design a Region Feature Alignment (RFA) module to capture the position shift and adaptively align the region features of the two modalities. Secondly, we present a new multimodal fusion method, which performs feature re-weighting to select more reliable features and suppress the useless ones. Besides, we propose a novel RoI jitter strategy to improve robustness to unexpected shift patterns of different devices and system settings. Finally, since our method depends on a new kind of labelling, bounding boxes that match each modality, we manually relabel the KAIST dataset by locating bounding boxes in both modalities and building their relationships, providing a new KAIST-Paired Annotation. Extensive experimental validations on existing datasets are performed, demonstrating the effectiveness and robustness of the proposed method. Code and data are available at https://github.com/luzhang16/AR-CNN. |
Tasks | Pedestrian Detection |
Published | 2019-01-09 |
URL | https://arxiv.org/abs/1901.02645v2 |
PDF | https://arxiv.org/pdf/1901.02645v2.pdf |
PWC | https://paperswithcode.com/paper/the-cross-modality-disparity-problem-in |
Repo | https://github.com/luzhang16/AR-CNN |
Framework | none |
Scalable Topological Data Analysis and Visualization for Evaluating Data-Driven Models in Scientific Applications
Title | Scalable Topological Data Analysis and Visualization for Evaluating Data-Driven Models in Scientific Applications |
Authors | Shusen Liu, Di Wang, Dan Maljovec, Rushil Anirudh, Jayaraman J. Thiagarajan, Sam Ade Jacobs, Brian C. Van Essen, David Hysom, Jae-Seung Yeom, Jim Gaffney, Luc Peterson, Peter B. Robinson, Harsh Bhatia, Valerio Pascucci, Brian K. Spears, Peer-Timo Bremer |
Abstract | With the rapid adoption of machine learning techniques for large-scale applications in science and engineering comes the convergence of two grand challenges in visualization. First, the utilization of black box models (e.g., deep neural networks) calls for advanced techniques in exploring and interpreting model behaviors. Second, the rapid growth in computing has produced enormous datasets that require techniques that can handle millions or more samples. Although some solutions to these interpretability challenges have been proposed, they typically do not scale beyond thousands of samples, nor do they provide the high-level intuition scientists are looking for. Here, we present the first scalable solution to explore and analyze high-dimensional functions often encountered in the scientific data analysis pipeline. By combining a new streaming neighborhood graph construction, the corresponding topology computation, and a novel data aggregation scheme, namely topology aware datacubes, we enable interactive exploration of both the topological and the geometric aspect of high-dimensional data. Following two use cases from high-energy-density (HED) physics and computational biology, we demonstrate how these capabilities have led to crucial new insights in both applications. |
Tasks | graph construction, Topological Data Analysis |
Published | 2019-07-19 |
URL | https://arxiv.org/abs/1907.08325v1 |
PDF | https://arxiv.org/pdf/1907.08325v1.pdf |
PWC | https://paperswithcode.com/paper/scalable-topological-data-analysis-and |
Repo | https://github.com/rushilanirudh/macc |
Framework | tf |
P3O: Policy-on Policy-off Policy Optimization
Title | P3O: Policy-on Policy-off Policy Optimization |
Authors | Rasool Fakoor, Pratik Chaudhari, Alexander J. Smola |
Abstract | On-policy reinforcement learning (RL) algorithms have high sample complexity, while off-policy algorithms are difficult to tune. Merging the two holds the promise of developing efficient algorithms that generalize across diverse environments. It is, however, challenging in practice to find suitable hyper-parameters that govern this trade-off. This paper develops a simple algorithm named P3O that interleaves off-policy updates with on-policy updates. P3O uses the effective sample size between the behavior policy and the target policy to control how far they can be from each other and does not introduce any additional hyper-parameters. Extensive experiments on the Atari-2600 and MuJoCo benchmark suites show that this simple technique is effective in reducing the sample complexity of state-of-the-art algorithms. Code to reproduce experiments in this paper is at https://github.com/rasoolfa/P3O. |
Tasks | |
Published | 2019-05-05 |
URL | https://arxiv.org/abs/1905.01756v2 |
PDF | https://arxiv.org/pdf/1905.01756v2.pdf |
PWC | https://paperswithcode.com/paper/p3o-policy-on-policy-off-policy-optimization |
Repo | https://github.com/rasoolfa/P3O |
Framework | mxnet |
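The quantity doing the work in the abstract is the effective sample size (ESS) of the importance weights between the behavior and target policies. A small numpy sketch of the normalized ESS follows, with a simplified gating rule that scales the off-policy term by it; the coupling in the paper is more specific than this.

```python
import numpy as np

def normalized_ess(target_logp, behavior_logp):
    """Normalized effective sample size of importance weights
    w_i = pi_target(a_i|s_i) / pi_behavior(a_i|s_i), in (0, 1]:
        ESS = (sum w)^2 / (n * sum w^2)
    Values near 1 mean the two policies agree, so off-policy data
    can be trusted more."""
    w = np.exp(target_logp - behavior_logp)
    return (w.sum() ** 2) / (len(w) * (w ** 2).sum())

# Simplified gating: weight the off-policy loss term by the ESS.
target_logp = np.random.normal(-1.0, 0.1, size=256)
behavior_logp = np.random.normal(-1.0, 0.3, size=256)
ess = normalized_ess(target_logp, behavior_logp)
off_policy_loss_weight = ess
print(f"normalized ESS = {ess:.3f}")
```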
The Omniglot challenge: a 3-year progress report
Title | The Omniglot challenge: a 3-year progress report |
Authors | Brenden M. Lake, Ruslan Salakhutdinov, Joshua B. Tenenbaum |
Abstract | Three years ago, we released the Omniglot dataset for one-shot learning, along with five challenge tasks and a computational model that addresses these tasks. The model was not meant to be the final word on Omniglot; we hoped that the community would build on our work and develop new approaches. In the time since, we have been pleased to see wide adoption of the dataset. There has been notable progress on one-shot classification, but researchers have adopted new splits and procedures that make the task easier. There has been less progress on the other four tasks. We conclude that recent approaches are still far from human-like concept learning on Omniglot, a challenge that requires performing many tasks with a single model. |
Tasks | Omniglot, One-Shot Learning |
Published | 2019-02-09 |
URL | https://arxiv.org/abs/1902.03477v2 |
PDF | https://arxiv.org/pdf/1902.03477v2.pdf |
PWC | https://paperswithcode.com/paper/the-omniglot-challenge-a-3-year-progress |
Repo | https://github.com/schatty/matching-networks-tf |
Framework | tf |