January 25, 2020

2976 words 14 mins read

Paper Group NAWR 42

Dynamic Fusion With Intra- and Inter-Modality Attention Flow for Visual Question Answering. RedTyp: A Database of Reduplication with Computational Models. An Encoding Strategy Based Word-Character LSTM for Chinese NER. Progressive Fusion Video Super-Resolution Network via Exploiting Non-Local Spatio-Temporal Correlations. Attentive Feedback Network …

Dynamic Fusion With Intra- and Inter-Modality Attention Flow for Visual Question Answering

Title Dynamic Fusion With Intra- and Inter-Modality Attention Flow for Visual Question Answering
Authors Peng Gao, Zhengkai Jiang, Haoxuan You, Pan Lu, Steven C. H. Hoi, Xiaogang Wang, Hongsheng Li
Abstract Learning effective fusion of multi-modality features is at the heart of visual question answering. We propose a novel method that dynamically fuses multi-modal features with intra- and inter-modality information flow, alternately passing dynamic information between and across the visual and language modalities. It robustly captures high-level interactions between the language and vision domains, thus significantly improving the performance of visual question answering. We also show that the proposed dynamic intra-modality attention flow, conditioned on the other modality, can dynamically modulate the intra-modality attention of the current modality, which is vital for multi-modality feature fusion. Experimental evaluations on the VQA 2.0 dataset show that the proposed method achieves state-of-the-art VQA performance. Extensive ablation studies are carried out for a comprehensive analysis of the proposed method.
Tasks Question Answering, Visual Question Answering
Published 2019-06-01
URL http://openaccess.thecvf.com/content_CVPR_2019/html/Gao_Dynamic_Fusion_With_Intra-_and_Inter-Modality_Attention_Flow_for_Visual_CVPR_2019_paper.html
PDF http://openaccess.thecvf.com/content_CVPR_2019/papers/Gao_Dynamic_Fusion_With_Intra-_and_Inter-Modality_Attention_Flow_for_Visual_CVPR_2019_paper.pdf
PWC https://paperswithcode.com/paper/dynamic-fusion-with-intra-and-inter-modality-1
Repo https://github.com/bupt-cist/DFAF-for-VQA.pytorch
Framework pytorch
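
The entry above describes alternating intra- and inter-modality attention flows. As a rough illustration of that pattern (not the authors' DFAF implementation, and omitting their conditioning of intra-modality attention on the other modality), here is a minimal PyTorch sketch; the dimensions, head counts, and single-block structure are arbitrary choices:

```python
# Minimal sketch of alternating intra-/inter-modality attention.
# Illustrative only: not the authors' DFAF code.
import torch
import torch.nn as nn

class IntraInterBlock(nn.Module):
    def __init__(self, dim=512, heads=8):
        super().__init__()
        self.intra_v = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.intra_q = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.inter_v = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.inter_q = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, vis, lang):
        # Inter-modality flow: each modality attends to the other.
        vis = vis + self.inter_v(vis, lang, lang)[0]
        lang = lang + self.inter_q(lang, vis, vis)[0]
        # Intra-modality flow: self-attention within each modality.
        vis = vis + self.intra_v(vis, vis, vis)[0]
        lang = lang + self.intra_q(lang, lang, lang)[0]
        return vis, lang

block = IntraInterBlock()
vis = torch.randn(2, 36, 512)   # 36 region features per image
lang = torch.randn(2, 14, 512)  # 14 token features per question
vis, lang = block(vis, lang)
print(vis.shape, lang.shape)
```

Stacking several such blocks approximates the repeated information passing the abstract describes.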

RedTyp: A Database of Reduplication with Computational Models

Title RedTyp: A Database of Reduplication with Computational Models
Authors Hossep Dolatian, Jeffrey Heinz
Abstract
Tasks
Published 2019-01-01
URL https://www.aclweb.org/anthology/W19-0102/
PDF https://www.aclweb.org/anthology/W19-0102
PWC https://paperswithcode.com/paper/redtyp-a-database-of-reduplication-with
Repo https://github.com/jhdeov/RedTyp
Framework none

An Encoding Strategy Based Word-Character LSTM for Chinese NER

Title An Encoding Strategy Based Word-Character LSTM for Chinese NER
Authors Wei Liu, Tongge Xu, Qinghua Xu, Jiayu Song, Yueran Zu
Abstract A recently proposed lattice model has demonstrated that words in a character sequence can provide rich word-boundary information for character-based Chinese NER models. In this model, word information is integrated into a shortcut path between the start and end characters of the word. However, the existence of this shortcut path may cause the model to degenerate into a partially word-based model that suffers from word segmentation errors. Furthermore, the lattice model cannot be trained in batches due to its DAG structure. In this paper, we propose a novel word-character LSTM (WC-LSTM) model that adds word information to the start or end character of the word, alleviating the influence of word segmentation errors while retaining word-boundary information. Four different strategies are explored in our model to encode word information into a fixed-sized representation for efficient batch training. Experiments on benchmark datasets show that our proposed model outperforms other state-of-the-art models.
Tasks
Published 2019-06-01
URL https://www.aclweb.org/anthology/N19-1247/
PDF https://www.aclweb.org/anthology/N19-1247
PWC https://paperswithcode.com/paper/an-encoding-strategy-based-word-character
Repo https://github.com/liuwei1206/CCW-NER
Framework pytorch
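
The abstract mentions four strategies for encoding matched words into a fixed-size vector per character. Below is a toy sketch of one plausible strategy, averaging the embeddings of lexicon words that end at each character; the lexicon, embeddings, and dimensionality are invented for illustration, not the authors' setup:

```python
# Toy "average" word-encoding strategy: collapse all lexicon words
# ending at a character into one fixed-size vector, enabling batching.
import numpy as np

rng = np.random.default_rng(0)
dim = 8
char_emb = {c: rng.normal(size=dim) for c in "南京市长江大桥"}
word_emb = {w: rng.normal(size=dim) for w in ["南京", "南京市", "市长", "长江", "大桥"]}

def encode(sentence, lexicon):
    """For each character, average embeddings of lexicon words ending there."""
    reps = []
    for i, ch in enumerate(sentence):
        ending = [w for w in lexicon if sentence[:i + 1].endswith(w)]
        if ending:
            w_vec = np.mean([word_emb[w] for w in ending], axis=0)
        else:
            w_vec = np.zeros(dim)  # fixed size even with no word match
        reps.append(np.concatenate([char_emb[ch], w_vec]))
    return np.stack(reps)  # (seq_len, 2 * dim), ready for a batched LSTM

print(encode("南京市长江大桥", word_emb).shape)  # (7, 16)
```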

Progressive Fusion Video Super-Resolution Network via Exploiting Non-Local Spatio-Temporal Correlations

Title Progressive Fusion Video Super-Resolution Network via Exploiting Non-Local Spatio-Temporal Correlations
Authors Peng Yi, Zhongyuan Wang, Kui Jiang, Junjun Jiang, Jiayi Ma
Abstract Most previous fusion strategies either fail to fully utilize temporal information or cost too much time, and how to effectively fuse temporal information from consecutive frames plays an important role in video super-resolution (SR). In this study, we propose a novel progressive fusion network for video SR, designed to make better use of spatio-temporal information, and show it to be more efficient and effective than the existing direct-fusion, slow-fusion, and 3D-convolution strategies. Under this progressive fusion framework, we further introduce an improved non-local operation to avoid the complex motion estimation and motion compensation (ME&MC) procedures of previous video SR approaches. Extensive experiments on public datasets demonstrate that our method surpasses the state of the art by 0.96 dB on average and runs about 3 times faster, while requiring only about half the parameters.
Tasks Motion Compensation, Motion Estimation, Super-Resolution, Video Super-Resolution
Published 2019-10-01
URL http://openaccess.thecvf.com/content_ICCV_2019/html/Yi_Progressive_Fusion_Video_Super-Resolution_Network_via_Exploiting_Non-Local_Spatio-Temporal_Correlations_ICCV_2019_paper.html
PDF http://openaccess.thecvf.com/content_ICCV_2019/papers/Yi_Progressive_Fusion_Video_Super-Resolution_Network_via_Exploiting_Non-Local_Spatio-Temporal_Correlations_ICCV_2019_paper.pdf
PWC https://paperswithcode.com/paper/progressive-fusion-video-super-resolution
Repo https://github.com/psychopa4/PFNL
Framework tf
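
The paper's improved non-local operation builds on the standard non-local block. For reference, here is a generic embedded-Gaussian non-local block in PyTorch (the classic Wang et al. formulation, not the paper's improved variant; channel sizes are illustrative):

```python
# Generic embedded-Gaussian non-local block with a residual connection.
import torch
import torch.nn as nn
import torch.nn.functional as F

class NonLocalBlock(nn.Module):
    def __init__(self, c):
        super().__init__()
        self.theta = nn.Conv2d(c, c // 2, 1)
        self.phi = nn.Conv2d(c, c // 2, 1)
        self.g = nn.Conv2d(c, c // 2, 1)
        self.out = nn.Conv2d(c // 2, c, 1)

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.theta(x).flatten(2).transpose(1, 2)   # (b, hw, c/2)
        k = self.phi(x).flatten(2)                     # (b, c/2, hw)
        v = self.g(x).flatten(2).transpose(1, 2)       # (b, hw, c/2)
        attn = F.softmax(q @ k, dim=-1)                # pairwise affinities
        y = (attn @ v).transpose(1, 2).reshape(b, c // 2, h, w)
        return x + self.out(y)                         # residual connection

x = torch.randn(1, 64, 32, 32)
print(NonLocalBlock(64)(x).shape)  # torch.Size([1, 64, 32, 32])
```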

Attentive Feedback Network for Boundary-Aware Salient Object Detection

Title Attentive Feedback Network for Boundary-Aware Salient Object Detection
Authors Mengyang Feng, Huchuan Lu, Errui Ding
Abstract Recent deep learning based salient object detection methods achieve gratifying performance built upon Fully Convolutional Neural Networks (FCNs). However, most of them suffer from the boundary challenge. The state-of-the-art methods employ feature aggregation techniques and can precisely locate where the salient object is, but they often fail to segment out the entire object with fine boundaries, especially raised narrow stripes, so there is still large room for improvement over FCN-based models. In this paper, we design Attentive Feedback Modules (AFMs) to better explore the structure of objects. A Boundary-Enhanced Loss (BEL) is further employed for learning exquisite boundaries. Our proposed deep model produces satisfying results on object boundaries and achieves state-of-the-art performance on five widely tested salient object detection benchmarks. The network is fully convolutional, runs at 26 FPS, and needs no post-processing.
Tasks Object Detection, Salient Object Detection
Published 2019-06-01
URL http://openaccess.thecvf.com/content_CVPR_2019/html/Feng_Attentive_Feedback_Network_for_Boundary-Aware_Salient_Object_Detection_CVPR_2019_paper.html
PDF http://openaccess.thecvf.com/content_CVPR_2019/papers/Feng_Attentive_Feedback_Network_for_Boundary-Aware_Salient_Object_Detection_CVPR_2019_paper.pdf
PWC https://paperswithcode.com/paper/attentive-feedback-network-for-boundary-aware
Repo https://github.com/ArcherFMY/AFNet
Framework none
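
The abstract names a Boundary-Enhanced Loss without specifying it. A hedged sketch of one common way to realize the idea follows: extract the ground-truth boundary with a morphological gradient and up-weight the loss there (the 3x3 kernel and weight are illustrative assumptions, not the paper's exact formulation):

```python
# Hedged sketch of a boundary-weighted loss in the spirit of the BEL.
import torch
import torch.nn.functional as F

def boundary_enhanced_loss(pred, mask, w=5.0):
    """pred: (b,1,h,w) logits; mask: (b,1,h,w) binary saliency GT."""
    dilated = F.max_pool2d(mask, 3, stride=1, padding=1)
    eroded = -F.max_pool2d(-mask, 3, stride=1, padding=1)
    boundary = dilated - eroded          # 1 on GT boundary pixels
    bce = F.binary_cross_entropy_with_logits(pred, mask, reduction="none")
    return (bce * (1 + w * boundary)).mean()

pred = torch.randn(2, 1, 64, 64)
mask = (torch.rand(2, 1, 64, 64) > 0.5).float()
print(boundary_enhanced_loss(pred, mask).item())
```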

Superset Technique for Approximate Recovery in One-Bit Compressed Sensing

Title Superset Technique for Approximate Recovery in One-Bit Compressed Sensing
Authors Larkin Flodin, Venkata Gandikota, Arya Mazumdar
Abstract One-bit compressed sensing (1bCS) is a method of signal acquisition under extreme measurement quantization that gives important insights on the limits of signal compression and analog-to-digital conversion. The setting is also equivalent to the problem of learning a sparse hyperplane classifier. In this paper, we propose a generic approach for signal recovery in nonadaptive 1bCS that leads to improved sample complexity for approximate recovery for a variety of signal models, including nonnegative signals and binary signals. We construct 1bCS matrices that are universal - i.e., they work for all signals under a model - and at the same time recover very general random sparse signals with high probability. In our approach, we divide the set of samples (measurements) into two parts, and use the first part to recover a superset of the support of a sparse vector. The second set of measurements is then used to approximate the signal within the superset. While support recovery in 1bCS is well studied, recovery of a superset of the support requires fewer samples, which then leads to an overall reduction in sample complexity for approximate recovery.
Tasks Quantization
Published 2019-12-01
URL http://papers.nips.cc/paper/9226-superset-technique-for-approximate-recovery-in-one-bit-compressed-sensing
PDF http://papers.nips.cc/paper/9226-superset-technique-for-approximate-recovery-in-one-bit-compressed-sensing.pdf
PWC https://paperswithcode.com/paper/superset-technique-for-approximate-recovery
Repo https://github.com/flodinl/neurips-1bCS
Framework none
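
The two-stage recipe (recover a superset of the support, then estimate within it) can be simulated end to end for a nonnegative signal. The sketch below uses group-testing-style 0/1 measurements for stage one and a simple sign-weighted averaging estimator for stage two; both are illustrative stand-ins, not the constructions from the paper:

```python
# Toy two-stage superset recovery for a nonnegative sparse signal.
import numpy as np

rng = np.random.default_rng(1)
n, k = 200, 5
x = np.zeros(n); x[rng.choice(n, k, replace=False)] = rng.random(k) + 0.5
x /= np.linalg.norm(x)

# Stage 1: random 0/1 rows; a zero outcome rules out every index it covers.
A1 = rng.random((150, n)) < 1.0 / k
y1 = (A1 @ x > 0)
excluded = A1[~y1].any(axis=0)
superset = ~excluded
print("superset size:", superset.sum(), "covers support:",
      set(np.flatnonzero(x)) <= set(np.flatnonzero(superset)))

# Stage 2: Gaussian sign measurements, estimate direction on the superset.
A2 = rng.standard_normal((300, n))
y2 = np.sign(A2 @ x)
est = (y2 @ A2) * superset
est /= np.linalg.norm(est)
print("cosine similarity:", float(est @ x))
```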

Learning Neural Representations for Network Anomaly Detection

Title Learning Neural Representations for Network Anomaly Detection
Authors Van Loi Cao, Miguel Nicolau, James McDermott
Abstract This paper proposes latent representation models for improving network anomaly detection. Well-known anomaly detection algorithms often suffer from challenges posed by network data, such as high dimension and sparsity, and from a lack of anomaly data for training, model selection, and hyperparameter tuning. Our approach is to introduce new regularizers to a classical autoencoder (AE) and a variational AE, which force normal data into a very tight area centered at the origin in the non-saturating area of the bottleneck unit activations. Trained on normal data, these AEs push normal points toward the origin, whereas anomalies, which differ from normal data, are placed far away from the normal region. The models are very different from common regularized AEs, such as sparse AEs and contractive AEs, in which the regularization tends to make the latent representation less sensitive to changes in the input data. The bottleneck feature space is then used as a new data representation. A number of one-class learning algorithms are used to evaluate the proposed models. The experiments show that our models help these classifiers perform efficiently and consistently on high-dimensional and sparse network datasets, even with relatively few training points. More importantly, the models can minimize the effect of model selection on these classifiers, since their performance is insensitive to a wide range of hyperparameter settings.
Tasks Anomaly Detection, Intrusion Detection, Model Selection, Unsupervised Anomaly Detection
Published 2019-08-01
URL https://ieeexplore.ieee.org/abstract/document/8386786
PDF https://ieeexplore.ieee.org/abstract/document/8386786
PWC https://paperswithcode.com/paper/learning-neural-representations-for-network
Repo https://github.com/vanloicao/SAEDVAE
Framework tf
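
A minimal PyTorch sketch of a shrink-style regularizer in this spirit follows: the training loss is reconstruction error plus a penalty on the latent norm, pulling codes of normal data toward the origin (the architecture, the 41-feature input, and the weight 10.0 are assumptions, not the paper's settings):

```python
# Autoencoder with a latent-shrink regularizer; anomalies score high ||z||.
import torch
import torch.nn as nn

class ShrinkAE(nn.Module):
    def __init__(self, d_in=41, d_z=4):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(d_in, 16), nn.Tanh(),
                                 nn.Linear(16, d_z), nn.Tanh())
        self.dec = nn.Sequential(nn.Linear(d_z, 16), nn.Tanh(),
                                 nn.Linear(16, d_in))

    def forward(self, x):
        z = self.enc(x)
        return self.dec(z), z

model = ShrinkAE()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.randn(64, 41)  # stand-in for normal network records
for _ in range(5):
    recon, z = model(x)
    loss = ((recon - x) ** 2).mean() + 10.0 * (z ** 2).mean()  # shrink term
    opt.zero_grad(); loss.backward(); opt.step()
# At test time, the latent norm (or a one-class model on z) scores anomalies.
print(model(x)[1].norm(dim=1).mean().item())
```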

DF2Net: A Dense-Fine-Finer Network for Detailed 3D Face Reconstruction

Title DF2Net: A Dense-Fine-Finer Network for Detailed 3D Face Reconstruction
Authors Xiaoxing Zeng, Xiaojiang Peng, Yu Qiao
Abstract Reconstructing the detailed geometric structure from a single face image is a challenging problem due to its ill-posed nature and the fine 3D structures to be recovered. This paper proposes a deep Dense-Fine-Finer Network (DF2Net) to address this challenging problem. DF2Net decomposes the reconstruction process into three stages, each of which is processed by an elaborately designed network, namely D-Net, F-Net, and Fr-Net. D-Net exploits a U-net architecture to map the input image to a dense depth image. F-Net refines the output of D-Net by integrating features from the depth and RGB domains, and its output is further enhanced by Fr-Net with a novel multi-resolution hypercolumn architecture. In addition, we introduce three types of data to train these networks, including 3D-model synthetic data, 2D-image reconstructed data, and fine facial images. We elaborately exploit different datasets (or combinations of them) together with well-designed losses to train the different networks. Qualitative evaluation indicates that our DF2Net can effectively reconstruct subtle facial details such as small crow's feet and wrinkles. DF2Net achieves performance superior or comparable to state-of-the-art algorithms in qualitative and quantitative analyses on real-world images and the BU-3DFE dataset. Code and the collected 70K image-depth data will be publicly available.
Tasks 3D Face Reconstruction, Face Reconstruction
Published 2019-10-01
URL http://openaccess.thecvf.com/content_ICCV_2019/html/Zeng_DF2Net_A_Dense-Fine-Finer_Network_for_Detailed_3D_Face_Reconstruction_ICCV_2019_paper.html
PDF http://openaccess.thecvf.com/content_ICCV_2019/papers/Zeng_DF2Net_A_Dense-Fine-Finer_Network_for_Detailed_3D_Face_Reconstruction_ICCV_2019_paper.pdf
PWC https://paperswithcode.com/paper/df2net-a-dense-fine-finer-network-for
Repo https://github.com/xiaoxingzeng/DF2Net
Framework pytorch
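
The dense-fine-finer cascade can be pictured as three refinement stages, each consuming the image plus the previous depth estimate. In the sketch below, tiny convolution stacks stand in for the real D-Net, F-Net, and Fr-Net, which are far larger and architecturally distinct:

```python
# Coarse sketch of a three-stage depth-refinement cascade.
import torch
import torch.nn as nn

def conv_stack(c_in, c_out):
    return nn.Sequential(nn.Conv2d(c_in, 32, 3, padding=1), nn.ReLU(),
                         nn.Conv2d(32, c_out, 3, padding=1))

class CascadeSketch(nn.Module):
    def __init__(self):
        super().__init__()
        self.d_net = conv_stack(3, 1)   # RGB -> dense depth
        self.f_net = conv_stack(4, 1)   # RGB + depth -> refined depth
        self.fr_net = conv_stack(4, 1)  # RGB + refined depth -> finer depth

    def forward(self, img):
        d = self.d_net(img)
        f = d + self.f_net(torch.cat([img, d], dim=1))
        return f + self.fr_net(torch.cat([img, f], dim=1))

print(CascadeSketch()(torch.randn(1, 3, 128, 128)).shape)
```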

Bidirectional Transition-Based Dependency Parsing

Title Bidirectional Transition-Based Dependency Parsing
Authors Yunzhe Yuan, Yong Jiang, Kewei Tu
Abstract Transition-based dependency parsing is a fast and effective approach to dependency parsing. Traditionally, a transition-based dependency parser processes an input sentence and predicts a sequence of parsing actions in a left-to-right manner. During this process, an early prediction error may negatively impact the prediction of subsequent actions. In this paper, we propose a simple framework for bidirectional transition-based parsing. During training, we learn a left-to-right parser and a right-to-left parser separately. To parse a sentence, we perform joint decoding with the two parsers. We propose three joint decoding algorithms based on joint scoring, dual decomposition, and dynamic oracles, respectively. Empirical results show that our methods lead to competitive parsing accuracy, and our method based on dynamic oracles consistently achieves the best performance.
Tasks Dependency Parsing, Transition-Based Dependency Parsing
Published 2019-07-17
URL https://aaai.org/ojs/index.php/AAAI/article/view/4733
PDF https://aaai.org/ojs/index.php/AAAI/article/view/4733/4611
PWC https://paperswithcode.com/paper/bidirectional-transition-based-dependency
Repo https://github.com/yuanyunzhe/bi-trans-parser
Framework none
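
Of the three decoding algorithms, joint scoring is the simplest to picture: each direction proposes a tree, both trees are re-scored under both models, and the best combined score wins. A toy sketch with hypothetical parser interfaces (stand-ins, not the paper's models):

```python
# Toy joint-scoring decoder over a left-to-right and right-to-left parser.
def joint_score_decode(l2r, r2l, sentence):
    candidates = [l2r.parse(sentence), r2l.parse(sentence)]
    return max(candidates, key=lambda t: l2r.score(t) + r2l.score(t))

class ToyParser:
    def __init__(self, tree, scores):
        self._tree, self._scores = tree, scores
    def parse(self, sentence):
        return self._tree
    def score(self, tree):
        return self._scores[tree]

# Two toy parsers that disagree; joint scoring picks the tree both prefer.
t1, t2 = "tree-A", "tree-B"
l2r = ToyParser(t1, {t1: 0.9, t2: 0.8})
r2l = ToyParser(t2, {t1: 0.7, t2: 0.6})
print(joint_score_decode(l2r, r2l, "a sentence"))  # tree-A (0.9 + 0.7)
```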

Reconciling λ-Returns with Experience Replay

Title Reconciling λ-Returns with Experience Replay
Authors Brett Daley, Christopher Amato
Abstract Modern deep reinforcement learning methods have departed from the incremental learning required for eligibility traces, rendering the implementation of the λ-return difficult in this context. In particular, off-policy methods that utilize experience replay remain problematic because their random sampling of minibatches is not conducive to the efficient calculation of λ-returns. Yet replay-based methods are often the most sample efficient, and incorporating λ-returns into them is a viable way to achieve new state-of-the-art performance. Towards this, we propose the first method to enable practical use of λ-returns in arbitrary replay-based methods without relying on other forms of decorrelation such as asynchronous gradient updates. By promoting short sequences of past transitions into a small cache within the replay memory, adjacent λ-returns can be efficiently precomputed by sharing Q-values. Computation is not wasted on experiences that are never sampled, and stored λ-returns behave as stable temporal-difference (TD) targets that replace the target network. Additionally, our method grants the unique ability to observe TD errors prior to sampling; for the first time, transitions can be prioritized by their true significance rather than by a proxy to it. Furthermore, we propose the novel use of the TD error to dynamically select λ-values that facilitate faster learning. We show that these innovations can enhance the performance of DQN when playing Atari 2600 games, even under partial observability. While our work specifically focuses on λ-returns, these ideas are applicable to any multi-step return estimator.
Tasks Atari Games
Published 2019-12-01
URL http://papers.nips.cc/paper/8397-reconciling-returns-with-experience-replay
PDF http://papers.nips.cc/paper/8397-reconciling-returns-with-experience-replay.pdf
PWC https://paperswithcode.com/paper/reconciling-returns-with-experience-replay
Repo https://github.com/brett-daley/dqn-lambda
Framework tf
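
Within a cached block of consecutive transitions, λ-returns can be precomputed in one backward pass over shared Q-values, following the standard recurrence G_t = r_t + γ[(1-λ) max_a Q(s_{t+1}, a) + λ G_{t+1}]. A sketch of that recurrence follows (the replay-cache machinery itself is not shown):

```python
# Backward recursion for lambda-returns over a block of transitions.
import numpy as np

def lambda_returns(rewards, q_next_max, dones, gamma=0.99, lam=0.8):
    G = np.empty_like(rewards)
    G[-1] = rewards[-1] + gamma * (1 - dones[-1]) * q_next_max[-1]
    for t in range(len(rewards) - 2, -1, -1):
        bootstrap = (1 - lam) * q_next_max[t] + lam * G[t + 1]
        G[t] = rewards[t] + gamma * (1 - dones[t]) * bootstrap
    return G

rewards = np.array([0.0, 0.0, 1.0])
q_next_max = np.array([0.5, 0.7, 0.0])  # shared Q-values, computed once
dones = np.array([0.0, 0.0, 1.0])
print(lambda_returns(rewards, q_next_max, dones))
```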

Learning an Event Sequence Embedding for Dense Event-Based Deep Stereo

Title Learning an Event Sequence Embedding for Dense Event-Based Deep Stereo
Authors Stepan Tulyakov, Francois Fleuret, Martin Kiefel, Peter Gehler, Michael Hirsch
Abstract Today, the frame-based camera is the sensor of choice for machine vision applications. However, these cameras, originally developed for acquiring static images rather than for sensing dynamic, uncontrolled visual environments, suffer from high power consumption, high data rates, high latency, and low dynamic range. An event-based image sensor addresses these drawbacks by mimicking a biological retina. Instead of measuring the intensity of every pixel in a fixed time interval, it reports events of significant pixel intensity change. Every such event is represented by its position, sign of change, and timestamp, accurate to the microsecond. Asynchronous event sequences require special handling, since traditional algorithms work only with synchronous, spatially gridded data. To address this problem we introduce a new module for event sequence embedding, for use in different applications. The module builds a representation of an event sequence by first aggregating information locally across time, using a novel fully-connected layer for an irregularly sampled continuous domain, and then across the discrete spatial domain. Based on this module, we design a deep learning-based stereo method for event-based cameras. The proposed method is the first learning-based stereo method for an event-based camera and the only method that produces dense results. We show large performance increases on the Multi Vehicle Stereo Event Camera Dataset (MVSEC), which has become the standard benchmark for event-based stereo methods.
Tasks
Published 2019-10-01
URL http://openaccess.thecvf.com/content_ICCV_2019/html/Tulyakov_Learning_an_Event_Sequence_Embedding_for_Dense_Event-Based_Deep_Stereo_ICCV_2019_paper.html
PDF http://openaccess.thecvf.com/content_ICCV_2019/papers/Tulyakov_Learning_an_Event_Sequence_Embedding_for_Dense_Event-Based_Deep_Stereo_ICCV_2019_paper.pdf
PWC https://paperswithcode.com/paper/learning-an-event-sequence-embedding-for
Repo https://github.com/tlkvstepan/event_stereo_ICCV2019
Framework pytorch
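
A crude sketch of the temporal-aggregation idea: embed each event's timestamp and polarity with a small MLP, then scatter-add the embeddings onto the pixel grid, producing a dense tensor a conventional stereo network can consume (the MLP sizes and sum aggregation are assumptions, not the paper's learned continuous-domain layer):

```python
# Events (x, y, t, polarity) -> dense per-pixel embedding grid.
import torch
import torch.nn as nn

h, w, c = 32, 32, 8
mlp = nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, c))

events = torch.rand(500, 4)                 # columns: x, y, t, polarity
xs = (events[:, 0] * (w - 1)).long()
ys = (events[:, 1] * (h - 1)).long()
feats = mlp(events[:, 2:])                  # (500, c) per-event embedding

grid = torch.zeros(h * w, c).index_add(0, ys * w + xs, feats)
grid = grid.view(h, w, c).permute(2, 0, 1)  # (c, h, w) dense embedding
print(grid.shape)
```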

Offline Contextual Bandits with High Probability Fairness Guarantees

Title Offline Contextual Bandits with High Probability Fairness Guarantees
Authors Blossom Metevier, Stephen Giguere, Sarah Brockman, Ari Kobren, Yuriy Brun, Emma Brunskill, Philip S. Thomas
Abstract We present RobinHood, an offline contextual bandit algorithm designed to satisfy a broad family of fairness constraints. Our algorithm accepts multiple fairness definitions and allows users to construct their own unique fairness definitions for the problem at hand. We provide a theoretical analysis of RobinHood, which includes a proof that it will not return an unfair solution with probability greater than a user-specified threshold. We validate our algorithm on three applications: a tutoring system in which we conduct a user study and consider multiple unique fairness definitions; a loan approval setting (using the Statlog German credit data set) in which well-known fairness definitions are applied; and criminal recidivism (using data released by ProPublica). In each setting, our algorithm is able to produce fair policies that achieve performance competitive with other offline and online contextual bandit algorithms.
Tasks Multi-Armed Bandits
Published 2019-12-01
URL http://papers.nips.cc/paper/9630-offline-contextual-bandits-with-high-probability-fairness-guarantees
PDF http://papers.nips.cc/paper/9630-offline-contextual-bandits-with-high-probability-fairness-guarantees.pdf
PWC https://paperswithcode.com/paper/offline-contextual-bandits-with-high
Repo https://github.com/sgiguere/RobinHood-NeurIPS-2019
Framework none
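
The high-probability guarantee rests on a Seldonian-style safety test: the candidate policy is returned only if a high-confidence upper bound on its fairness violation is below zero, and otherwise "No Solution Found" is returned. A sketch with a Hoeffding bound follows (the statistic g and the bound are illustrative; the paper supports user-defined constraints and tighter bounds):

```python
# Safety test: accept a candidate only if the fairness-violation
# upper bound (Hoeffding) is non-positive with confidence 1 - delta.
import numpy as np

def safety_test(g_samples, delta=0.05, b=1.0):
    """g_samples: per-episode violation estimates; g > 0 means unfair."""
    n = len(g_samples)
    upper = np.mean(g_samples) + b * np.sqrt(np.log(1 / delta) / (2 * n))
    return upper <= 0.0  # True: safe to return the candidate policy

rng = np.random.default_rng(0)
candidate_ok = safety_test(rng.uniform(-0.4, 0.1, size=2000))
print("return candidate" if candidate_ok else "No Solution Found")
```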

BehaveNet: nonlinear embedding and Bayesian neural decoding of behavioral videos

Title BehaveNet: nonlinear embedding and Bayesian neural decoding of behavioral videos
Authors Eleanor Batty, Matthew Whiteway, Shreya Saxena, Dan Biderman, Taiga Abe, Simon Musall, Winthrop Gillis, Jeffrey Markowitz, Anne Churchland, John P. Cunningham, Sandeep R. Datta, Scott Linderman, Liam Paninski
Abstract A fundamental goal of systems neuroscience is to understand the relationship between neural activity and behavior. Behavior has traditionally been characterized by low-dimensional, task-related variables such as movement speed or response times. More recently, there has been a growing interest in automated analysis of high-dimensional video data collected during experiments. Here we introduce a probabilistic framework for the analysis of behavioral video and neural activity. This framework provides tools for compression, segmentation, generation, and decoding of behavioral videos. Compression is performed using a convolutional autoencoder (CAE), which yields a low-dimensional continuous representation of behavior. We then use an autoregressive hidden Markov model (ARHMM) to segment the CAE representation into discrete “behavioral syllables.” The resulting generative model can be used to simulate behavioral video data. Finally, based on this generative model, we develop a novel Bayesian decoding approach that takes in neural activity and outputs probabilistic estimates of the full-resolution behavioral video. We demonstrate this framework on two different experimental paradigms using distinct behavioral and neural recording technologies.
Tasks
Published 2019-12-01
URL http://papers.nips.cc/paper/9701-behavenet-nonlinear-embedding-and-bayesian-neural-decoding-of-behavioral-videos
PDF http://papers.nips.cc/paper/9701-behavenet-nonlinear-embedding-and-bayesian-neural-decoding-of-behavioral-videos.pdf
PWC https://paperswithcode.com/paper/behavenet-nonlinear-embedding-and-bayesian
Repo https://github.com/ebatty/behavenet
Framework pytorch
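
The compression stage is a convolutional autoencoder that maps each behavioral video frame to a low-dimensional continuous code. A compact sketch follows (the latent size and architecture are arbitrary choices, and the downstream ARHMM segmentation and Bayesian decoder are not shown):

```python
# Small convolutional autoencoder: frame -> low-d code -> reconstruction.
import torch
import torch.nn as nn

class CAE(nn.Module):
    def __init__(self, n_latents=9):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv2d(1, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Flatten(), nn.Linear(32 * 16 * 16, n_latents))
        self.dec = nn.Sequential(
            nn.Linear(n_latents, 32 * 16 * 16), nn.Unflatten(1, (32, 16, 16)),
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 4, stride=2, padding=1))

    def forward(self, frames):
        z = self.enc(frames)          # low-d behavioral code per frame
        return self.dec(z), z

frames = torch.rand(8, 1, 64, 64)     # grayscale behavioral video frames
recon, z = CAE()(frames)
print(recon.shape, z.shape)
```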

Graph-based Dependency Parsing with Graph Neural Networks

Title Graph-based Dependency Parsing with Graph Neural Networks
Authors Tao Ji, Yuanbin Wu, Man Lan
Abstract We investigate the problem of efficiently incorporating high-order features into neural graph-based dependency parsing. Instead of explicitly extracting high-order features from intermediate parse trees, we develop a more powerful dependency tree node representation which captures high-order information concisely and efficiently. We use graph neural networks (GNNs) to learn the representations and discuss several new configurations of GNN's updating and aggregation functions. Experiments on PTB show that our parser achieves the best UAS and LAS on PTB (96.0%, 94.3%) among systems without using any external resources.
Tasks Dependency Parsing
Published 2019-07-01
URL https://www.aclweb.org/anthology/P19-1237/
PDF https://www.aclweb.org/anthology/P19-1237
PWC https://paperswithcode.com/paper/graph-based-dependency-parsing-with-graph
Repo https://github.com/AntNLP/gnn-dep-parsing
Framework none
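
One way to picture a GNN update in this setting: intermediate arc scores define a soft adjacency over words, and each word aggregates information from its likely heads. A single-layer sketch with random scores follows (the real model's update and aggregation functions are richer):

```python
# One soft-adjacency GNN update over word representations.
import torch
import torch.nn as nn
import torch.nn.functional as F

n_words, dim = 10, 64
H = torch.randn(n_words, dim)                # word representations
arc_scores = torch.randn(n_words, n_words)   # score[i, j]: j heads i
W = nn.Linear(dim, dim)

A = F.softmax(arc_scores, dim=-1)            # soft head distribution
H_new = F.relu(W(A @ H)) + H                 # aggregate from likely heads
print(H_new.shape)
```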

Molecule Property Prediction Based on Spatial Graph Embedding

Title Molecule Property Prediction Based on Spatial Graph Embedding
Authors Xiaofeng Wang, Zhen Li, Mingjian Jiang, Shuang Wang, Shugang Zhang, Zhiqiang Wei
Abstract Accurate prediction of molecular properties is important for the design of new compounds, a crucial step in drug discovery. In this paper, molecular graph data is used for property prediction with graph convolutional neural networks. A convolution spatial graph embedding layer (C-SGEL) is introduced to retain the spatial connectivity information of molecules, and multiple C-SGELs are stacked to construct a convolution spatial graph embedding network (C-SGEN) for end-to-end representation learning. To enhance the robustness of the network, molecular fingerprints are also combined with C-SGEN to build a composite model for predicting molecular properties. Comparative experiments show that our method is accurate and achieves the best results on several open benchmark datasets.
Tasks Drug Discovery, Graph Embedding, Graph Regression, Representation Learning
Published 2019-08-22
URL https://doi.org/10.1021/acs.jcim.9b00410
PDF https://pubs.acs.org/doi/pdf/10.1021/acs.jcim.9b00410?rand=oin4mnup
PWC https://paperswithcode.com/paper/molecule-property-prediction-based-on-spatial
Repo https://github.com/wxfsd/C-SGEN
Framework pytorch
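
A C-SGEL-style layer updates atom features from bonded neighbors through the molecular adjacency matrix. Below is a generic graph-convolution step of that kind (the symmetric normalization and feature sizes are standard GCN choices, not the paper's exact layer):

```python
# Generic graph-convolution step over a toy molecular graph.
import torch
import torch.nn as nn

n_atoms, f_in, f_out = 6, 16, 32
X = torch.randn(n_atoms, f_in)            # per-atom features
A = torch.zeros(n_atoms, n_atoms)
for i, j in [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]:  # toy chain molecule
    A[i, j] = A[j, i] = 1.0
A += torch.eye(n_atoms)                   # self-connections

deg_inv = torch.diag(A.sum(1).rsqrt())    # symmetric normalization D^-1/2
W = nn.Linear(f_in, f_out, bias=False)
X_new = torch.relu(deg_inv @ A @ deg_inv @ W(X))
print(X_new.shape)  # stacked layers yield the spatial graph embedding
```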