January 25, 2020


Paper Group NANR 16


Multi-Scale Stacked Hourglass Network for Human Pose Estimation. Tweet Stance Detection Using an Attention based Neural Ensemble Model. Training Data Augmentation for Detecting Adverse Drug Reactions in User-Generated Content. Improving UD processing via satellite resources for morphology. Deep Embedding Learning With Discriminative Sampling Policy …

Multi-Scale Stacked Hourglass Network for Human Pose Estimation

Title Multi-Scale Stacked Hourglass Network for Human Pose Estimation
Authors Chunsheng Guo, Wenlong Du, Na Ying
Abstract The stacked hourglass network has become an important model for human pose estimation. Estimating human body posture depends on global information about keypoint types and local information about keypoint locations. The consistent processing of inputs and constraints makes it difficult to form differentiated and determined collaboration mechanisms for each stacked hourglass network. In this paper, we propose a Multi-Scale Stacked Hourglass (MSSH) network to highlight the differentiation capabilities of each hourglass network for human pose estimation. The pre-processing network forms feature maps of different scales and dispatches them to various locations of the stacked hourglass network, where the small-scale features reach the front of the stacked hourglass network and the large-scale features reach the rear. A new loss function is also proposed for the multi-scale stacked hourglass network. Different keypoints have different loss weight coefficients at different scales, and the keypoint weight coefficients are dynamically adjusted from the top-level hourglass network to the bottom-level hourglass network. Experimental results show that the proposed method is competitive with the comparison algorithm on the MPII and LSP datasets.
Tasks Pose Estimation
Published 2019-05-01
URL https://openreview.net/forum?id=HkM3vjCcF7
PDF https://openreview.net/pdf?id=HkM3vjCcF7
PWC https://paperswithcode.com/paper/multi-scale-stacked-hourglass-network-for
Repo
Framework

Tweet Stance Detection Using an Attention based Neural Ensemble Model

Title Tweet Stance Detection Using an Attention based Neural Ensemble Model
Authors Umme Aymun Siddiqua, Abu Nowshed Chy, Masaki Aono
Abstract Stance detection on Twitter aims to mine user stances expressed in a tweet towards a single or multiple target entities. To tackle this problem, most prior studies have explored traditional deep learning models, e.g., LSTM and GRU. However, compared to these traditional approaches, the recently proposed densely connected Bi-LSTM and nested LSTM architectures effectively address the vanishing-gradient and overfitting problems as well as deal with long-term dependencies. In this paper, we propose a neural ensemble model that adopts the strengths of these two LSTM variants to learn better long-term dependencies, where each module is coupled with an attention mechanism that amplifies the contribution of important elements in the final representation. We also employ a multi-kernel convolution on top of them to extract the higher-level tweet representations. Results of extensive experiments on single and multi-target stance detection datasets show that our proposed method achieves substantial improvement over the current state-of-the-art deep learning based methods.
Tasks Stance Detection
Published 2019-06-01
URL https://www.aclweb.org/anthology/N19-1185/
PDF https://www.aclweb.org/anthology/N19-1185
PWC https://paperswithcode.com/paper/tweet-stance-detection-using-an-attention
Repo
Framework

Training Data Augmentation for Detecting Adverse Drug Reactions in User-Generated Content

Title Training Data Augmentation for Detecting Adverse Drug Reactions in User-Generated Content
Authors Sepideh Mesbah, Jie Yang, Robert-Jan Sips, Manuel Valle Torre, Christoph Lofi, Alessandro Bozzon, Geert-Jan Houben
Abstract Social media provides a timely yet challenging data source for adverse drug reaction (ADR) detection. Existing dictionary-based, semi-supervised learning approaches are intrinsically limited by the coverage and maintainability of laymen health vocabularies. In this paper, we introduce a data augmentation approach that leverages variational autoencoders to learn high-quality data distributions from a large unlabeled dataset, and subsequently, to automatically generate a large labeled training set from a small set of labeled samples. This allows for efficient social-media ADR detection with low training and re-training costs to adapt to the changes and emergence of informal medical laymen terms. An extensive evaluation performed on Twitter and Reddit data shows that our approach matches the performance of fully-supervised approaches while requiring only 25% of training data.
Tasks Data Augmentation
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-1239/
PDF https://www.aclweb.org/anthology/D19-1239
PWC https://paperswithcode.com/paper/training-data-augmentation-for-detecting
Repo
Framework

Improving UD processing via satellite resources for morphology

Title Improving UD processing via satellite resources for morphology
Authors Kaja Dobrovoljc, Tomaž Erjavec, Nikola Ljubešić
Abstract
Tasks
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-8004/
PDF https://www.aclweb.org/anthology/W19-8004
PWC https://paperswithcode.com/paper/improving-ud-processing-via-satellite
Repo
Framework

Deep Embedding Learning With Discriminative Sampling Policy

Title Deep Embedding Learning With Discriminative Sampling Policy
Authors Yueqi Duan, Lei Chen, Jiwen Lu, Jie Zhou
Abstract Deep embedding learning aims to learn a distance metric for effective similarity measurement, which has achieved promising performance in various tasks. As the vast majority of training samples produce gradients with magnitudes close to zero, hard example mining is usually employed to improve the effectiveness and efficiency of the training procedure. However, most existing sampling methods are designed by hand, which ignores the dependence between examples and suffers from exhaustive searching. In this paper, we propose a deep embedding with discriminative sampling policy (DE-DSP) learning framework by simultaneously training two models: a deep sampler network that learns effective sampling strategies, and a feature embedding that maps samples to the feature space. Rather than exhaustively calculating the hardness of all the examples for mining through forward-propagation, the deep sampler network exploits the strong prior of relations among samples to learn a discriminative sampling policy in a more efficient manner. Experimental results demonstrate faster convergence and stronger discriminative power of our DE-DSP framework under different embedding objectives.
Tasks
Published 2019-06-01
URL http://openaccess.thecvf.com/content_CVPR_2019/html/Duan_Deep_Embedding_Learning_With_Discriminative_Sampling_Policy_CVPR_2019_paper.html
PDF http://openaccess.thecvf.com/content_CVPR_2019/papers/Duan_Deep_Embedding_Learning_With_Discriminative_Sampling_Policy_CVPR_2019_paper.pdf
PWC https://paperswithcode.com/paper/deep-embedding-learning-with-discriminative
Repo
Framework
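
The remark in the Deep Embedding Learning abstract that most training samples produce near-zero gradients can be illustrated with a margin-based triplet loss. This is a minimal sketch, assuming a squared-Euclidean triplet hinge loss with a margin of 0.2; the abstract does not name a specific objective, and `triplet_loss` and the toy embeddings are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two embedding vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge loss: zero once the negative is farther than the positive by the margin."""
    gap = squared_distance(anchor, positive) - squared_distance(anchor, negative)
    return max(0.0, gap + margin)


# Toy 2-D embeddings (assumed values, for illustration only).
anchor = [0.0, 0.0]
positive = [0.1, 0.0]
easy_negative = [1.0, 1.0]   # far from the anchor: the hinge is inactive
hard_negative = [0.2, 0.0]   # close to the anchor: the hinge is active

easy = triplet_loss(anchor, positive, easy_negative)
hard = triplet_loss(anchor, positive, hard_negative)
print(easy, hard)
```

The easy triplet contributes a loss of exactly zero and therefore no gradient, which is why mining or learned sampling concentrates on triplets like the hard one.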

Sparse Dictionary Learning by Dynamical Neural Networks

Title Sparse Dictionary Learning by Dynamical Neural Networks
Authors Tsung-Han Lin, Ping Tak Peter Tang
Abstract A dynamical neural network consists of a set of interconnected neurons that interact over time continuously. It can exhibit computational properties in the sense that the dynamical system’s evolution and/or limit points in the associated state space can correspond to numerical solutions to certain mathematical optimization or learning problems. Such a computational system is particularly attractive in that it can be mapped to a massively parallel computer architecture for power and throughput efficiency, especially if each neuron can rely solely on local information (i.e., local memory). Deriving gradients from the dynamical network’s various states while conforming to this last constraint, however, is challenging. We show that by combining ideas of top-down feedback and contrastive learning, a dynamical network for solving the l1-minimizing dictionary learning problem can be constructed, and the true gradients for learning are provably computable by individual neurons. Using spiking neurons to construct our dynamical network, we present a learning process, its rigorous mathematical analysis, and numerical results on several dictionary learning problems.
Tasks Dictionary Learning
Published 2019-05-01
URL https://openreview.net/forum?id=B1gstsCqt7
PDF https://openreview.net/pdf?id=B1gstsCqt7
PWC https://paperswithcode.com/paper/sparse-dictionary-learning-by-dynamical
Repo
Framework
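
For the Sparse Dictionary Learning entry, the l1-minimizing sparse coding subproblem can be sketched with the standard iterative shrinkage-thresholding algorithm (ISTA). This is a conventional, non-spiking solver swapped in for illustration, not the dynamical network described in the abstract; the function names, the step size, and the toy identity dictionary are assumptions.

```python
def soft_threshold(value, threshold):
    """Shrink a scalar toward zero (proximal operator of the l1 norm)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def ista(dictionary, signal, lam=0.1, step=0.5, iters=200):
    """Approximately minimize 0.5 * ||signal - D a||^2 + lam * ||a||_1 over a.

    Assumes step <= 1 / (largest eigenvalue of D^T D) so the iteration converges.
    """
    n_rows, n_atoms = len(dictionary), len(dictionary[0])
    code = [0.0] * n_atoms
    for _ in range(iters):
        residual = [
            signal[i] - sum(dictionary[i][j] * code[j] for j in range(n_atoms))
            for i in range(n_rows)
        ]
        # D^T residual is the negative gradient of the quadratic term.
        correlation = [
            sum(dictionary[i][j] * residual[i] for i in range(n_rows))
            for j in range(n_atoms)
        ]
        code = [
            soft_threshold(code[j] + step * correlation[j], step * lam)
            for j in range(n_atoms)
        ]
    return code


# Toy example: with an identity dictionary the solution is the
# soft-thresholded signal itself.
identity = [[1.0, 0.0], [0.0, 1.0]]
sparse_code = ista(identity, [1.0, 0.05])
print(sparse_code)
```

Each coefficient update uses only the residual and that coefficient's own dictionary column, which loosely mirrors the locality constraint the abstract emphasizes, although the paper's mechanism for obtaining gradients is different.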

Constructing Interpretive Spatio-Temporal Features for Multi-Turn Responses Selection

Title Constructing Interpretive Spatio-Temporal Features for Multi-Turn Responses Selection
Authors Junyu Lu, Chenbin Zhang, Zeying Xie, Guang Ling, Tom Chao Zhou, Zenglin Xu
Abstract Response selection plays an important role in fully automated dialogue systems. Given the dialogue context, the goal of response selection is to identify the best-matched next utterance (i.e., response) from multiple candidates. Despite the efforts of many previous useful models, this task remains challenging due to the huge semantic gap and the large size of the candidate set. To address these issues, we propose a Spatio-Temporal Matching network (STM) for response selection. In detail, soft alignment is first used to obtain the local relevance between the context and the response. We then construct spatio-temporal features by aggregating attention images along the time dimension and use 3D convolution and pooling operations to extract matching information. Evaluation on two large-scale multi-turn response selection tasks has demonstrated that our proposed model significantly outperforms the state-of-the-art model. In particular, visualization analysis shows that the spatio-temporal features enable matching information in segment pairs and time sequences, and have good interpretability for multi-turn text matching.
Tasks Text Matching
Published 2019-07-01
URL https://www.aclweb.org/anthology/P19-1006/
PDF https://www.aclweb.org/anthology/P19-1006
PWC https://paperswithcode.com/paper/constructing-interpretive-spatio-temporal
Repo
Framework
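
The soft alignment and "attention images" mentioned in the Spatio-Temporal Matching abstract can be pictured as one word-by-word similarity matrix per dialogue turn, stacked along the turn axis. The sketch below assumes dot-product similarity with a row-wise softmax and uses made-up 2-D word vectors; the paper's exact formulation may differ, and all names here are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_image(utterance, response):
    """One row per utterance word: attention weights over the response words."""
    return [softmax([dot(u, r) for r in response]) for u in utterance]


# Two context turns of two words each, and a three-word response candidate.
context = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 1.0], [0.0, 1.0]],
]
response = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Stacking the per-turn images gives a (turns x utterance words x response words)
# volume, the kind of input a 3D convolution could slide over.
volume = [attention_image(utterance, response) for utterance in context]
print(len(volume), len(volume[0]), len(volume[0][0]))
```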

Enhanced Pix2pix Dehazing Network

Title Enhanced Pix2pix Dehazing Network
Authors Yanyun Qu, Yizi Chen, Jingying Huang, Yuan Xie
Abstract In this paper, we reduce the image dehazing problem to an image-to-image translation problem, and propose the Enhanced Pix2pix Dehazing Network (EPDN), which generates a haze-free image without relying on the physical scattering model. EPDN is embedded by a generative adversarial network, which is followed by a well-designed enhancer. Inspired by the global-first theory of visual perception, the discriminator guides the generator to create a pseudo-realistic image on a coarse scale, while the enhancer following the generator is required to produce a realistic dehazed image on the fine scale. The enhancer contains two enhancing blocks based on the receptive field model, which reinforces the dehazing effect in both color and details. The embedded GAN is jointly trained with the enhancer. Extensive experimental results on synthetic datasets and real-world datasets show that the proposed EPDN is superior to the state-of-the-art methods in terms of PSNR, SSIM, PI, and subjective visual effect.
Tasks Image Dehazing, Image-to-Image Translation
Published 2019-06-01
URL http://openaccess.thecvf.com/content_CVPR_2019/html/Qu_Enhanced_Pix2pix_Dehazing_Network_CVPR_2019_paper.html
PDF http://openaccess.thecvf.com/content_CVPR_2019/papers/Qu_Enhanced_Pix2pix_Dehazing_Network_CVPR_2019_paper.pdf
PWC https://paperswithcode.com/paper/enhanced-pix2pix-dehazing-network
Repo
Framework
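
Of the metrics named in the EPDN abstract, PSNR has a short closed form: 10 * log10(MAX^2 / MSE). A minimal sketch follows, assuming 8-bit intensities (peak value 255) and images flattened to plain lists; the `psnr` helper and the pixel values are illustrative only.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel lists."""
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


haze_free = [100, 100, 100, 100]   # assumed ground-truth pixels
dehazed = [110, 90, 100, 100]      # assumed network output

score = psnr(haze_free, dehazed)
print(round(score, 2))
```

Higher values mean the dehazed output is closer to the reference; SSIM and PI need more machinery and are not sketched here.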

Deep Learning for Identification of Adverse Effect Mentions In Twitter Data

Title Deep Learning for Identification of Adverse Effect Mentions In Twitter Data
Authors Paul Barry, Ozlem Uzuner
Abstract Social Media Mining for Health Applications (SMM4H) Adverse Effect Mentions Shared Task challenges participants to accurately identify spans of text within a tweet that correspond to Adverse Effects (AEs) resulting from medication usage (Weissenbacher et al., 2019). This task features a training data set of 2,367 tweets, in addition to a 1,000 tweet evaluation data set. The solution presented here features a bidirectional Long Short-term Memory Network (bi-LSTM) for the generation of character-level embeddings. It uses a second bi-LSTM trained on both character and token level embeddings to feed a Conditional Random Field (CRF) which provides the final classification. This paper further discusses the deep learning algorithms used in our solution.
Tasks
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-3215/
PDF https://www.aclweb.org/anthology/W19-3215
PWC https://paperswithcode.com/paper/deep-learning-for-identification-of-adverse
Repo
Framework
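
The output of a span-identification tagger such as the bi-LSTM and CRF pipeline in the Adverse Effect Mentions abstract is typically a tag per token, which then has to be turned back into text spans. The sketch below assumes a BIO tagging scheme with `B-AE`, `I-AE`, and `O` labels; the abstract does not state the tag set, and the example tweet is invented.

```python
def decode_spans(tokens, tags):
    """Collect contiguous B-AE/I-AE runs into adverse-effect span strings."""
    spans = []
    current = []
    for token, tag in zip(tokens, tags):
        if tag == "B-AE":
            if current:
                spans.append(" ".join(current))
            current = [token]
        elif tag == "I-AE" and current:
            current.append(token)
        else:
            # "O", or an "I-AE" with no open span, closes any running span.
            if current:
                spans.append(" ".join(current))
            current = []
    if current:
        spans.append(" ".join(current))
    return spans


tokens = ["this", "drug", "gave", "me", "a", "bad", "headache"]
tags = ["O", "O", "O", "O", "O", "B-AE", "I-AE"]
spans = decode_spans(tokens, tags)
print(spans)
```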

Bat-G net: Bat-inspired High-Resolution 3D Image Reconstruction using Ultrasonic Echoes

Title Bat-G net: Bat-inspired High-Resolution 3D Image Reconstruction using Ultrasonic Echoes
Authors Gunpil Hwang, Seohyeon Kim, Hyeon-Min Bae
Abstract In this paper, a bat-inspired high-resolution ultrasound 3D imaging system is presented. Live bats demonstrate that ultrasound, when properly used, can serve to perceive 3D space. With this in mind, a neural network referred to as a Bat-G network is implemented to reconstruct the 3D representation of target objects from the hyperbolic FM (HFM) chirped ultrasonic echoes. The Bat-G network consists of an encoder emulating a bat’s central auditory pathway, and a 3D graphical visualization decoder. For the acquisition of the ultrasound data, a custom-made Bat-I sensor module is used. The Bat-G network shows uniform 3D reconstruction results and achieves precision, recall, and F1-score of 0.896, 0.899 and 0.895, respectively. The experimental results demonstrate the implementation feasibility of a high-resolution non-optical sound-based imaging system being used by live bats. The project web page (https://sites.google.com/view/batgnet) contains additional content summarizing our research.
Tasks 3D Reconstruction, Image Reconstruction
Published 2019-12-01
URL http://papers.nips.cc/paper/8629-bat-g-net-bat-inspired-high-resolution-3d-image-reconstruction-using-ultrasonic-echoes
PDF http://papers.nips.cc/paper/8629-bat-g-net-bat-inspired-high-resolution-3d-image-reconstruction-using-ultrasonic-echoes.pdf
PWC https://paperswithcode.com/paper/bat-g-net-bat-inspired-high-resolution-3d
Repo
Framework

Questions in Dependent Type Semantics

Title Questions in Dependent Type Semantics
Authors Kazuki Watanabe, Koji Mineshima, Daisuke Bekki
Abstract Dependent Type Semantics (DTS; Bekki and Mineshima, 2017) is a proof-theoretic compositional dynamic semantics based on Dependent Type Theory. The semantic representations for declarative sentences in DTS are types, based on the propositions-as-types paradigm. While type-theoretic semantics for natural language based on dependent type theory has been developed by many authors, how to assign semantic representations to interrogative sentences has been a non-trivial problem. In this study, we show how to provide the semantics of interrogative sentences in DTS. The basic idea is to assign the same type to both declarative sentences and interrogative sentences, partly building on the recent proposal in Inquisitive Semantics. We use Combinatory Categorial Grammar (CCG) as a syntactic component of DTS and implement our compositional semantics for interrogative sentences using ccg2lambda, a semantic parsing platform based on CCG. Based on the idea that the relationship between questions and answers can be formulated as the task of Recognizing Textual Entailment (RTE), we implement our inference system using proof assistant Coq and show that our system can deal with a wide range of question-answer relationships discussed in the formal semantics literature, including those with polar questions, alternative questions, and wh-questions.
Tasks Natural Language Inference, Semantic Parsing
Published 2019-05-01
URL https://www.aclweb.org/anthology/W19-1103/
PDF https://www.aclweb.org/anthology/W19-1103
PWC https://paperswithcode.com/paper/questions-in-dependent-type-semantics
Repo
Framework

Toyota Smarthome: Real-World Activities of Daily Living

Title Toyota Smarthome: Real-World Activities of Daily Living
Authors Srijan Das, Rui Dai, Michal Koperski, Luca Minciullo, Lorenzo Garattoni, Francois Bremond, Gianpiero Francesca
Abstract The performance of deep neural networks is strongly influenced by the quantity and quality of annotated data. Most of the large activity recognition datasets consist of data sourced from the web, which does not reflect challenges that exist in activities of daily living. In this paper, we introduce a large real-world video dataset for activities of daily living: Toyota Smarthome. The dataset consists of 16K RGB+D clips of 31 activity classes, performed by seniors in a smarthome. Unlike previous datasets, videos were fully unscripted. As a result, the dataset poses several challenges: high intra-class variation, high class imbalance, simple and composite activities, and activities with similar motion and variable duration. Activities were annotated with both coarse and fine-grained labels. These characteristics differentiate Toyota Smarthome from other datasets for activity recognition. As recent activity recognition approaches fail to address the challenges posed by Toyota Smarthome, we present a novel activity recognition method with attention mechanism. We propose a pose driven spatio-temporal attention mechanism through 3D ConvNets. We show that our novel method outperforms state-of-the-art methods on benchmark datasets, as well as on the Toyota Smarthome dataset. We release the dataset for research use.
Tasks Activity Recognition
Published 2019-10-01
URL http://openaccess.thecvf.com/content_ICCV_2019/html/Das_Toyota_Smarthome_Real-World_Activities_of_Daily_Living_ICCV_2019_paper.html
PDF http://openaccess.thecvf.com/content_ICCV_2019/papers/Das_Toyota_Smarthome_Real-World_Activities_of_Daily_Living_ICCV_2019_paper.pdf
PWC https://paperswithcode.com/paper/toyota-smarthome-real-world-activities-of
Repo
Framework

Revisiting Radial Distortion Absolute Pose

Title Revisiting Radial Distortion Absolute Pose
Authors Viktor Larsson, Torsten Sattler, Zuzana Kukelova, Marc Pollefeys
Abstract There are two main approaches to modeling radial distortion: either the image points are undistorted such that they correspond to pinhole projections, or the pinhole projections are distorted such that they align with the image measurements. Depending on the application, either of the two approaches can be more suitable. For example, distortion models are commonly used in Structure-from-Motion since they simplify measuring the reprojection error in images. Surprisingly, all previous minimal solvers for pose estimation with radial distortion use undistortion models. In this paper we aim to fill this gap in the literature by proposing the first minimal solvers which can jointly estimate distortion models together with camera pose. We present a general approach which can handle rational models of arbitrary degree for both distortion and undistortion.
Tasks Pose Estimation
Published 2019-10-01
URL http://openaccess.thecvf.com/content_ICCV_2019/html/Larsson_Revisiting_Radial_Distortion_Absolute_Pose_ICCV_2019_paper.html
PDF http://openaccess.thecvf.com/content_ICCV_2019/papers/Larsson_Revisiting_Radial_Distortion_Absolute_Pose_ICCV_2019_paper.pdf
PWC https://paperswithcode.com/paper/revisiting-radial-distortion-absolute-pose
Repo
Framework
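
The two directions contrasted in the Radial Distortion abstract can be made concrete with one-parameter models. The sketch below uses the common division model for undistortion and a one-term polynomial model for distortion, both on normalized image coordinates; these particular models and parameter values are assumptions chosen for illustration, since the abstract speaks of rational models of arbitrary degree.

```python
def undistort_division(point, lam):
    """Undistortion model: map a measured (distorted) point to its pinhole projection."""
    x, y = point
    r2 = x * x + y * y
    scale = 1.0 / (1.0 + lam * r2)
    return (x * scale, y * scale)


def distort_polynomial(point, k):
    """Distortion model: map a pinhole projection to where it lands in the image."""
    x, y = point
    r2 = x * x + y * y
    scale = 1.0 + k * r2
    return (x * scale, y * scale)


def reprojection_error(projected, measured, k):
    """With a distortion model the error is measured directly in the image."""
    dx, dy = distort_polynomial(projected, k)
    return ((dx - measured[0]) ** 2 + (dy - measured[1]) ** 2) ** 0.5


pinhole_point = (1.0, 2.0)
image_point = distort_polynomial(pinhole_point, 0.1)
print(image_point, reprojection_error(pinhole_point, image_point, 0.1))
```

The last helper shows why distortion models are convenient in Structure-from-Motion: the predicted point is pushed through the model and compared with the raw measurement, with no need to invert anything.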

Single Image Reflection Removal Beyond Linearity

Title Single Image Reflection Removal Beyond Linearity
Authors Qiang Wen, Yinjie Tan, Jing Qin, Wenxi Liu, Guoqiang Han, Shengfeng He
Abstract Due to the lack of paired data, the training of image reflection removal relies heavily on synthesizing reflection images. However, existing methods model reflection as a linear combination model, which cannot fully simulate real-world scenarios. In this paper, we inject non-linearity into reflection removal from two aspects. First, instead of synthesizing reflection with a fixed combination factor or kernel, we propose to synthesize reflection images by predicting a non-linear alpha blending mask. This enables a free combination of different blurry kernels, leading to a controllable and diverse reflection synthesis. Second, we design a cascaded network for reflection removal with three tasks: predicting the transmission layer, the reflection layer, and the non-linear alpha blending mask. The former two tasks are the fundamental outputs, while the latter is the side output of the network. This side output, in turn, makes the training a closed loop, so that the separated transmission and reflection layers can be recombined for training with a reconstruction loss. Extensive quantitative and qualitative experiments demonstrate that the proposed synthesis and removal approaches outperform state-of-the-art methods on two standard benchmarks, as well as in real-world scenarios.
Tasks
Published 2019-06-01
URL http://openaccess.thecvf.com/content_CVPR_2019/html/Wen_Single_Image_Reflection_Removal_Beyond_Linearity_CVPR_2019_paper.html
PDF http://openaccess.thecvf.com/content_CVPR_2019/papers/Wen_Single_Image_Reflection_Removal_Beyond_Linearity_CVPR_2019_paper.pdf
PWC https://paperswithcode.com/paper/single-image-reflection-removal-beyond
Repo
Framework
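
The difference between a fixed combination factor and a per-pixel alpha blending mask, as described in the Reflection Removal abstract, can be sketched on flattened images. The blending form `alpha * T + (1 - alpha) * R` and the mean-squared reconstruction loss are assumptions made for illustration, as are all function names and pixel values; the paper's exact formulation may differ.

```python
def blend_fixed(transmission, reflection, alpha):
    """Linear model: one global mixing factor for every pixel."""
    return [alpha * t + (1.0 - alpha) * r for t, r in zip(transmission, reflection)]


def blend_masked(transmission, reflection, alpha_mask):
    """Per-pixel alpha mask: each pixel has its own mixing factor."""
    return [
        a * t + (1.0 - a) * r
        for t, r, a in zip(transmission, reflection, alpha_mask)
    ]


def reconstruction_loss(observed, transmission, reflection, alpha_mask):
    """Mean squared error between the input and the recombined layers."""
    recombined = blend_masked(transmission, reflection, alpha_mask)
    return sum((o - c) ** 2 for o, c in zip(observed, recombined)) / len(observed)


transmission = [0.8, 0.6, 0.4]   # assumed transmission-layer pixels
reflection = [0.2, 0.2, 0.2]     # assumed reflection-layer pixels
alpha_mask = [1.0, 0.5, 0.0]     # assumed per-pixel blending mask

observed = blend_masked(transmission, reflection, alpha_mask)
loss = reconstruction_loss(observed, transmission, reflection, alpha_mask)
print(observed, loss)
```

Predicting the mask alongside the two layers is what closes the loop: the three outputs can be recombined and compared with the input image.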

Neural Cross-Lingual Event Detection with Minimal Parallel Resources

Title Neural Cross-Lingual Event Detection with Minimal Parallel Resources
Authors Jian Liu, Yubo Chen, Kang Liu, Jun Zhao
Abstract The scarcity of annotated data poses a great challenge for event detection (ED). Cross-lingual ED aims to tackle this challenge by transferring knowledge between different languages to boost performance. However, previous cross-lingual methods for ED demonstrated a heavy dependency on parallel resources, which might limit their applicability. In this paper, we propose a new method for cross-lingual ED, demonstrating a minimal dependency on parallel resources. Specifically, to construct a lexical mapping between different languages, we devise a context-dependent translation method; to treat the word order difference problem, we propose a shared syntactic order event detector for multilingual co-training. The efficiency of our method is studied through extensive experiments on two standard datasets. Empirical results indicate that our method is effective in 1) performing cross-lingual transfer concerning different directions and 2) tackling the extremely annotation-poor scenario.
Tasks Cross-Lingual Transfer
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-1068/
PDF https://www.aclweb.org/anthology/D19-1068
PWC https://paperswithcode.com/paper/neural-cross-lingual-event-detection-with
Repo
Framework