January 31, 2020

Paper Group ANR 194

Bias Remediation in Driver Drowsiness Detection systems using Generative Adversarial Networks. Deep Coarse-to-fine Dense Light Field Reconstruction with Flexible Sampling and Geometry-aware Fusion. Biomedical Mention Disambiguation using a Deep Learning Approach. Efficiency and Scalability of Multi-Lane Capsule Networks (MLCN). Using Chinese Glyphs …

Bias Remediation in Driver Drowsiness Detection systems using Generative Adversarial Networks


Title	Bias Remediation in Driver Drowsiness Detection systems using Generative Adversarial Networks
Authors	Mkhuseli Ngxande, Jules-Raymond Tapamo, Michael Burke
Abstract	Datasets are crucial when training a deep neural network. When datasets are unrepresentative, trained models are prone to bias because they are unable to generalise to real world settings. This is particularly problematic for models trained in specific cultural contexts, which may not represent a wide range of races, and thus fail to generalise. This is a particular challenge for driver drowsiness detection, where many publicly available datasets are unrepresentative as they cover only certain ethnicity groups. Traditional augmentation methods are unable to improve a model’s performance when tested on other groups with different facial attributes, and it is often challenging to build new, more representative datasets. In this paper, we introduce a novel framework that boosts the performance of drowsiness detection for different ethnicity groups. Our framework improves a Convolutional Neural Network (CNN) trained for prediction by using Generative Adversarial Networks (GANs) for targeted data augmentation based on a population bias visualisation strategy that groups faces with similar facial attributes and highlights where the model is failing. A sampling method selects faces where the model is not performing well, which are used to fine-tune the CNN. Experiments show the efficacy of our approach in improving driver drowsiness detection for under-represented ethnicity groups. Here, models trained on publicly available datasets are compared with a model trained using the proposed data augmentation strategy. Although developed in the context of driver drowsiness detection, the proposed framework is not limited to this task and can be applied to other applications.
Tasks	Data Augmentation
Published	2019-12-10
URL	https://arxiv.org/abs/1912.12123v1
PDF	https://arxiv.org/pdf/1912.12123v1.pdf
PWC	https://paperswithcode.com/paper/bias-remediation-in-driver-drowsiness
Repo
Framework

Deep Coarse-to-fine Dense Light Field Reconstruction with Flexible Sampling and Geometry-aware Fusion


Title	Deep Coarse-to-fine Dense Light Field Reconstruction with Flexible Sampling and Geometry-aware Fusion
Authors	Jing Jin, Junhui Hou, Jie Chen, Huanqiang Zeng, Sam Kwong, Jingyi Yu
Abstract	A densely-sampled light field (LF) is highly desirable in various applications, such as 3-D reconstruction, post-capture refocusing and virtual reality. However, it is costly to acquire such data. Although many computational methods have been proposed to reconstruct a densely-sampled LF from a sparsely-sampled one, they still suffer from either low reconstruction quality, low computational efficiency, or the restriction on the regularity of the sampling pattern. To this end, we propose a novel learning-based method, which accepts sparsely-sampled LFs with irregular structures, and produces densely-sampled LFs with arbitrary angular resolution accurately and efficiently. We also propose a simple yet effective method for optimizing the sampling pattern. Our proposed method, an end-to-end trainable network, reconstructs a densely-sampled LF in a coarse-to-fine manner. Specifically, the coarse sub-aperture image (SAI) synthesis module first explores the scene geometry from an unstructured sparsely-sampled LF and leverages it to independently synthesize novel SAIs, in which a confidence-based blending strategy is proposed to fuse the information from different input SAIs, giving an intermediate densely-sampled LF. Then, the efficient LF refinement module learns the angular relationship within the intermediate result to recover the LF parallax structure. Comprehensive experimental evaluations demonstrate the superiority of our method on both real-world and synthetic LF images when compared with state-of-the-art methods. In addition, we illustrate the benefits and advantages of the proposed approach when applied in various LF-based applications, including image-based rendering and depth estimation enhancement.
Tasks	Depth Estimation
Published	2019-08-31
URL	https://arxiv.org/abs/1909.01341v2
PDF	https://arxiv.org/pdf/1909.01341v2.pdf
PWC	https://paperswithcode.com/paper/flexible-fast-and-accurate-densely-sampled
Repo
Framework

Biomedical Mention Disambiguation using a Deep Learning Approach


Title	Biomedical Mention Disambiguation using a Deep Learning Approach
Authors	Chih-Hsuan Wei, Kyubum Lee, Robert Leaman, Zhiyong Lu
Abstract	Automatically locating named entities in natural language text - named entity recognition - is an important task in the biomedical domain. Many named entity mentions are ambiguous between several bioconcept types, however, causing text spans to be annotated as more than one type when simultaneously recognizing multiple entity types. The straightforward solution is a rule-based approach applying a priority order based on the precision of each entity tagger (from highest to lowest). While this method is straightforward and useful, imprecise disambiguation remains a significant source of error. We address this issue by generating a partially labeled corpus of ambiguous concept mentions. We first collect named entity mentions from multiple human-curated databases (e.g. CTDbase, gene2pubmed), then correlate them with the text-mined spans from PubTator to provide the context where the mention appears. Our corpus contains more than 3 million concept mentions that are ambiguous between one or more concept types in PubTator (about 3% of all mentions). We approached this task as a classification problem and developed a deep learning-based method which uses the semantics of the span being classified and the surrounding words to identify the most likely bioconcept type. More specifically, we develop a convolutional neural network (CNN) and a long short-term memory (LSTM) network to respectively handle the semantic and syntax features, then concatenate these within a fully connected layer for final classification. The priority ordering rule-based approach demonstrated F1-scores of 71.29% (micro-averaged) and 41.19% (macro-averaged), while the new disambiguation method demonstrated F1-scores of 91.94% (micro-averaged) and 85.42% (macro-averaged), a very substantial increase.
Tasks	Named Entity Recognition
Published	2019-09-23
URL	https://arxiv.org/abs/1909.10416v1
PDF	https://arxiv.org/pdf/1909.10416v1.pdf
PWC	https://paperswithcode.com/paper/190910416
Repo
Framework
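
The abstract above reports both micro- and macro-averaged F1, and the two can diverge sharply when some types are rare. The following is a minimal sketch of how the two averages are computed for single-label classification; the function names and the toy labels are illustrative assumptions, not taken from the paper.

```python
from collections import Counter


def f1(tp, fp, fn):
    """F1 from raw counts; defined as 0.0 when there are no counts at all."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(gold, pred):
    """Return (micro_f1, macro_f1) for parallel lists of gold and predicted labels."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted type p was wrong
            fn[g] += 1  # gold type g was missed
    labels = set(gold) | set(pred)
    # Micro: pool the counts over all types, then compute one F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro: compute F1 per type, then take the unweighted mean.
    macro = sum(f1(tp[l], fp[l], fn[l]) for l in labels) / len(labels)
    return micro, macro


# Hypothetical toy data: a tagger that always answers "Gene".
gold = ["Gene"] * 8 + ["Chemical"] * 2
pred = ["Gene"] * 10
micro, macro = micro_macro_f1(gold, pred)
```

Because the macro average weights every type equally, a method that ignores a rare type is penalised much more heavily there than in the micro average, which is dominated by the frequent types.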

Efficiency and Scalability of Multi-Lane Capsule Networks (MLCN)


Title	Efficiency and Scalability of Multi-Lane Capsule Networks (MLCN)
Authors	Vanderson M. do Rosario, Mauricio Breternitz Jr., Edson Borin
Abstract	Some Deep Neural Networks (DNN) have what we call lanes, or they can be reorganized as such. Lanes are paths in the network which are data-independent and typically learn different features or add resilience to the network. Given their data-independence, lanes are amenable to parallel processing. The Multi-lane CapsNet (MLCN) is a proposed reorganization of the Capsule Network which is shown to achieve better accuracy while bringing highly-parallel lanes. However, the efficiency and scalability of MLCN had not been systematically examined. In this work, we study the MLCN network with multiple GPUs, finding that it is 2x more efficient than the original CapsNet when using model-parallelism. Further, we present the load balancing problem of distributing heterogeneous lanes in homogeneous or heterogeneous accelerators and show that a simple greedy heuristic can be almost 50% faster than a naive random approach.
Tasks
Published	2019-08-11
URL	https://arxiv.org/abs/1908.03935v1
PDF	https://arxiv.org/pdf/1908.03935v1.pdf
PWC	https://paperswithcode.com/paper/efficiency-and-scalability-of-multi-lane
Repo
Framework
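
The abstract above mentions a simple greedy heuristic for distributing heterogeneous lanes across accelerators but does not spell it out. The sketch below shows one common greedy scheme (largest lane first, placed on the device that would finish it earliest); it is an assumption for illustration and may differ from the heuristic used in the paper. The names `lane_costs` and `device_speeds` are hypothetical.

```python
def greedy_assign(lane_costs, device_speeds):
    """Greedily place lanes on devices.

    lane_costs[i]    : relative amount of work in lane i
    device_speeds[d] : relative throughput of device d
    Returns (assignment, makespan), where assignment maps each device
    index to the list of lane indices placed on it.
    """
    finish = [0.0] * len(device_speeds)
    assignment = {d: [] for d in range(len(device_speeds))}
    # Visit lanes from most to least expensive.
    for lane, cost in sorted(enumerate(lane_costs), key=lambda p: -p[1]):
        # Pick the device on which this lane would complete earliest.
        best = min(
            range(len(device_speeds)),
            key=lambda d: finish[d] + cost / device_speeds[d],
        )
        finish[best] += cost / device_speeds[best]
        assignment[best].append(lane)
    return assignment, max(finish)


# Hypothetical example: five lanes on two identical devices.
assignment, makespan = greedy_assign([4, 3, 3, 2, 2], [1.0, 1.0])
```

A greedy placement like this is not guaranteed to be optimal (in the example above a perfectly balanced split exists that the sketch does not find), but it is cheap to compute and usually much better balanced than a random assignment.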

Using Chinese Glyphs for Named Entity Recognition


Title	Using Chinese Glyphs for Named Entity Recognition
Authors	Arijit Sehanobish, Chan Hee Song
Abstract	Most Named Entity Recognition (NER) systems use additional features like part-of-speech (POS) tags, shallow parsing, gazetteers, etc. This kind of information requires external knowledge like unlabeled texts and trained taggers. Adding these features to NER systems has been shown to have a positive impact. However, creating gazetteers or taggers can sometimes take a lot of time and may require extensive data cleaning. In this paper, for Chinese NER systems, we do not use these traditional features; instead, we use lexicographic features of Chinese characters. Chinese characters are composed of graphical components called radicals, and these components often carry semantic indicators. We propose CNN-based models that incorporate this semantic information and use it for NER. Our models show an improvement over the baseline BERT-BiLSTM-CRF model. We set a new baseline score for Chinese OntoNotes v5.0 and show an improvement of +0.64 F1 score. We present a state-of-the-art F1 score on the Weibo dataset of 71.81 and show a competitive improvement of +0.72 over the baseline on the ResumeNER dataset.
Tasks	Named Entity Recognition
Published	2019-09-22
URL	https://arxiv.org/abs/1909.09922v2
PDF	https://arxiv.org/pdf/1909.09922v2.pdf
PWC	https://paperswithcode.com/paper/190909922
Repo
Framework

Named Entity Recognition with Partially Annotated Training Data


Title	Named Entity Recognition with Partially Annotated Training Data
Authors	Stephen Mayhew, Snigdha Chaturvedi, Chen-Tse Tsai, Dan Roth
Abstract	Supervised machine learning assumes the availability of fully-labeled data, but in many cases, such as low-resource languages, the only data available is partially annotated. We study the problem of Named Entity Recognition (NER) with partially annotated training data in which a fraction of the named entities are labeled, and all other tokens, entities or otherwise, are labeled as non-entity by default. In order to train on this noisy dataset, we need to distinguish between the true and false negatives. To this end, we introduce a constraint-driven iterative algorithm that learns to detect false negatives in the noisy set and downweigh them, resulting in a weighted training set. With this set, we train a weighted NER model. We evaluate our algorithm with weighted variants of neural and non-neural NER models on data in 8 languages from several language and script families, showing strong ability to learn from partial data. Finally, to show real-world efficacy, we evaluate on a Bengali NER corpus annotated by non-speakers, outperforming the prior state-of-the-art by over 5 points F1.
Tasks	Named Entity Recognition
Published	2019-09-20
URL	https://arxiv.org/abs/1909.09270v1
PDF	https://arxiv.org/pdf/1909.09270v1.pdf
PWC	https://paperswithcode.com/paper/named-entity-recognition-with-partially
Repo
Framework
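
The abstract above describes down-weighting suspected false negatives and then training a weighted NER model. A minimal sketch of a weighted token-level loss is given below; the weighting scheme, the function name `weighted_nll`, and the toy probabilities are assumptions for illustration and do not reproduce the paper's constraint-driven algorithm.

```python
import math


def weighted_nll(probs, labels, weights):
    """Weighted average negative log-likelihood over tokens.

    probs[i]   : dict mapping each tag to the model's probability for token i
    labels[i]  : the (possibly noisy) training tag for token i
    weights[i] : confidence in labels[i]; near 0 for suspected false negatives
    """
    total_weight = sum(weights)
    if total_weight == 0:
        return 0.0
    total = sum(w * -math.log(p[y]) for p, y, w in zip(probs, labels, weights))
    return total / total_weight


# Hypothetical two-token example. The second token is labeled "O" by default,
# but the model believes it is a person name, so it may be a false negative.
probs = [{"O": 0.5, "PER": 0.5}, {"O": 0.1, "PER": 0.9}]
labels = ["PER", "O"]

plain = weighted_nll(probs, labels, [1.0, 1.0])
downweighted = weighted_nll(probs, labels, [1.0, 0.0])
```

Setting the weight of the suspicious "O" token to zero removes its contribution, so the model is no longer pushed toward predicting non-entity for a token that is probably an unlabeled entity.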

Adversarial Pyramid Network for Video Domain Generalization


Title	Adversarial Pyramid Network for Video Domain Generalization
Authors	Zhiyu Yao, Yunbo Wang, Xingqiang Du, Mingsheng Long, Jianmin Wang
Abstract	This paper introduces a new research problem of video domain generalization (video DG) where most state-of-the-art action recognition networks degenerate due to the lack of exposure to the target domains of divergent distributions. While recent advances in video understanding focus on capturing the temporal relations of the long-term video context, we observe that the global temporal features are less generalizable in the video DG settings. The reason is that videos from other unseen domains may have unexpected absence, misalignment, or scale transformation of the temporal relations, which is known as the temporal domain shift. Therefore, the video DG is even more challenging than the image DG, which is also under-explored, because of the entanglement of the spatial and temporal domain shifts. This finding has led us to view the key to video DG as how to effectively learn the local-relation features of different time scales that are more generalizable, and how to exploit them along with the global-relation features to maintain the discriminability. This paper presents the Adversarial Pyramid Network (APN), which captures the local-relation, global-relation, and multilayer cross-relation features progressively. This pyramid network not only improves the feature transferability from the view of representation learning, but also enhances the diversity and quality of the new data points that can bridge different domains when it is integrated with an improved version of the image DG adversarial data augmentation method. We construct four video DG benchmarks: UCF-HMDB, Something-Something, PKU-MMD, and NTU, in which the source and target domains are divided according to different datasets, different consequences of actions, or different camera views. The APN consistently outperforms previous action recognition models over all benchmarks.
Tasks	Data Augmentation, Domain Generalization, Representation Learning, Video Understanding
Published	2019-12-08
URL	https://arxiv.org/abs/1912.03716v1
PDF	https://arxiv.org/pdf/1912.03716v1.pdf
PWC	https://paperswithcode.com/paper/adversarial-pyramid-network-for-video-domain
Repo
Framework

Learning the Relation between Code Features and Code Transforms with Structured Prediction


Title	Learning the Relation between Code Features and Code Transforms with Structured Prediction
Authors	Zhongxing Yu, Matias Martinez, Tegawendé F. Bissyandé, Martin Monperrus
Abstract	We present in this paper the first approach for structurally predicting code transforms at the level of AST nodes using conditional random fields. Our approach first learns offline a probabilistic model that captures how certain code transforms are applied to certain AST nodes, and then uses the learned model to predict transforms for new, unseen code snippets. We implement our approach in the context of repair transform prediction for Java programs. Our implementation contains a set of carefully designed code features, deals with the training data imbalance issue, and comprises transform constraints that are specific to code. We conduct a large-scale experimental evaluation based on a dataset of 4,590,679 bug fixing commits from real-world Java projects. The experimental results show that our approach predicts the code transforms with a success rate varying from 37.1% to 61.1% depending on the transforms.
Tasks	Structured Prediction
Published	2019-07-22
URL	https://arxiv.org/abs/1907.09282v1
PDF	https://arxiv.org/pdf/1907.09282v1.pdf
PWC	https://paperswithcode.com/paper/learning-the-relation-between-code-features
Repo
Framework

Framelet Representation of Tensor Nuclear Norm for Third-Order Tensor Completion


Title	Framelet Representation of Tensor Nuclear Norm for Third-Order Tensor Completion
Authors	Tai-Xiang Jiang, Michael K. Ng, Xi-Le Zhao, Ting-Zhu Huang
Abstract	The main aim of this paper is to develop a framelet representation of the tensor nuclear norm for third-order tensor completion. In the literature, the tensor nuclear norm can be computed by using tensor singular value decomposition based on the discrete Fourier transform matrix, and tensor completion can be performed by the minimization of the tensor nuclear norm, which is the relaxation of the sum of matrix ranks from all Fourier transformed matrix frontal slices. These Fourier transformed matrix frontal slices are obtained by applying the discrete Fourier transform on the tubes of the original tensor. In this paper, we propose to employ the framelet representation of each tube so that a framelet transformed tensor can be constructed. Because of framelet basis redundancy, each tube is sparsely represented. When the matrix slices of the original tensor are highly correlated, we expect the corresponding sum of matrix ranks from all framelet transformed matrix frontal slices to be small, so that the resulting tensor completion can be performed much better. The proposed minimization model is convex and global minimizers can be obtained. Numerical results on several types of multi-dimensional data (videos, multispectral images, and magnetic resonance imaging data) show that the proposed method outperforms the other tested methods.
Tasks
Published	2019-09-16
URL	https://arxiv.org/abs/1909.06982v2
PDF	https://arxiv.org/pdf/1909.06982v2.pdf
PWC	https://paperswithcode.com/paper/framelet-representation-of-tensor-nuclear
Repo
Framework

Knowledge Squeezed Adversarial Network Compression


Title	Knowledge Squeezed Adversarial Network Compression
Authors	Shu Changyong, Li Peng, Xie Yuan, Qu Yanyun, Dai Longquan, Ma Lizhuang
Abstract	Deep network compression has achieved notable progress via knowledge distillation, where a teacher-student learning scheme is adopted using a predetermined loss. Recently, more attention has shifted to employing adversarial training to minimize the discrepancy between the output distributions of the two networks. However, these methods emphasize result-oriented learning while neglecting process-oriented learning, leading to the loss of rich information contained in the whole network pipeline. Inspired by the assumption that a small network cannot perfectly mimic a large one due to the huge gap in network scale, we propose a knowledge transfer method, involving effective intermediate supervision, under the adversarial training framework to learn the student network. To achieve a powerful but highly compact intermediate information representation, the squeezed knowledge is realized by a task-driven attention mechanism. Then, the knowledge transferred from the teacher network can accommodate the size of the student network. As a result, the proposed method integrates merits from both process-oriented and result-oriented learning. Extensive experimental results on three typical benchmark datasets, i.e., CIFAR-10, CIFAR-100, and ImageNet, demonstrate that our method achieves superior performance against other state-of-the-art methods.
Tasks	Transfer Learning
Published	2019-04-10
URL	http://arxiv.org/abs/1904.05100v2
PDF	http://arxiv.org/pdf/1904.05100v2.pdf
PWC	https://paperswithcode.com/paper/knowledge-squeezed-adversarial-network
Repo
Framework

Learning with Known Operators reduces Maximum Training Error Bounds


Title	Learning with Known Operators reduces Maximum Training Error Bounds
Authors	Andreas K. Maier, Christopher Syben, Bernhard Stimpel, Tobias Würfl, Mathis Hoffmann, Frank Schebesch, Weilin Fu, Leonid Mill, Lasse Kling, Silke Christiansen
Abstract	We describe an approach for incorporating prior knowledge into machine learning algorithms. We aim at applications in physics and signal processing in which we know that certain operations must be embedded into the algorithm. Any operation that allows computation of a gradient or sub-gradient with respect to its inputs is suited for our framework. We derive a maximal error bound for deep nets that demonstrates that inclusion of prior knowledge results in its reduction. Furthermore, we also show experimentally that known operators reduce the number of free parameters. We apply this approach to various tasks ranging from CT image reconstruction through vessel segmentation to the derivation of previously unknown imaging algorithms. As such, the concept is widely applicable for many researchers in physics, imaging, and signal processing. We assume that our analysis will support further investigation of known operators in other fields of physics, imaging, and signal processing.
Tasks	Image Reconstruction
Published	2019-07-03
URL	https://arxiv.org/abs/1907.01992v1
PDF	https://arxiv.org/pdf/1907.01992v1.pdf
PWC	https://paperswithcode.com/paper/learning-with-known-operators-reduces-maximum
Repo
Framework

From English to Code-Switching: Transfer Learning with Strong Morphological Clues


Title	From English to Code-Switching: Transfer Learning with Strong Morphological Clues
Authors	Gustavo Aguilar, Thamar Solorio
Abstract	Code-switching is still an understudied phenomenon in natural language processing mainly because of two related challenges: it lacks annotated data, and it combines a vast diversity of low-resource languages. Despite the language diversity, many code-switching scenarios occur in language pairs, and English is often a common factor among them. In the first part of this paper, we use transfer learning from English to English-paired code-switched languages for the language identification (LID) task by applying two simple yet effective techniques: 1) a hierarchical attention mechanism that enhances morphological clues from character n-grams, and 2) a secondary loss that forces the model to learn n-gram representations that are particular to the languages involved. We use the bottom layers of the ELMo architecture to learn these morphological clues by essentially recognizing what is and what is not English. Our approach outperforms the previous state of the art on Nepali-English, Spanish-English, and Hindi-English datasets. In the second part of the paper, we use our best LID models for the tasks of Spanish-English named entity recognition and Hindi-English part-of-speech tagging by replacing their inference layers and retraining them. We show that our retrained models are capable of using the code-switching information on both tasks to outperform models that do not have such knowledge.
Tasks	Language Identification, Named Entity Recognition, Part-Of-Speech Tagging, Transfer Learning
Published	2019-09-11
URL	https://arxiv.org/abs/1909.05158v2
PDF	https://arxiv.org/pdf/1909.05158v2.pdf
PWC	https://paperswithcode.com/paper/from-english-to-code-switching-transfer
Repo
Framework

What Matters for Neural Cross-Lingual Named Entity Recognition: An Empirical Analysis


Title	What Matters for Neural Cross-Lingual Named Entity Recognition: An Empirical Analysis
Authors	Xiaolei Huang, Jonathan May, Nanyun Peng
Abstract	Building named entity recognition (NER) models for languages that do not have much training data is a challenging task. While recent work has shown promising results on cross-lingual transfer from high-resource languages to low-resource languages, it is unclear what knowledge is transferred. In this paper, we first propose a simple and efficient neural architecture for cross-lingual NER. Experiments show that our model achieves competitive performance with the state-of-the-art. We further analyze how transfer learning works for cross-lingual NER on two transferable factors: sequential order and multilingual embeddings, and investigate how model performance varies across entity lengths. Finally, we conduct a case-study on a non-Latin language, Bengali, which suggests that leveraging knowledge from Wikipedia will be a promising direction to further improve the model performances. Our results can shed light on future research for improving cross-lingual NER.
Tasks	Cross-Lingual Transfer, Named Entity Recognition, Transfer Learning
Published	2019-09-09
URL	https://arxiv.org/abs/1909.03598v1
PDF	https://arxiv.org/pdf/1909.03598v1.pdf
PWC	https://paperswithcode.com/paper/what-matters-for-neural-cross-lingual-named
Repo
Framework

Czech Text Processing with Contextual Embeddings: POS Tagging, Lemmatization, Parsing and NER


Title	Czech Text Processing with Contextual Embeddings: POS Tagging, Lemmatization, Parsing and NER
Authors	Milan Straka, Jana Straková, Jan Hajič
Abstract	Contextualized embeddings, which capture appropriate word meaning depending on context, have recently been proposed. We evaluate two methods for precomputing such embeddings, BERT and Flair, on four Czech text processing tasks: part-of-speech (POS) tagging, lemmatization, dependency parsing and named entity recognition (NER). The first three tasks, POS tagging, lemmatization and dependency parsing, are evaluated on two corpora: the Prague Dependency Treebank 3.5 and the Universal Dependencies 2.3. The named entity recognition (NER) is evaluated on the Czech Named Entity Corpus 1.1 and 2.0. We report state-of-the-art results for the above mentioned tasks and corpora.
Tasks	Dependency Parsing, Lemmatization, Named Entity Recognition, Part-Of-Speech Tagging
Published	2019-09-08
URL	https://arxiv.org/abs/1909.03544v1
PDF	https://arxiv.org/pdf/1909.03544v1.pdf
PWC	https://paperswithcode.com/paper/czech-text-processing-with-contextual
Repo
Framework

Statistical Inference in Mean-Field Variational Bayes


Title	Statistical Inference in Mean-Field Variational Bayes
Authors	Wei Han, Yun Yang
Abstract	We conduct non-asymptotic analysis on the mean-field variational inference for approximating posterior distributions in complex Bayesian models that may involve latent variables. We show that the mean-field approximation to the posterior can be well-approximated relative to the Kullback-Leibler divergence discrepancy measure by a normal distribution whose center is the maximum likelihood estimator (MLE). In particular, our results imply that the center of the mean-field approximation matches the MLE up to higher-order terms and there is essentially no loss of efficiency in using it as a point estimator for the parameter in any regular parametric model with latent variables. We also propose a new class of variational weighted likelihood bootstrap (VWLB) methods for quantifying the uncertainty in the mean-field variational inference. The proposed VWLB can be viewed as a new sampling scheme that produces independent samples for approximating the posterior. Compared with traditional sampling algorithms such as Markov Chain Monte Carlo, VWLB can be implemented in parallel and is free of tuning.
Tasks
Published	2019-11-04
URL	https://arxiv.org/abs/1911.01525v1
PDF	https://arxiv.org/pdf/1911.01525v1.pdf
PWC	https://paperswithcode.com/paper/statistical-inference-in-mean-field
Repo
Framework