January 28, 2020

Paper Group ANR 852

Cross-Domain Cascaded Deep Feature Translation. Assessing Partisan Traits of News Text Attributions. Zero-shot Dependency Parsing with Pre-trained Multilingual Sentence Representations. ReWE: Regressing Word Embeddings for Regularization of Neural Machine Translation Systems. Snowball: Iterative Model Evolution and Confident Sample Discovery for Se …

Cross-Domain Cascaded Deep Feature Translation


Title	Cross-Domain Cascaded Deep Feature Translation
Authors	Oren Katzir, Dani Lischinski, Daniel Cohen-Or
Abstract	In recent years we have witnessed tremendous progress in unpaired image-to-image translation methods, propelled by the emergence of DNNs and adversarial training strategies. However, most existing methods focus on transfer of style and appearance, rather than on shape translation. The latter task is challenging, due to its intricate non-local nature, which calls for additional supervision. We mitigate this by descending the deep layers of a pre-trained network, where the deep features contain more semantics, and applying the translation from and between these deep features. Specifically, we leverage VGG, which is a classification network, pre-trained with large-scale semantic supervision. Our translation is performed in a cascaded, deep-to-shallow, fashion, along the deep feature hierarchy: we first translate between the deepest layers that encode the higher-level semantic content of the image, proceeding to translate the shallower layers, conditioned on the deeper ones. We show that our method is able to translate between different domains, which exhibit significantly different shapes. We evaluate our method both qualitatively and quantitatively and compare it to state-of-the-art image-to-image translation methods. Our code and trained models will be made available.
Tasks	Image-to-Image Translation
Published	2019-06-04
URL	https://arxiv.org/abs/1906.01526v1
PDF	https://arxiv.org/pdf/1906.01526v1.pdf
PWC	https://paperswithcode.com/paper/cross-domain-cascaded-deep-feature
Repo
Framework

Assessing Partisan Traits of News Text Attributions


Title	Assessing Partisan Traits of News Text Attributions
Authors	Logan Martel, Edward Newell, Drew Margolin, Derek Ruths
Abstract	On the topic of journalistic integrity, the current state of accurate, impartial news reporting has garnered much debate in the context of the 2016 US Presidential Election. In pursuit of computational evaluation of news text, the statements (attributions) ascribed by media outlets to sources provide a common category of evidence on which to operate. In this paper, we develop an approach to compare partisan traits of news text attributions and apply it to characterize differences in statements ascribed to candidate Hillary Clinton and incumbent President Donald Trump. In doing so, we present a model trained on over 600 in-house annotated attributions to identify each candidate with accuracy > 88%. Finally, we discuss insights from its performance for future research.
Tasks
Published	2019-01-25
URL	http://arxiv.org/abs/1902.02179v1
PDF	http://arxiv.org/pdf/1902.02179v1.pdf
PWC	https://paperswithcode.com/paper/assessing-partisan-traits-of-news-text
Repo
Framework

Zero-shot Dependency Parsing with Pre-trained Multilingual Sentence Representations


Title	Zero-shot Dependency Parsing with Pre-trained Multilingual Sentence Representations
Authors	Ke Tran, Arianna Bisazza
Abstract	We investigate whether off-the-shelf deep bidirectional sentence representations trained on a massively multilingual corpus (multilingual BERT) enable the development of an unsupervised universal dependency parser. This approach only leverages a mix of monolingual corpora in many languages and does not require any translation data, making it applicable to low-resource languages. In our experiments we outperform the best CoNLL 2018 language-specific systems in all of the shared task’s six truly low-resource languages while using a single system. However, we also find that (i) parsing accuracy still varies dramatically when changing the training languages and (ii) in some target languages zero-shot transfer fails under all tested conditions, raising concerns about the ‘universality’ of the whole approach.
Tasks	Dependency Parsing
Published	2019-10-12
URL	https://arxiv.org/abs/1910.05479v1
PDF	https://arxiv.org/pdf/1910.05479v1.pdf
PWC	https://paperswithcode.com/paper/zero-shot-dependency-parsing-with-pre-trained
Repo
Framework

ReWE: Regressing Word Embeddings for Regularization of Neural Machine Translation Systems


Title	ReWE: Regressing Word Embeddings for Regularization of Neural Machine Translation Systems
Authors	Inigo Jauregi Unanue, Ehsan Zare Borzeshi, Nazanin Esmaili, Massimo Piccardi
Abstract	Regularization of neural machine translation is still a significant problem, especially in low-resource settings. To mitigate this problem, we propose regressing word embeddings (ReWE) as a new regularization technique in a system that is jointly trained to predict the next word in the translation (categorical value) and its word embedding (continuous value). Such joint training allows the proposed system to learn the distributional properties represented by the word embeddings, empirically improving the generalization to unseen sentences. Experiments over three translation datasets have shown a consistent improvement over a strong baseline, ranging between 0.91 and 2.54 BLEU points, and also a marked improvement over a state-of-the-art system.
Tasks	Machine Translation, Word Embeddings
Published	2019-04-04
URL	http://arxiv.org/abs/1904.02461v1
PDF	http://arxiv.org/pdf/1904.02461v1.pdf
PWC	https://paperswithcode.com/paper/rewe-regressing-word-embeddings-for
Repo
Framework
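
The joint objective described in the ReWE abstract can be sketched as a weighted sum of a categorical loss on the next word and a regression loss on its embedding. This is a minimal illustration, not the authors' implementation: the abstract does not say which regression loss is used, so cosine distance is an assumption here, and `rewe_style_loss` and its `weight` parameter are hypothetical names.

```python
import math


def cross_entropy(probs, target_index):
    """Negative log-likelihood of the reference word under the predicted distribution."""
    return -math.log(probs[target_index])


def cosine_distance(u, v):
    """1 - cosine similarity; an assumed choice of regression loss."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def rewe_style_loss(probs, target_index, predicted_embedding, target_embedding, weight=1.0):
    """Categorical loss plus a weighted embedding-regression term (hypothetical weighting)."""
    word_loss = cross_entropy(probs, target_index)
    embedding_loss = cosine_distance(predicted_embedding, target_embedding)
    return word_loss + weight * embedding_loss
```

When the predicted embedding points in the same direction as the reference embedding, the extra term vanishes and only the usual word-prediction loss remains.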

Snowball: Iterative Model Evolution and Confident Sample Discovery for Semi-Supervised Learning on Very Small Labeled Datasets


Title	Snowball: Iterative Model Evolution and Confident Sample Discovery for Semi-Supervised Learning on Very Small Labeled Datasets
Authors	Yang Li, Jianhe Yuan, Zhiqun Zhao, Hao Sun, Zhihai He
Abstract	In this work, we develop a joint sample discovery and iterative model evolution method for semi-supervised learning on very small labeled training sets. We propose a master-teacher-student model framework to provide multi-layer guidance during the model evolution process with multiple iterations and generations. The teacher model is constructed by performing an exponential moving average of the student models obtained from past training steps. The master network combines the knowledge of the student and teacher models with additional access to newly discovered samples. The master and teacher models are then used to guide the training of the student network by enforcing the consistency between their predictions of unlabeled samples and evolve all models when more and more samples are discovered. Our extensive experiments demonstrate that discovering confident samples from the unlabeled dataset, once coupled with the above master-teacher-student network evolution, can significantly improve the overall semi-supervised learning performance. For example, on the CIFAR-10 dataset, with a very small set of 250 labeled samples, our method achieves an error rate of 11.81%, more than 38 percentage points lower than the state-of-the-art method Mean-Teacher (49.91%).
Tasks
Published	2019-09-04
URL	https://arxiv.org/abs/1909.01542v1
PDF	https://arxiv.org/pdf/1909.01542v1.pdf
PWC	https://paperswithcode.com/paper/snowball-iterative-model-evolution-and
Repo
Framework
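
The Snowball abstract describes the teacher as an exponential moving average of past student models. A minimal sketch of that update on plain dictionaries of scalar weights follows; the `decay` value and the toy weights are assumptions for illustration, and a real model would apply the same rule to every parameter tensor.

```python
def ema_update(teacher, student, decay=0.99):
    """Move each teacher weight toward the matching student weight by (1 - decay)."""
    return {
        name: decay * teacher[name] + (1.0 - decay) * student[name]
        for name in teacher
    }


# Toy run: two student snapshots folded into the teacher with decay 0.5.
teacher = {"w": 0.0, "b": 1.0}
for student in ({"w": 1.0, "b": 1.0}, {"w": 1.0, "b": 0.0}):
    teacher = ema_update(teacher, student, decay=0.5)
```

A decay close to 1 makes the teacher change slowly, which smooths out noise in individual student updates.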

A Broad Class of Discrete-Time Hypercomplex-Valued Hopfield Neural Networks


Title	A Broad Class of Discrete-Time Hypercomplex-Valued Hopfield Neural Networks
Authors	Fidelis Zanetti de Castro, Marcos Eduardo Valle
Abstract	In this paper, we address the stability of a broad class of discrete-time hypercomplex-valued Hopfield-type neural networks. To ensure the neural networks belonging to this class always settle down at a stationary state, we introduce novel hypercomplex number systems referred to as real-part associative hypercomplex number systems. Real-part associative hypercomplex number systems generalize the well-known Cayley-Dickson algebras and real Clifford algebras and include the systems of real numbers, complex numbers, dual numbers, hyperbolic numbers, quaternions, tessarines, and octonions as particular instances. Apart from the novel hypercomplex number systems, we introduce a family of hypercomplex-valued activation functions called $\mathcal{B}$-projection functions. Broadly speaking, a $\mathcal{B}$-projection function projects the activation potential onto the set of all possible states of a hypercomplex-valued neuron. Using the theory presented in this paper, we confirm the stability analysis of several discrete-time hypercomplex-valued Hopfield-type neural networks from the literature. Moreover, we introduce and provide the stability analysis of a general class of Hopfield-type neural networks on Cayley-Dickson algebras.
Tasks
Published	2019-02-14
URL	https://arxiv.org/abs/1902.05478v3
PDF	https://arxiv.org/pdf/1902.05478v3.pdf
PWC	https://paperswithcode.com/paper/a-broad-class-of-discrete-time-hypercomplex
Repo
Framework
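
To make the idea of projecting an activation potential onto a set of admissible states concrete, here is a small sketch for the complex-valued multistate case. It assumes the state set is the K-th roots of unity and that the closeness score is the real part of conj(s)·q; both are illustrative assumptions rather than the paper's general definition, and `roots_of_unity` and `project` are hypothetical helper names.

```python
import cmath


def roots_of_unity(k):
    """The k unit-modulus states of a multistate complex-valued neuron (assumed state set)."""
    return [cmath.exp(2j * cmath.pi * n / k) for n in range(k)]


def project(potential, states):
    """Pick the state s that maximizes Re(conj(s) * potential)."""
    return max(states, key=lambda s: (s.conjugate() * potential).real)


states = roots_of_unity(4)  # approximately 1, i, -1, -i
new_state = project(0.2 + 3j, states)
```

With four states, a potential lying mostly along the imaginary axis is mapped to the state closest to `1j`.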

Intra-Ensemble in Neural Networks


Title	Intra-Ensemble in Neural Networks
Authors	Yuan Gao, Zixiang Cai, Yimin Chen, Wenke Chen, Kan Yang, Chen Sun, Cong Yao
Abstract	Improving model performance is always the key problem in machine learning including deep learning. However, stand-alone neural networks always suffer from diminishing marginal returns when stacking more layers. At the same time, ensembling is a useful technique to further enhance model performance. Nevertheless, training several independent stand-alone deep neural networks multiplies the resource cost. In this work, we propose Intra-Ensemble, an end-to-end strategy with stochastic training operations to train several sub-networks simultaneously within one neural network. Additional parameter size is marginal since the majority of parameters are mutually shared. Meanwhile, stochastic training increases the diversity of sub-networks with weight sharing, which significantly enhances intra-ensemble performance. Extensive experiments prove the applicability of intra-ensemble on various kinds of datasets and network architectures. Our models achieve comparable results with the state-of-the-art architectures on CIFAR-10 and CIFAR-100.
Tasks
Published	2019-04-09
URL	http://arxiv.org/abs/1904.04466v1
PDF	http://arxiv.org/pdf/1904.04466v1.pdf
PWC	https://paperswithcode.com/paper/intra-ensemble-in-neural-networks
Repo
Framework

Deep Learning the EEG Manifold for Phonological Categorization from Active Thoughts


Title	Deep Learning the EEG Manifold for Phonological Categorization from Active Thoughts
Authors	Pramit Saha, Muhammad Abdul-Mageed, Sidney Fels
Abstract	Speech-related Brain Computer Interfaces (BCI) aim primarily at finding an alternative vocal communication pathway for people with speaking disabilities. As a step towards full decoding of imagined speech from active thoughts, we present a BCI system for subject-independent classification of phonological categories exploiting a novel deep learning based hierarchical feature extraction scheme. To better capture the complex representation of high-dimensional electroencephalography (EEG) data, we compute the joint variability of EEG electrodes into a channel cross-covariance matrix. We then extract the spatio-temporal information encoded within the matrix using a mixed deep neural network strategy. Our model framework is composed of a convolutional neural network (CNN), a long short-term memory network (LSTM), and a deep autoencoder. We train the individual networks hierarchically, feeding their combined outputs into a final gradient boosting classification step. Our best models achieve an average accuracy of 77.9% across five different binary classification tasks, providing a significant 22.5% improvement over previous methods. As we also show visually, our work demonstrates that the speech imagery EEG possesses significant discriminative information about the intended articulatory movements responsible for natural speech synthesis.
Tasks	EEG, Speech Synthesis
Published	2019-04-08
URL	http://arxiv.org/abs/1904.04358v1
PDF	http://arxiv.org/pdf/1904.04358v1.pdf
PWC	https://paperswithcode.com/paper/deep-learning-the-eeg-manifold-for
Repo
Framework
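
The channel cross-covariance matrix mentioned in the abstract summarizes how each pair of EEG electrodes varies together over time. The sketch below computes an ordinary sample covariance between channels; the unbiased `n - 1` normalization and the function name `channel_covariance` are assumptions, since the abstract does not give the exact formula.

```python
def channel_covariance(signals):
    """Covariance between channels.

    signals: list of channels, each a list of samples of equal length.
    Returns a square matrix with one row and one column per channel.
    """
    n = len(signals[0])
    means = [sum(channel) / n for channel in signals]
    centered = [[x - m for x in channel] for channel, m in zip(signals, means)]
    return [
        [sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in centered]
        for ci in centered
    ]
```

The result is symmetric, with each channel's variance on the diagonal, so a network reading it sees pairwise electrode relationships rather than raw time series.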

Arterial incident duration prediction using a bi-level framework of extreme gradient-tree boosting


Title	Arterial incident duration prediction using a bi-level framework of extreme gradient-tree boosting
Authors	Adriana-Simona Mihaita, Zheyuan Liu, Chen Cai, Marian-Andrei Rizoiu
Abstract	Predicting traffic incident duration is a major challenge for many traffic centres around the world. Most research studies focus on predicting the incident duration on motorways rather than arterial roads, due to a high network complexity and lack of data. In this paper we propose a bi-level framework for predicting the accident duration on arterial road networks in Sydney, based on operational requirements of incident clearance target which is less than 45 minutes. Using incident baseline information, we first deploy a classification method using various ensemble tree models in order to predict whether a new incident will be cleared in less than 45 minutes or not. If the incident was classified as short-term, then various regression models are developed for predicting the actual incident duration in minutes by incorporating various traffic flow features. After outlier removal and intensive model hyper-parameter tuning through randomized search and cross-validation, we show that the extreme gradient boost approach outperformed all models, including the gradient-boosted decision-trees by almost 53%. Finally, we perform a feature importance evaluation for incident duration prediction and show that the best prediction results are obtained when leveraging the real-time traffic flow in road sections in the vicinity of the reported accident location.
Tasks	Feature Importance
Published	2019-05-29
URL	https://arxiv.org/abs/1905.12254v1
PDF	https://arxiv.org/pdf/1905.12254v1.pdf
PWC	https://paperswithcode.com/paper/arterial-incident-duration-prediction-using-a
Repo
Framework
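
The bi-level flow in the abstract (classify first, then regress only for incidents classified as short-term) can be sketched with pluggable models. Everything concrete below is a stand-in: `toy_classifier`, `toy_regressor`, and the `lanes_blocked` feature are hypothetical and only show how the two stages connect, whereas the paper uses tuned tree ensembles.

```python
def predict_incident(features, is_short_term, duration_model):
    """Stage 1 classifies; stage 2 estimates minutes only for short-term incidents."""
    if is_short_term(features):
        return {"short_term": True, "duration_min": duration_model(features)}
    return {"short_term": False, "duration_min": None}


def toy_classifier(features):
    """Hypothetical stand-in for the stage-1 classifier (under 45 minutes or not)."""
    return features["lanes_blocked"] <= 1


def toy_regressor(features):
    """Hypothetical stand-in for the stage-2 duration regressor, in minutes."""
    return 10 + 15 * features["lanes_blocked"]


short = predict_incident({"lanes_blocked": 1}, toy_classifier, toy_regressor)
long_running = predict_incident({"lanes_blocked": 3}, toy_classifier, toy_regressor)
```

Keeping the stages separate means the regressor is trained and evaluated only on the incidents it will actually be asked about.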

Recognition of Advertisement Emotions with Application to Computational Advertising


Title	Recognition of Advertisement Emotions with Application to Computational Advertising
Authors	Abhinav Shukla, Shruti Shriya Gullapuram, Harish Katti, Mohan Kankanhalli, Stefan Winkler, Ramanathan Subramanian
Abstract	Advertisements (ads) often contain strong affective content to capture viewer attention and convey an effective message to the audience. However, most computational affect recognition (AR) approaches examine ads via the text modality, and only limited work has been devoted to decoding ad emotions from audiovisual or user cues. This work (1) compiles an affective ad dataset capable of evoking coherent emotions across users; (2) explores the efficacy of content-centric convolutional neural network (CNN) features for AR vis-à-vis handcrafted audio-visual descriptors; (3) examines user-centric ad AR from Electroencephalogram (EEG) responses acquired during ad-viewing, and (4) demonstrates how better affect predictions facilitate effective computational advertising as determined by a study involving 18 users. Experiments reveal that (a) CNN features outperform audiovisual descriptors for content-centric AR; (b) EEG features are able to encode ad-induced emotions better than content-based features; (c) Multi-task learning performs best among a slew of classification algorithms to achieve optimal AR, and (d) Pursuant to (b), EEG features also enable optimized ad insertion onto streamed video, as compared to content-based or manual insertion techniques in terms of ad memorability and overall user experience.
Tasks	EEG, Multi-Task Learning
Published	2019-04-03
URL	http://arxiv.org/abs/1904.01778v1
PDF	http://arxiv.org/pdf/1904.01778v1.pdf
PWC	https://paperswithcode.com/paper/recognition-of-advertisement-emotions-with
Repo
Framework

Incremental Reinforcement Learning — a New Continuous Reinforcement Learning Frame Based on Stochastic Differential Equation methods


Title	Incremental Reinforcement Learning — a New Continuous Reinforcement Learning Frame Based on Stochastic Differential Equation methods
Authors	Tianhao Chen, Limei Cheng, Yang Liu, Wenchuan Jia, Shugen Ma
Abstract	Continuous reinforcement learning methods such as DDPG and A3C are widely used in robot control and autonomous driving. However, both methods have theoretical weaknesses. While DDPG cannot control noise in the control process, A3C does not satisfy the continuity conditions under the Gaussian policy. To address these concerns, we propose a new continuous reinforcement learning method based on stochastic differential equations and we call it Incremental Reinforcement Learning (IRL). This method not only guarantees the continuity of actions within any time interval, but also controls the variance of actions in the training process. In addition, our method does not assume Markov control in agents’ action control and allows agents to predict scene changes for action selection. With our method, agents no longer passively adapt to the environment. Instead, they positively interact with the environment for maximum rewards.
Tasks	Autonomous Driving
Published	2019-08-08
URL	https://arxiv.org/abs/1908.02974v1
PDF	https://arxiv.org/pdf/1908.02974v1.pdf
PWC	https://paperswithcode.com/paper/incremental-reinforcement-learning-a-new
Repo
Framework

Interpreting a Recurrent Neural Network Model for ICU Mortality Using Learned Binary Masks


Title	Interpreting a Recurrent Neural Network Model for ICU Mortality Using Learned Binary Masks
Authors	Long V. Ho, Melissa D. Aczon, David Ledbetter, Randall Wetzel
Abstract	An attribution method was developed to interpret a recurrent neural network (RNN) trained to predict a child’s risk of ICU mortality using multi-modal, time series data in the Electronic Medical Records. By learning a sparse, binary mask that highlights salient features of the input data, critical features determining an individual patient’s severity of illness could be identified. The method, called Learned Binary Masks (LBM), demonstrated that the RNN used different feature sets specific to each patient’s illness; and further, the features highlighted aligned with clinical intuition of the patient’s disease trajectories. LBM was also used to identify the most salient features across the model, analogous to “feature importance” computed in the Random Forest. This measure of the RNN’s feature importance was further used to select the 25% most used features for training a second RNN model. Interestingly, but not surprisingly, the second model maintained similar performance to the model trained on all features. LBM is data-agnostic and can be used to interpret the predictions of any differentiable model.
Tasks	Feature Importance, Time Series
Published	2019-05-23
URL	https://arxiv.org/abs/1905.09865v2
PDF	https://arxiv.org/pdf/1905.09865v2.pdf
PWC	https://paperswithcode.com/paper/interpreting-a-recurrent-neural-network-model
Repo
Framework
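
The feature-selection step in the abstract (keep the 25% most used features and retrain) can be sketched as follows. It assumes that "most used" means the fraction of patients whose learned binary mask switches a feature on; that reading, and the helper names `feature_usage` and `top_fraction`, are assumptions for illustration.

```python
import math


def feature_usage(masks):
    """Fraction of patients whose binary mask selects each feature."""
    n = len(masks)
    return [sum(column) / n for column in zip(*masks)]


def top_fraction(usage, fraction=0.25):
    """Indices of the most used features, keeping at least one."""
    keep = max(1, math.ceil(len(usage) * fraction))
    ranked = sorted(range(len(usage)), key=lambda i: usage[i], reverse=True)
    return sorted(ranked[:keep])


# Toy masks: three patients, four features.
masks = [[1, 0, 0, 1], [1, 0, 1, 1], [1, 0, 0, 0]]
selected = top_fraction(feature_usage(masks), fraction=0.25)
```

The selected indices would then define the reduced input for the second model.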

Naive probability


Title	Naive probability
Authors	Zalan Gyenis, Andras Kornai
Abstract	We describe a rational, but low resolution model of probability.
Tasks
Published	2019-05-20
URL	https://arxiv.org/abs/1905.10924v1
PDF	https://arxiv.org/pdf/1905.10924v1.pdf
PWC	https://paperswithcode.com/paper/190510924
Repo
Framework

Hierarchical Target-Attentive Diagnosis Prediction in Heterogeneous Information Networks


Title	Hierarchical Target-Attentive Diagnosis Prediction in Heterogeneous Information Networks
Authors	Anahita Hosseini, Tyler Davis, Majid Sarrafzadeh
Abstract	We introduce HTAD, a novel model for diagnosis prediction using Electronic Health Records (EHR) represented as Heterogeneous Information Networks. Recent studies on modeling EHR have shown success in automatically learning representations of the clinical records in order to avoid the need for manual feature selection. However, these representations are often learned and aggregated without specificity for the different possible targets being predicted. Our model introduces a target-aware hierarchical attention mechanism that allows it to learn to attend to the most important clinical records when aggregating their representations for prediction of a diagnosis. We evaluate our model using a publicly available benchmark dataset and demonstrate that the use of target-aware attention significantly improves performance compared to the current state of the art. Additionally, we propose a method for incorporating non-categorical data into our predictions and demonstrate that this technique leads to further performance improvements. Lastly, we demonstrate that the predictions made by our proposed model are easily interpretable.
Tasks	Feature Selection
Published	2019-12-22
URL	https://arxiv.org/abs/1912.10552v1
PDF	https://arxiv.org/pdf/1912.10552v1.pdf
PWC	https://paperswithcode.com/paper/hierarchical-target-attentive-diagnosis
Repo
Framework
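
Target-aware attention, as described in the abstract, weights clinical records by their relevance to the diagnosis being predicted before aggregating them. The sketch below uses dot-product scores and a softmax, which is an assumed scoring function; it models a single attention level only, not the hierarchy, and `attend` is a hypothetical name.

```python
import math


def attend(target, records):
    """Return softmax attention weights and the weighted sum of record vectors."""
    scores = [sum(t * r for t, r in zip(target, record)) for record in records]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(records[0])
    summary = [
        sum(w * record[d] for w, record in zip(weights, records))
        for d in range(dim)
    ]
    return weights, summary
```

Because the weights depend on the target vector, the same set of records yields a different summary for each candidate diagnosis.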

Distinguishing Clinical Sentiment: The Importance of Domain Adaptation in Psychiatric Patient Health Records


Title	Distinguishing Clinical Sentiment: The Importance of Domain Adaptation in Psychiatric Patient Health Records
Authors	Eben Holderness, Philip Cawkwell, Kirsten Bolton, James Pustejovsky, Mei-Hua Hall
Abstract	Recently natural language processing (NLP) tools have been developed to identify and extract salient risk indicators in electronic health records (EHRs). Sentiment analysis, although widely used in non-medical areas for improving decision making, has been studied minimally in the clinical setting. In this study, we undertook, to our knowledge, the first domain adaptation of sentiment analysis to psychiatric EHRs by defining psychiatric clinical sentiment, performing an annotation project, and evaluating multiple sentence-level sentiment machine learning (ML) models. Results indicate that off-the-shelf sentiment analysis tools fail in identifying clinically positive or negative polarity, and that the definition of clinical sentiment that we provide is learnable with relatively small amounts of training data. This project is an initial step towards further refining sentiment analysis methods for clinical use. Our long-term objective is to incorporate the results of this project as part of a machine learning model that predicts inpatient readmission risk. We hope that this work will initiate a discussion concerning domain adaptation of sentiment analysis to the clinical setting.
Tasks	Decision Making, Domain Adaptation, Sentiment Analysis
Published	2019-04-05
URL	http://arxiv.org/abs/1904.03225v1
PDF	http://arxiv.org/pdf/1904.03225v1.pdf
PWC	https://paperswithcode.com/paper/distinguishing-clinical-sentiment-the
Repo
Framework