Paper Group NANR 117
Ballistocardiogram artifact reduction in simultaneous EEG-fMRI using deep learning. NCUEE at MEDIQA 2019: Medical Text Inference Using Ensemble BERT-BiLSTM-Attention Model. Did It Change? Learning to Detect Point-Of-Interest Changes for Proactive Map Updates. Combining Local and Document-Level Context: The LMU Munich Neural Machine Translation Syst …
Ballistocardiogram artifact reduction in simultaneous EEG-fMRI using deep learning
Title | Ballistocardiogram artifact reduction in simultaneous EEG-fMRI using deep learning |
Authors | J. R. McIntosh, J. Yao, Linbi Hong, J. Faller, P. Sajda |
Abstract | Objective: The concurrent recording of electroencephalography (EEG) and functional magnetic resonance imaging (fMRI) is a technique that has received much attention due to its potential for combined high temporal and spatial resolution. However, the ballistocardiogram (BCG), a large-amplitude artifact caused by cardiac-induced movement, contaminates the EEG during EEG-fMRI recordings. Removal of the BCG in software has generally made use of linear decompositions of the corrupted EEG. This is not ideal, as the BCG signal is non-stationary and propagates in a manner which is non-linearly dependent on the electrocardiogram (ECG). In this paper, we present a novel method (BCGNet) for BCG artifact suppression using recurrent neural networks (RNNs). Methods: EEG signals were recovered by training RNNs on the nonlinear mappings between the ECG and the BCG-corrupted EEG. We evaluated our model's performance against the commonly used Optimal Basis Set (OBS) method at the level of individual subjects, and investigated generalization across subjects. Results: We show that our algorithm achieves a larger average power reduction of the BCG at critical frequencies, while simultaneously improving task-relevant EEG-based classification. Conclusion: The presented deep learning architecture can be used to reduce BCG-related artifacts in EEG-fMRI recordings. Significance: We present a deep learning approach that can be used to suppress the BCG artifact in EEG-fMRI without the use of additional hardware. This method may have scope to be combined with current hardware methods, operate in real time and be used for direct modeling of the BCG. |
Tasks | EEG, EEG Artifact Removal, Electrocardiography (ECG) |
Published | 2019-10-15 |
URL | https://arxiv.org/abs/1910.06659 |
https://arxiv.org/pdf/1910.06659 | |
PWC | https://paperswithcode.com/paper/ballistocardiogram-artifact-reduction-in |
Repo | |
Framework | |
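To make the idea concrete, here is a minimal PyTorch sketch of the regression setup the abstract describes, not the authors' BCGNet: an RNN is trained to predict each EEG channel from the simultaneous ECG. Since neural activity is essentially unpredictable from the ECG, the fit can only capture the BCG component, which is then subtracted. The GRU choice, layer sizes, and training loop are assumptions.

```python
import torch
import torch.nn as nn

class BCGRegressor(nn.Module):
    def __init__(self, hidden=64, n_eeg_channels=63):
        super().__init__()
        self.rnn = nn.GRU(input_size=1, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_eeg_channels)

    def forward(self, ecg):                 # ecg: (batch, time, 1)
        h, _ = self.rnn(ecg)
        return self.head(h)                 # predicted BCG: (batch, time, chans)

model = BCGRegressor()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
ecg = torch.randn(8, 500, 1)                # toy stand-ins for real recordings
eeg_corrupted = torch.randn(8, 500, 63)

for _ in range(10):                         # fit the ECG -> BCG mapping
    loss = nn.functional.mse_loss(model(ecg), eeg_corrupted)
    opt.zero_grad(); loss.backward(); opt.step()

eeg_clean = eeg_corrupted - model(ecg).detach()   # artifact-suppressed EEG
```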
NCUEE at MEDIQA 2019: Medical Text Inference Using Ensemble BERT-BiLSTM-Attention Model
Title | NCUEE at MEDIQA 2019: Medical Text Inference Using Ensemble BERT-BiLSTM-Attention Model |
Authors | Lung-Hao Lee, Yi Lu, Po-Han Chen, Po-Lei Lee, Kuo-Kai Shyu |
Abstract | This study describes the model design of the NCUEE system for the MEDIQA challenge at the ACL-BioNLP 2019 workshop. We use BERT (Bidirectional Encoder Representations from Transformers) as the word embedding method and integrate a BiLSTM (Bidirectional Long Short-Term Memory) network with an attention mechanism for medical text inference. A total of 42 teams participated in the natural language inference task at MEDIQA 2019. Our best accuracy score of 0.84 ranked in the top third of all submissions on the leaderboard. |
Tasks | Natural Language Inference |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-5058/ |
https://www.aclweb.org/anthology/W19-5058 | |
PWC | https://paperswithcode.com/paper/ncuee-at-mediqa-2019-medical-text-inference |
Repo | |
Framework | |
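A hedged sketch of the described architecture: a BiLSTM with additive attention pooling over token representations and a 3-way softmax for entailment/neutral/contradiction. The random 768-dimensional input stands in for BERT token embeddings; hidden sizes and the attention form are assumptions, and the ensemble step (e.g., averaging logits over several trained instances) is omitted.

```python
import torch
import torch.nn as nn

class BiLSTMAttention(nn.Module):
    def __init__(self, bert_dim=768, hidden=256, n_classes=3):
        super().__init__()
        self.lstm = nn.LSTM(bert_dim, hidden, batch_first=True,
                            bidirectional=True)
        self.attn = nn.Linear(2 * hidden, 1)     # additive attention scores
        self.out = nn.Linear(2 * hidden, n_classes)

    def forward(self, bert_states):              # (batch, seq, bert_dim)
        h, _ = self.lstm(bert_states)            # (batch, seq, 2*hidden)
        w = torch.softmax(self.attn(h), dim=1)   # normalize over the sequence
        pooled = (w * h).sum(dim=1)              # attention-weighted pooling
        return self.out(pooled)                  # entail / neutral / contradict

logits = BiLSTMAttention()(torch.randn(4, 128, 768))
print(logits.shape)                              # torch.Size([4, 3])
```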
Did It Change? Learning to Detect Point-Of-Interest Changes for Proactive Map Updates
Title | Did It Change? Learning to Detect Point-Of-Interest Changes for Proactive Map Updates |
Authors | Jerome Revaud, Minhyeok Heo, Rafael S. Rezende, Chanmi You, Seong-Gyun Jeong |
Abstract | Maps are an increasingly important tool in our daily lives, yet their rich semantic content still largely depends on manual input. Motivated by the broad availability of geo-tagged street-view images, we propose a new task aiming to make the map update process more proactive. We focus on automatically detecting changes of Points of Interest (POIs), specifically stores or shops of any kind, based on visual input. Faced with the lack of an appropriate benchmark, we build and release a large dataset, captured in two large shopping centers, that comprises 33K geo-localized images and 578 POIs. We then design a generic approach that compares two image sets captured in the same venue at different times and outputs POI changes as a ranked list of map locations. In contrast to logo or franchise recognition approaches, our system does not depend on an external franchise database. It is instead inspired by recent deep metric learning approaches that learn a similarity function fit to the task at hand. We compare various loss functions to learn a metric aligned with the POI change detection goal, and report promising results. |
Tasks | Metric Learning |
Published | 2019-06-01 |
URL | http://openaccess.thecvf.com/content_CVPR_2019/html/Revaud_Did_It_Change_Learning_to_Detect_Point-Of-Interest_Changes_for_Proactive_CVPR_2019_paper.html |
http://openaccess.thecvf.com/content_CVPR_2019/papers/Revaud_Did_It_Change_Learning_to_Detect_Point-Of-Interest_Changes_for_Proactive_CVPR_2019_paper.pdf | |
PWC | https://paperswithcode.com/paper/did-it-change-learning-to-detect-point-of |
Repo | |
Framework | |
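The metric-learning ingredient can be sketched with a standard triplet loss: embed store-front images so that the same POI photographed at two times stays close while different POIs are pushed apart. The toy backbone and margin below are assumptions; the paper compares several loss functions.

```python
import torch
import torch.nn as nn

embed = nn.Sequential(                    # toy embedding; a real system would
    nn.Conv2d(3, 16, 3, stride=2),        # use a pretrained CNN backbone
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(16, 64),
)
triplet = nn.TripletMarginLoss(margin=0.2)

anchor   = torch.randn(8, 3, 64, 64)      # a POI seen at time t
positive = torch.randn(8, 3, 64, 64)      # the same POI at time t+1
negative = torch.randn(8, 3, 64, 64)      # a different POI

loss = triplet(embed(anchor), embed(positive), embed(negative))
# At test time a large anchor/positive embedding distance at a map location
# is evidence of a POI change; locations are ranked by that score.
```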
Combining Local and Document-Level Context: The LMU Munich Neural Machine Translation System at WMT19
Title | Combining Local and Document-Level Context: The LMU Munich Neural Machine Translation System at WMT19 |
Authors | Dario Stojanovski, Alexander Fraser |
Abstract | We describe LMU Munich's machine translation system for English→German translation, which was used to participate in the WMT19 shared task on supervised news translation. We specifically participated in the document-level MT track. The system used as the primary submission is a context-aware Transformer capable of both rich modeling of limited contextual information and integration of large-scale document-level context with a less rich representation. We train this model by fine-tuning a big Transformer baseline. Our experimental results show that document-level context provides large improvements in translation quality, and adding a rich representation of the previous sentence provides a small additional gain. |
Tasks | Machine Translation |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-5345/ |
https://www.aclweb.org/anthology/W19-5345 | |
PWC | https://paperswithcode.com/paper/combining-local-and-document-level-context |
Repo | |
Framework | |
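One common way to realize "integration of large-scale document-level context with a less rich representation" is a learned gate that mixes a pooled context vector into the decoder states. The sketch below is an illustrative assumption, not the LMU system itself.

```python
import torch
import torch.nn as nn

class GatedContext(nn.Module):
    def __init__(self, d_model=512):
        super().__init__()
        self.gate = nn.Linear(2 * d_model, d_model)

    def forward(self, h, ctx):
        # h:   (batch, tgt_len, d_model) decoder states, current sentence
        # ctx: (batch, 1, d_model)       pooled document-level context
        ctx = ctx.expand_as(h)
        g = torch.sigmoid(self.gate(torch.cat([h, ctx], dim=-1)))
        return g * h + (1 - g) * ctx      # the gate decides how much context
                                          # to mix into each decoder state

out = GatedContext()(torch.randn(2, 20, 512), torch.randn(2, 1, 512))
```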
Learning Goal-Conditioned Value Functions with one-step Path rewards rather than Goal-Rewards
Title | Learning Goal-Conditioned Value Functions with one-step Path rewards rather than Goal-Rewards |
Authors | Vikas Dhiman, Shurjo Banerjee, Jeffrey M Siskind, Jason J Corso |
Abstract | Multi-goal reinforcement learning (MGRL) addresses tasks where the desired goal state can change for every trial. State-of-the-art algorithms model these problems such that the reward formulation depends on the goals, to associate them with high reward. This dependence introduces additional goal-reward resampling steps in algorithms like Hindsight Experience Replay (HER) that reuse trials in which the agent fails to reach the goal by recomputing rewards as if reached states were pseudo-desired goals. We propose a reformulation of goal-conditioned value functions for MGRL that yields a similar algorithm while removing the dependence of reward functions on the goal. Our formulation thus obviates the reward recomputation required by HER and its extensions. We also extend a closely related algorithm, Floyd-Warshall Reinforcement Learning, from tabular domains to deep neural networks for use as a baseline. Our results are competitive with HER while substantially improving sampling efficiency in terms of reward computation. |
Tasks | Multi-Goal Reinforcement Learning |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=BkesGnCcFX |
https://openreview.net/pdf?id=BkesGnCcFX | |
PWC | https://paperswithcode.com/paper/learning-goal-conditioned-value-functions |
Repo | |
Framework | |
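The abstract's key point, a reward that does not depend on the goal, can be illustrated with tabular goal-conditioned Q-learning on a toy chain: every transition costs -1 (a one-step path reward), and the goal enters only through termination and the indexing of Q. All hyperparameters below are arbitrary assumptions.

```python
import numpy as np

n_states, gamma, alpha = 10, 0.99, 0.5
actions = [-1, +1]                        # step left / right along a chain
Q = np.zeros((n_states, 2, n_states))     # Q[state, action, goal]

rng = np.random.default_rng(0)
for _ in range(2000):
    s, g = rng.integers(0, n_states, size=2)
    for _ in range(50):
        if s == g:                        # the goal only terminates episodes;
            break                         # it never enters the reward
        a = rng.integers(2) if rng.random() < 0.2 else int(Q[s, :, g].argmax())
        s2 = int(np.clip(s + actions[a], 0, n_states - 1))
        target = -1.0 + (0.0 if s2 == g else gamma * Q[s2, :, g].max())
        Q[s, a, g] += alpha * (target - Q[s, a, g])
        s = s2

# -Q[s, a, g] approximates the remaining path length from s to g (about 5
# steps from state 0 to goal 5 after training).
print(-Q[0, :, 5].max())
```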
Residual Networks Classify Inputs Based on Their Neural Transient Dynamics
Title | Residual Networks Classify Inputs Based on Their Neural Transient Dynamics |
Authors | Fereshteh Lagzi |
Abstract | In this study, we analyze the input-output behavior of residual networks from a dynamical-systems point of view by disentangling the residual dynamics from the output activities before the classification stage. For a network with simple skip connections between every successive layer, a logistic activation function, and shared weights between layers, we show analytically that there are cooperation and competition dynamics between the residuals corresponding to each input dimension. Interpreting these kinds of networks as nonlinear filters, the steady-state values of the residuals in the case of attractor networks are indicative of the common features between different input dimensions that the network has observed during training and encoded in those components. In cases where residuals do not converge to an attractor state, their internal dynamics are separable for each input class, and the network can reliably approximate the output. We provide analytical and empirical evidence that residual networks classify inputs based on the integration of the transient dynamics of the residuals, and show how the network responds to input perturbations. We compare the network dynamics for a ResNet and a multi-layer perceptron and show that the internal dynamics and the noise evolution are fundamentally different in these networks, and that ResNets are more robust to noisy inputs. Based on these findings, we also develop a new method to adjust the depth of residual networks during training. As it turns out, after pruning the depth of a ResNet using this algorithm, the network is still capable of classifying inputs with high accuracy. |
Tasks | |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=Hkl84iCcFm |
https://openreview.net/pdf?id=Hkl84iCcFm | |
PWC | https://paperswithcode.com/paper/residual-networks-classify-inputs-based-on |
Repo | |
Framework | |
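The dynamical-systems reading is easy to reproduce in miniature: a shared-weight ResNet with logistic activations is the iterated map x_{t+1} = x_t + sigmoid(W x_t + b), whose residuals may settle (the attractor case) or keep evolving, as discussed in the abstract. The sizes and random weights below are toy assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
W = 0.1 * rng.standard_normal((8, 8))     # shared weights across all layers
b = 0.1 * rng.standard_normal(8)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

def trajectory(x, depth=50):
    xs = [x]
    for _ in range(depth):                # each "layer" is one step of the map
        x = x + sigmoid(W @ x + b)
        xs.append(x)
    return np.array(xs)

traj = trajectory(rng.standard_normal(8))
residuals = np.diff(traj, axis=0)         # per-layer residual dynamics; they
                                          # settle or keep evolving depending
                                          # on W (attractor vs. transient case)
transient_integral = traj.sum(axis=0)     # integrated transient, the quantity
                                          # the classification stage reads out
```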
Poetry to Prose Conversion in Sanskrit as a Linearisation Task: A Case for Low-Resource Languages
Title | Poetry to Prose Conversion in Sanskrit as a Linearisation Task: A Case for Low-Resource Languages |
Authors | Amrith Krishna, Vishnu Sharma, Bishal Santra, Aishik Chakraborty, Pavankumar Satuluri, Pawan Goyal |
Abstract | The word ordering in a Sanskrit verse is often not aligned with its corresponding prose order. Conversion of the verse to its corresponding prose helps in better comprehension of the construction. Owing to resource constraints, we formulate this task as a word-ordering (linearisation) task. In doing so, we completely ignore the word arrangement on the verse side. kāvya guru, the approach we propose, essentially consists of a pipeline of two pretraining steps followed by a seq2seq model. The first pretraining step learns task-specific token embeddings from pretrained embeddings. In the next step, we generate multiple hypotheses for possible word arrangements of the input using another pretraining step. We then use them as inputs to a neural seq2seq model for the final prediction. We empirically show that the hypotheses generated by our pretraining step result in predictions that consistently outperform predictions based on the original order in the verse. Overall, kāvya guru outperforms current state-of-the-art models in linearisation for the poetry-to-prose conversion task in Sanskrit. |
Tasks | |
Published | 2019-07-01 |
URL | https://www.aclweb.org/anthology/P19-1111/ |
https://www.aclweb.org/anthology/P19-1111 | |
PWC | https://paperswithcode.com/paper/poetry-to-prose-conversion-in-sanskrit-as-a |
Repo | |
Framework | |
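Stripped to its essentials, linearisation is a search over word orders scored by a sequence model. In the sketch below, a hand-made bigram scorer stands in for the paper's pretrained seq2seq model, and exhaustive permutation replaces its hypothesis-generation step; both are heavy simplifying assumptions.

```python
import itertools

# Toy bigram log-probabilities; a pretrained seq2seq model plays this role
# in the paper.
bigram_logprob = {("rama", "goes"): -0.2, ("goes", "home"): -0.3,
                  ("home", "goes"): -2.0, ("goes", "rama"): -1.5}

def score(order, default=-3.0):
    return sum(bigram_logprob.get(pair, default)
               for pair in zip(order, order[1:]))

bag = ["home", "rama", "goes"]            # the verse's words, order ignored
hypotheses = sorted(itertools.permutations(bag), key=score, reverse=True)
print(hypotheses[0])                      # ('rama', 'goes', 'home')
```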
Show, Describe and Conclude: On Exploiting the Structure Information of Chest X-ray Reports
Title | Show, Describe and Conclude: On Exploiting the Structure Information of Chest X-ray Reports |
Authors | Baoyu Jing, Zeya Wang, Eric Xing |
Abstract | Chest X-ray (CXR) images are commonly used for clinical screening and diagnosis. Automatically writing reports for these images can considerably lighten the workload of radiologists in summarizing descriptive findings and conclusive impressions. The complex structures between and within sections of the reports pose a great challenge to automatic report generation. Specifically, the section Impression is a diagnostic summarization over the section Findings, and the appearance of normality dominates each section over that of abnormality. Existing studies rarely explore and consider this fundamental structure information. In this work, we propose a novel framework that exploits the structure information between and within report sections for generating CXR imaging reports. First, we propose a two-stage strategy that explicitly models the relationship between Findings and Impression. Second, we design a novel co-operative multi-agent system that implicitly captures the imbalanced distribution between abnormality and normality. Experiments on two CXR report datasets show that our method achieves state-of-the-art performance in terms of various evaluation metrics. Our results show that the proposed approach is able to generate high-quality medical reports by integrating the structure information. |
Tasks | |
Published | 2019-07-01 |
URL | https://www.aclweb.org/anthology/P19-1657/ |
https://www.aclweb.org/anthology/P19-1657 | |
PWC | https://paperswithcode.com/paper/show-describe-and-conclude-on-exploiting-the |
Repo | |
Framework | |
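A hedged skeleton of the two-stage structure: a Findings decoder conditioned on image features, then an Impression decoder conditioned on the Findings state. All module choices (GRU decoders, dimensions) are stand-ins, and the co-operative multi-agent component is omitted.

```python
import torch
import torch.nn as nn

class TwoStageReporter(nn.Module):
    def __init__(self, d=256, vocab=5000):
        super().__init__()
        self.encoder = nn.Linear(2048, d)          # stand-in image encoder
        self.findings_dec = nn.GRU(d, d, batch_first=True)
        self.impression_dec = nn.GRU(d, d, batch_first=True)
        self.emb = nn.Embedding(vocab, d)
        self.out = nn.Linear(d, vocab)

    def forward(self, img_feat, findings_tok, impression_tok):
        h0 = self.encoder(img_feat).unsqueeze(0)   # (1, batch, d)
        f, hf = self.findings_dec(self.emb(findings_tok), h0)
        # Stage 2 sees the image only through the final Findings state,
        # making Impression an explicit summary of Findings.
        i, _ = self.impression_dec(self.emb(impression_tok), hf)
        return self.out(f), self.out(i)

f_logits, i_logits = TwoStageReporter()(torch.randn(2, 2048),
                                        torch.randint(0, 5000, (2, 30)),
                                        torch.randint(0, 5000, (2, 10)))
```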
Reducing Overconfident Errors outside the Known Distribution
Title | Reducing Overconfident Errors outside the Known Distribution |
Authors | Zhizhong Li, Derek Hoiem |
Abstract | Intuitively, unfamiliarity should lead to a lack of confidence. In reality, current algorithms often make highly confident yet wrong predictions when faced with unexpected test samples from an unknown distribution different from training. Unlike domain adaptation methods, we cannot gather an "unexpected dataset" prior to test, and unlike novelty detection methods, a best-effort original task prediction is still expected. We compare a number of methods from related fields such as calibration and epistemic uncertainty modeling, as well as two proposed methods that reduce overconfident errors on samples from an unknown novel distribution without drastically increasing evaluation time: (1) G-distillation, which trains an ensemble of classifiers and then distills it into a single model using both labeled and unlabeled examples, and (2) NCR, which reduces prediction confidence based on a novelty detection score. Experimentally, we investigate the overconfidence problem and evaluate our solutions by creating "familiar" and "novel" test splits, where "familiar" samples are identically distributed with training and "novel" samples are not. We find that calibrating with temperature scaling on familiar data is the best single-model method for improving confidence on novel data, followed by our proposed methods. In addition, some methods' NLL performance is roughly equivalent to that of a regularly trained model with a certain degree of smoothing. Calibration can also reduce confident errors, for example by 95% in gender recognition on demographic groups different from the training data. |
Tasks | Calibration, Domain Adaptation |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=S1giro05t7 |
https://openreview.net/pdf?id=S1giro05t7 | |
PWC | https://paperswithcode.com/paper/reducing-overconfident-errors-outside-the |
Repo | |
Framework | |
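Temperature scaling, which the abstract reports as the best single-model method, is simple enough to show exactly: fit one scalar T on held-out "familiar" logits by minimizing NLL, then divide all logits by T at test time. Only the random stand-in data below is an assumption.

```python
import torch
import torch.nn.functional as F

val_logits = torch.randn(512, 10)           # held-out "familiar" logits (toy)
val_labels = torch.randint(0, 10, (512,))

log_T = torch.zeros(1, requires_grad=True)  # optimize log T so T stays > 0
opt = torch.optim.LBFGS([log_T], lr=0.1, max_iter=50)

def closure():
    opt.zero_grad()
    loss = F.cross_entropy(val_logits / log_T.exp(), val_labels)
    loss.backward()
    return loss

opt.step(closure)
T = log_T.exp().item()
calibrated = F.softmax(val_logits / T, dim=1)   # softened test-time confidence
```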
A Similarity-preserving Network Trained on Transformed Images Recapitulates Salient Features of the Fly Motion Detection Circuit
Title | A Similarity-preserving Network Trained on Transformed Images Recapitulates Salient Features of the Fly Motion Detection Circuit |
Authors | Yanis Bahroun, Dmitri Chklovskii, Anirvan Sengupta |
Abstract | Learning to detect content-independent transformations from data is one of the central problems in biological and artificial intelligence. An example of such a problem is the unsupervised learning of a visual motion detector from pairs of consecutive video frames. Rao and Ruderman formulated this problem in terms of learning infinitesimal transformation operators (Lie group generators) via minimizing image reconstruction error. Unfortunately, it is difficult to map their model onto a biologically plausible neural network (NN) with local learning rules. Here we propose a biologically plausible model of motion detection. We also adopt the transformation-operator approach but, instead of reconstruction-error minimization, start with a similarity-preserving objective function. An online algorithm that optimizes such an objective function naturally maps onto an NN with biologically plausible learning rules. The trained NN recapitulates major features of the well-studied motion detector in the fly. In particular, it is consistent with the experimental observation that local motion detectors combine information from at least three adjacent pixels, something that contradicts the celebrated Hassenstein-Reichardt model. |
Tasks | Image Reconstruction, Motion Detection |
Published | 2019-12-01 |
URL | http://papers.nips.cc/paper/9566-a-similarity-preserving-network-trained-on-transformed-images-recapitulates-salient-features-of-the-fly-motion-detection-circuit |
http://papers.nips.cc/paper/9566-a-similarity-preserving-network-trained-on-transformed-images-recapitulates-salient-features-of-the-fly-motion-detection-circuit.pdf | |
PWC | https://paperswithcode.com/paper/a-similarity-preserving-network-trained-on |
Repo | |
Framework | |
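The similarity-preserving, biologically plausible ingredient can be sketched with the classic online Hebbian/anti-Hebbian network for similarity matching: fast linear dynamics compute the output, and purely local rules update the feedforward (W) and lateral (M) weights. The learning rates and the Gaussian stand-in input are assumptions; the paper applies rules of this family to pairs of transformed frames.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_out, eta, tau = 16, 4, 0.01, 0.5
W = 0.1 * rng.standard_normal((n_out, n_in))  # feedforward (Hebbian) weights
M = np.eye(n_out)                             # lateral (anti-Hebbian) weights

for _ in range(5000):
    x = rng.standard_normal(n_in)             # stand-in for an input frame pair
    y = np.linalg.solve(M, W @ x)             # fast neural dynamics: M y = W x
    W += eta * (np.outer(y, x) - W)           # local Hebbian update
    M += (eta / tau) * (np.outer(y, y) - M)   # local anti-Hebbian update

# For Gaussian input the trained network projects onto the input's principal
# subspace while preserving pairwise similarities.
```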
Monge-Ampère Flow for Generative Modeling
Title | Monge-Ampère Flow for Generative Modeling |
Authors | Linfeng Zhang, Weinan E, Lei Wang |
Abstract | We present a deep generative model, named Monge-Ampère flow, which builds on the continuous-time gradient flow arising from the Monge-Ampère equation in optimal transport theory. The generative map from the latent space to the data space follows a dynamical system, where a learnable potential function guides a compressible fluid to flow towards the target density distribution. Training the model amounts to solving an optimal control problem. The Monge-Ampère flow has tractable likelihoods and supports efficient sampling and inference. One can easily impose symmetry constraints in the generative model by designing suitable scalar potential functions. We apply the approach to unsupervised density estimation of the MNIST dataset and variational calculation of the two-dimensional Ising model at the critical point. This approach brings insights and techniques from the Monge-Ampère equation, optimal transport, and fluid dynamics into reversible flow-based generative models. |
Tasks | Density Estimation |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=rkeUrjCcYQ |
https://openreview.net/pdf?id=rkeUrjCcYQ | |
PWC | https://paperswithcode.com/paper/monge-ampere-flow-for-generative-modeling-1 |
Repo | |
Framework | |
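The core mechanic, transporting samples along the gradient of a scalar potential while tracking log-density through the divergence of that gradient field (the Laplacian of the potential), can be sketched directly. The untrained potential network, Euler integration, and step size below are assumptions; training the potential is omitted.

```python
import torch
import torch.nn as nn

phi = nn.Sequential(nn.Linear(2, 64), nn.Tanh(), nn.Linear(64, 1))

def step(z, logp, dt=0.05):
    z = z.requires_grad_(True)
    grad = torch.autograd.grad(phi(z).sum(), z, create_graph=True)[0]
    div = 0.0                             # divergence of grad phi = Laplacian
    for i in range(z.shape[1]):
        div = div + torch.autograd.grad(grad[:, i].sum(), z,
                                        create_graph=True)[0][:, i]
    # Continuity equation: d(log p)/dt along the flow equals -div(velocity).
    return (z + dt * grad).detach(), logp - dt * div.detach()

z = torch.randn(256, 2)                   # latent Gaussian samples
logp = -0.5 * (z ** 2).sum(1) - torch.log(torch.tensor(2 * torch.pi))
for _ in range(20):                       # Euler-integrate the gradient flow
    z, logp = step(z, logp)               # z now carries the model density
```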
The loss landscape of overparameterized neural networks
Title | The loss landscape of overparameterized neural networks |
Authors | Y. Cooper |
Abstract | We explore some mathematical features of the loss landscape of overparameterized neural networks. A priori, one might imagine that the loss function looks like a typical function from $\mathbb{R}^n$ to $\mathbb{R}$: in particular, nonconvex, with discrete global minima. In this paper, we prove that in at least one important way, the loss function of an overparameterized neural network does not look like a typical function. If a neural net has $n$ parameters and is trained on $d$ data points, with $n>d$, we show that the locus $M$ of global minima of the loss $L$ is usually not discrete, but rather an $(n-d)$-dimensional submanifold of $\mathbb{R}^n$. In practice, neural nets commonly have orders of magnitude more parameters than data points, so this observation implies that $M$ is typically a very high-dimensional subset of $\mathbb{R}^n$. |
Tasks | |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=SyezvsC5tX |
https://openreview.net/pdf?id=SyezvsC5tX | |
PWC | https://paperswithcode.com/paper/the-loss-landscape-of-overparameterized |
Repo | |
Framework | |
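The dimension count is easy to verify numerically in the linear special case, where the global minima of L(w) = ||Xw - y||^2 form an affine subspace of dimension n - d; the paper proves the analogous submanifold statement for neural networks.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 20, 5                              # parameters vs. data points
X = rng.standard_normal((d, n))
y = rng.standard_normal(d)

w_star = np.linalg.lstsq(X, y, rcond=None)[0]   # one global minimum (loss 0)
print(n - np.linalg.matrix_rank(X))             # 15 = n - d flat directions

# Moving along any null-space direction of X stays on the minimum locus M.
v = np.linalg.svd(X)[2][d:].T @ rng.standard_normal(n - d)
print(np.allclose(X @ (w_star + v), X @ w_star))   # True
```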
Fitting Multiple Heterogeneous Models by Multi-Class Cascaded T-Linkage
Title | Fitting Multiple Heterogeneous Models by Multi-Class Cascaded T-Linkage |
Authors | Luca Magri, Andrea Fusiello |
Abstract | This paper addresses the problem of fitting multiple models in the general context where the sought structures can be described by a mixture of heterogeneous parametric models drawn from different classes. To this end, we conceive a multi-model selection framework that extends T-linkage to cope with different nested classes of models. Our method, called MCT, compares favourably with the state of the art on publicly available datasets for various fitting problems: lines and conics, homographies and fundamental matrices, planes and cylinders. |
Tasks | Model Selection |
Published | 2019-06-01 |
URL | http://openaccess.thecvf.com/content_CVPR_2019/html/Magri_Fitting_Multiple_Heterogeneous_Models_by_Multi-Class_Cascaded_T-Linkage_CVPR_2019_paper.html |
http://openaccess.thecvf.com/content_CVPR_2019/papers/Magri_Fitting_Multiple_Heterogeneous_Models_by_Multi-Class_Cascaded_T-Linkage_CVPR_2019_paper.pdf | |
PWC | https://paperswithcode.com/paper/fitting-multiple-heterogeneous-models-by |
Repo | |
Framework | |
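A rough sketch of the preference-analysis machinery behind T-linkage, which MCT extends across nested model classes: sample model hypotheses, give each point a soft preference vector from its residuals, and agglomeratively cluster points under the Tanimoto distance. The thresholds, the single model class (lines), and the fixed cluster count below are all simplifying assumptions.

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
t1, t2 = rng.uniform(0, 1, 50), rng.uniform(0, 1, 50)
pts = np.vstack([np.c_[t1, 0.5 * t1 + 0.01 * rng.standard_normal(50)],
                 np.c_[t2, 1.0 - t2 + 0.01 * rng.standard_normal(50)]])

# Sample line hypotheses from random point pairs; each point's preference for
# a hypothesis decays with its residual.
prefs = []
for _ in range(100):
    p, q = pts[rng.choice(len(pts), 2, replace=False)]
    a, b = q - p
    normal = np.array([-b, a]) / np.hypot(a, b)
    prefs.append(np.exp(-np.abs((pts - p) @ normal) / 0.02))
P = np.array(prefs).T                     # (points, hypotheses)

def tanimoto(u, v):                       # 0 when preferences coincide
    return 1.0 - (u @ v) / (u @ u + v @ v - u @ v)

labels = fcluster(linkage(pdist(P, metric=tanimoto), method='average'),
                  t=2, criterion='maxclust')   # recovers the two lines
```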
Adversarial Attacks on Node Embeddings
Title | Adversarial Attacks on Node Embeddings |
Authors | Aleksandar Bojchevski, Stephan Günnemann |
Abstract | The goal of network representation learning is to learn low-dimensional node embeddings that capture the graph structure and are useful for solving downstream tasks. However, despite the proliferation of such methods, there is currently no study of their robustness to adversarial attacks. We provide the first adversarial vulnerability analysis of the widely used family of methods based on random walks. We derive efficient adversarial perturbations that poison the network structure and have a negative effect on both the quality of the embeddings and the downstream tasks. We further show that our attacks are transferable, since they generalize to many models, and are successful even when the attacker is restricted. |
Tasks | Representation Learning |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=Sye7qoC5FQ |
https://openreview.net/pdf?id=Sye7qoC5FQ | |
PWC | https://paperswithcode.com/paper/adversarial-attacks-on-node-embeddings-1 |
Repo | |
Framework | |
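A much-simplified illustration of poisoning by edge flips: greedily pick the flips that most perturb the leading adjacency eigenvalues, a crude proxy for the random-walk embedding quality that the paper attacks with a principled eigenvalue-perturbation analysis. The toy graph and budget are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 30
A = (rng.random((n, n)) < 0.15).astype(float)
A = np.triu(A, 1); A = A + A.T            # random undirected graph

def top_eigs(A, k=5):
    return np.sort(np.linalg.eigvalsh(A))[-k:]

base = top_eigs(A)
candidates = [(i, j) for i in range(n) for j in range(i + 1, n)]

for _ in range(3):                        # attack budget: 3 edge flips
    def damage(e):
        B = A.copy()
        B[e[0], e[1]] = B[e[1], e[0]] = 1 - B[e[0], e[1]]
        return np.abs(top_eigs(B) - base).sum()
    i, j = max(candidates, key=damage)    # most spectrum-perturbing flip
    A[i, j] = A[j, i] = 1 - A[i, j]
```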
Proceedings of the Fourth Social Media Mining for Health Applications (#SMM4H) Workshop & Shared Task
Title | Proceedings of the Fourth Social Media Mining for Health Applications (#SMM4H) Workshop & Shared Task |
Authors | |
Abstract | |
Tasks | |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-3200/ |
https://www.aclweb.org/anthology/W19-3200 | |
PWC | https://paperswithcode.com/paper/proceedings-of-the-fourth-social-media-mining |
Repo | |
Framework | |