October 16, 2019

3226 words 16 mins read

Paper Group ANR 1131

Unsupervised brain lesion segmentation from MRI using a convolutional autoencoder

Title Unsupervised brain lesion segmentation from MRI using a convolutional autoencoder
Authors Hans E. Atlason, Askell Love, Sigurdur Sigurdsson, Vilmundur Gudnason, Lotta M. Ellingsen
Abstract Lesions that appear hyperintense in both Fluid Attenuated Inversion Recovery (FLAIR) and T2-weighted magnetic resonance images (MRIs) of the human brain are common in the brains of the elderly population and may be caused by ischemia or demyelination. Lesions are biomarkers for various neurodegenerative diseases, making accurate quantification of them important for both disease diagnosis and tracking of disease progression. Automatic lesion detection using supervised learning requires manually annotated images, which can often be impractical to acquire. Unsupervised lesion detection, on the other hand, does not require any manual delineation; however, these methods can be challenging to construct due to the variability in lesion load, placement of lesions, and voxel intensities. Here we present a novel approach to address this problem using a convolutional autoencoder, which learns to segment brain lesions as well as the white matter, gray matter, and cerebrospinal fluid by reconstructing FLAIR images as conical combinations of softmax layer outputs generated from the corresponding T1, T2, and FLAIR images. Some of the advantages of this model are that it accurately learns to segment lesions regardless of lesion load, and it can be used to quickly and robustly segment new images that were not in the training set. Comparisons with state-of-the-art segmentation methods evaluated on ground truth manual labels indicate that the proposed method works well for generating accurate lesion segmentations without the need for manual annotations.
Tasks Brain Lesion Segmentation From MRI, Lesion Segmentation
Published 2018-11-23
URL http://arxiv.org/abs/1811.09655v1
PDF http://arxiv.org/pdf/1811.09655v1.pdf
PWC https://paperswithcode.com/paper/unsupervised-brain-lesion-segmentation-from
Repo
Framework
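A minimal PyTorch sketch (not the authors' code) of the mechanism described in the abstract above: a small convolutional encoder produces softmax class maps (white matter, gray matter, CSF, and lesion are assumed here), and the FLAIR channel is reconstructed as a conical, i.e. non-negative weighted, combination of those maps. The network size and training step are illustrative placeholders.

```python
import torch
import torch.nn as nn

class ConicalSegAE(nn.Module):
    def __init__(self, n_classes=4):  # assumed classes: WM, GM, CSF, lesion
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, n_classes, 1),
        )
        # One non-negative mixing weight per class (conical combination).
        self.raw_weights = nn.Parameter(torch.zeros(n_classes))

    def forward(self, x):                      # x: (B, 3, H, W) = T1, T2, FLAIR
        seg = torch.softmax(self.encoder(x), dim=1)          # class probability maps
        w = nn.functional.softplus(self.raw_weights)          # enforce w >= 0
        recon = (seg * w.view(1, -1, 1, 1)).sum(dim=1, keepdim=True)
        return seg, recon

model = ConicalSegAE()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.rand(2, 3, 64, 64)                   # toy multi-contrast patches
seg, recon = model(x)
loss = nn.functional.mse_loss(recon, x[:, 2:3])  # reconstruct the FLAIR channel
loss.backward(); opt.step()
```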

MIDI-VAE: Modeling Dynamics and Instrumentation of Music with Applications to Style Transfer

Title MIDI-VAE: Modeling Dynamics and Instrumentation of Music with Applications to Style Transfer
Authors Gino Brunner, Andres Konrad, Yuyi Wang, Roger Wattenhofer
Abstract We introduce MIDI-VAE, a neural network model based on Variational Autoencoders that is capable of handling polyphonic music with multiple instrument tracks, as well as modeling the dynamics of music by incorporating note durations and velocities. We show that MIDI-VAE can perform style transfer on symbolic music by automatically changing pitches, dynamics and instruments of a music piece from, e.g., a Classical to a Jazz style. We evaluate the efficacy of the style transfer by training separate style validation classifiers. Our model can also interpolate between short pieces of music, produce medleys and create mixtures of entire songs. The interpolations smoothly change pitches, dynamics and instrumentation to create a harmonic bridge between two music pieces. To the best of our knowledge, this work represents the first successful attempt at applying neural style transfer to complete musical compositions.
Tasks Style Transfer
Published 2018-09-20
URL http://arxiv.org/abs/1809.07600v1
PDF http://arxiv.org/pdf/1809.07600v1.pdf
PWC https://paperswithcode.com/paper/midi-vae-modeling-dynamics-and
Repo
Framework
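The exact MIDI-VAE architecture is not reproduced here; the hedged PyTorch sketch below only illustrates the core idea of a sequence VAE over piano-roll frames whose latent code also feeds a style head, so that styles can later be swapped at decode time. All layer sizes and the two-style setup are assumptions.

```python
import torch
import torch.nn as nn

class SeqVAE(nn.Module):
    def __init__(self, pitches=128, hidden=64, z_dim=16, n_styles=2):
        super().__init__()
        self.enc = nn.GRU(pitches, hidden, batch_first=True)
        self.mu = nn.Linear(hidden, z_dim)
        self.logvar = nn.Linear(hidden, z_dim)
        self.dec = nn.GRU(z_dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, pitches)
        self.style = nn.Linear(z_dim, n_styles)          # style prediction head

    def forward(self, x):                                 # x: (B, T, 128) piano roll
        _, h = self.enc(x)
        mu, logvar = self.mu(h[-1]), self.logvar(h[-1])
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        dec_in = z.unsqueeze(1).expand(-1, x.size(1), -1)
        y, _ = self.dec(dec_in)
        return torch.sigmoid(self.out(y)), self.style(z), mu, logvar

model = SeqVAE()
x = (torch.rand(4, 32, 128) > 0.95).float()               # toy polyphonic roll
style = torch.randint(0, 2, (4,))                          # toy style labels
recon, style_logits, mu, logvar = model(x)
kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
loss = nn.functional.binary_cross_entropy(recon, x) \
     + nn.functional.cross_entropy(style_logits, style) + 1e-2 * kl
loss.backward()
```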

GD-GAN: Generative Adversarial Networks for Trajectory Prediction and Group Detection in Crowds

Title GD-GAN: Generative Adversarial Networks for Trajectory Prediction and Group Detection in Crowds
Authors Tharindu Fernando, Simon Denman, Sridha Sridharan, Clinton Fookes
Abstract This paper presents a novel deep learning framework for human trajectory prediction and detecting social group membership in crowds. We introduce a generative adversarial pipeline which preserves the spatio-temporal structure of the pedestrian’s neighbourhood, enabling us to extract relevant attributes describing their social identity. We formulate the group detection task as an unsupervised learning problem, obviating the need for supervised learning of group memberships via hand labeled databases, allowing us to directly employ the proposed framework in different surveillance settings. We evaluate the proposed trajectory prediction and group detection frameworks on multiple public benchmarks, and for both tasks the proposed method demonstrates its capability to better anticipate human sociological behaviour compared to the existing state-of-the-art methods.
Tasks Group Detection In Crowds, Trajectory Prediction
Published 2018-12-18
URL http://arxiv.org/abs/1812.07667v1
PDF http://arxiv.org/pdf/1812.07667v1.pdf
PWC https://paperswithcode.com/paper/gd-gan-generative-adversarial-networks-for
Repo
Framework
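A simplified sketch of the pipeline described above, not the paper's exact model: an LSTM generator predicts future positions from observed tracks, a small discriminator provides the adversarial signal, and social groups are recovered without labels by clustering the generator's per-pedestrian features (DBSCAN is an illustrative choice here; optimizer steps are omitted).

```python
import torch
import torch.nn as nn
from sklearn.cluster import DBSCAN

class TrajGenerator(nn.Module):
    def __init__(self, hidden=32, pred_len=8):
        super().__init__()
        self.rnn = nn.LSTM(2, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 2 * pred_len)
        self.pred_len = pred_len

    def forward(self, obs):                      # obs: (N, T_obs, 2) positions
        _, (h, _) = self.rnn(obs)
        feat = h[-1]                             # per-pedestrian social feature
        future = self.head(feat).view(-1, self.pred_len, 2)
        return future, feat

gen = TrajGenerator()
disc = nn.Sequential(nn.Flatten(), nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 1))
obs = torch.rand(6, 8, 2)                        # 6 pedestrians, 8 observed steps
real_future = torch.rand(6, 8, 2)
future, feat = gen(obs)

bce = nn.functional.binary_cross_entropy_with_logits
# Standard adversarial losses (sketch only).
d_loss = bce(disc(real_future), torch.ones(6, 1)) + \
         bce(disc(future.detach()), torch.zeros(6, 1))
g_loss = bce(disc(future), torch.ones(6, 1))

# Unsupervised group detection: cluster the learned per-pedestrian features.
groups = DBSCAN(eps=0.5, min_samples=2).fit_predict(feat.detach().numpy())
```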

Optimizing Channel Selection for Seizure Detection

Title Optimizing Channel Selection for Seizure Detection
Authors Vinit Shah, Meysam Golmohammadi, Saeedeh Ziyabari, Eva Von Weltin, Iyad Obeid, Joseph Picone
Abstract Interpretation of electroencephalogram (EEG) signals can be complicated by obfuscating artifacts. Artifact detection plays an important role in the observation and analysis of EEG signals. Spatial information contained in the placement of the electrodes can be exploited to accurately detect artifacts. However, when fewer electrodes are used, less spatial information is available, making it harder to detect artifacts. In this study, we investigate the performance of a deep learning algorithm, CNN-LSTM, on several channel configurations. Each configuration was designed to minimize the amount of spatial information lost compared to a standard 22-channel EEG. Systems using a reduced number of channels ranging from 8 to 20 achieved sensitivities between 33% and 37% with false alarms in the range of [38, 50] per 24 hours. False alarms increased dramatically (e.g., over 300 per 24 hours) when the number of channels was further reduced. Baseline performance of a system that used all 22 channels was 39% sensitivity with 23 false alarms. Since the 22-channel system was the only system that included referential channels, the rapid increase in the false alarm rate as the number of channels was reduced underscores the importance of retaining referential channels for artifact reduction. This cautionary result is important because one of the biggest differences between various types of EEGs administered is the type of referential channel used.
Tasks EEG, Seizure Detection
Published 2018-01-03
URL http://arxiv.org/abs/1801.02472v1
PDF http://arxiv.org/pdf/1801.02472v1.pdf
PWC https://paperswithcode.com/paper/optimizing-channel-selection-for-seizure
Repo
Framework
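A hedged sketch of a CNN-LSTM for multichannel EEG with the channel count exposed as a parameter, so that reduced montages (e.g. 8, 16 or 20 channels versus the full 22) can be compared as in the study; layer sizes, window lengths and the sampling rate are illustrative assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

class CnnLstm(nn.Module):
    def __init__(self, n_channels=22, hidden=64, n_classes=2):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv1d(n_channels, 32, kernel_size=7, stride=2), nn.ReLU(),
            nn.Conv1d(32, 32, kernel_size=7, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        self.lstm = nn.LSTM(32, hidden, batch_first=True)
        self.fc = nn.Linear(hidden, n_classes)

    def forward(self, x):              # x: (B, n_windows, n_channels, samples)
        b, w, c, s = x.shape
        feats = self.cnn(x.view(b * w, c, s)).squeeze(-1).view(b, w, -1)
        out, _ = self.lstm(feats)
        return self.fc(out[:, -1])     # seizure vs. background logits

for n_ch in (22, 20, 16, 8):           # compare channel configurations
    model = CnnLstm(n_channels=n_ch)
    logits = model(torch.randn(2, 10, n_ch, 250))  # 10 one-second windows @ 250 Hz
```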

Understanding and correcting pathologies in the training of learned optimizers

Title Understanding and correcting pathologies in the training of learned optimizers
Authors Luke Metz, Niru Maheswaranathan, Jeremy Nixon, C. Daniel Freeman, Jascha Sohl-Dickstein
Abstract Deep learning has shown that learned functions can dramatically outperform hand-designed functions on perceptual tasks. Analogously, this suggests that learned optimizers may similarly outperform current hand-designed optimizers, especially for specific problems. However, learned optimizers are notoriously difficult to train and have yet to demonstrate wall-clock speedups over hand-designed optimizers, and thus are rarely used in practice. Typically, learned optimizers are trained by truncated backpropagation through an unrolled optimization process resulting in gradients that are either strongly biased (for short truncations) or have exploding norm (for long truncations). In this work we propose a training scheme which overcomes both of these difficulties, by dynamically weighting two unbiased gradient estimators for a variational loss on optimizer performance, allowing us to train neural networks to perform optimization of a specific task faster than tuned first-order methods. We demonstrate these results on problems where our learned optimizer trains convolutional networks faster in wall-clock time compared to tuned first-order methods and with an improvement in test loss.
Tasks
Published 2018-10-24
URL https://arxiv.org/abs/1810.10180v5
PDF https://arxiv.org/pdf/1810.10180v5.pdf
PWC https://paperswithcode.com/paper/understanding-and-correcting-pathologies-in
Repo
Framework
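The abstract's key idea, dynamically weighting two unbiased gradient estimators of a smoothed (variational) loss, can be illustrated on a toy one-dimensional problem: a reparameterization estimator and an evolution-strategies estimator are combined with inverse-variance weights. The quadratic loss below is a stand-in, not an unrolled optimizer.

```python
import numpy as np

rng = np.random.default_rng(0)
f = lambda t: (t - 3.0) ** 2             # stand-in for the meta-loss of one unroll
df = lambda t: 2.0 * (t - 3.0)
theta, sigma, n = 0.0, 0.1, 256

eps = rng.standard_normal(n)
pert = theta + sigma * eps
g_rp = df(pert)                           # reparameterization gradient samples
g_es = f(pert) * eps / sigma              # evolution-strategies gradient samples

# Inverse-variance weighting of the two estimator means.
m_rp, v_rp = g_rp.mean(), g_rp.var() / n
m_es, v_es = g_es.mean(), g_es.var() / n
w = (1 / v_rp) / (1 / v_rp + 1 / v_es)
g_combined = w * m_rp + (1 - w) * m_es
print(f"rp={m_rp:.3f}  es={m_es:.3f}  combined={g_combined:.3f}  true={df(theta):.3f}")
```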

Adversarial Feature-Mapping for Speech Enhancement

Title Adversarial Feature-Mapping for Speech Enhancement
Authors Zhong Meng, Jinyu Li, Yifan Gong, Biing-Hwang Juang
Abstract Feature-mapping with deep neural networks is commonly used for single-channel speech enhancement, in which a feature-mapping network directly transforms the noisy features to the corresponding enhanced ones and is trained to minimize the mean square errors between the enhanced and clean features. In this paper, we propose an adversarial feature-mapping (AFM) method for speech enhancement which advances the feature-mapping approach with adversarial learning. An additional discriminator network is introduced to distinguish the enhanced features from the real clean ones. The two networks are jointly optimized to minimize the feature-mapping loss and simultaneously mini-maximize the discrimination loss. The distribution of the enhanced features is further pushed towards that of the clean features through this adversarial multi-task training. To achieve better performance on the ASR task, senone-aware (SA) AFM is further proposed in which an acoustic model network is jointly trained with the feature-mapping and discriminator networks to optimize the senone classification loss in addition to the AFM losses. Evaluated on the CHiME-3 dataset, the proposed AFM achieves 16.95% and 5.27% relative word error rate (WER) improvements over the real noisy data and the feature-mapping baseline, respectively, and the SA-AFM achieves a 9.85% relative WER improvement over the multi-conditional acoustic model.
Tasks Speech Enhancement
Published 2018-09-06
URL http://arxiv.org/abs/1809.02251v2
PDF http://arxiv.org/pdf/1809.02251v2.pdf
PWC https://paperswithcode.com/paper/adversarial-feature-mapping-for-speech
Repo
Framework
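A hedged PyTorch sketch of adversarial feature-mapping (AFM): a mapping network minimizes the MSE to clean features while trying to fool a discriminator that separates enhanced from real clean features. Feature dimensions, network sizes and the adversarial weight are illustrative, and the senone-aware variant is omitted.

```python
import torch
import torch.nn as nn

feat_dim = 40                                       # e.g., log-mel features (assumed)
mapper = nn.Sequential(nn.Linear(feat_dim, 256), nn.ReLU(), nn.Linear(256, feat_dim))
disc = nn.Sequential(nn.Linear(feat_dim, 128), nn.ReLU(), nn.Linear(128, 1))
opt_m = torch.optim.Adam(mapper.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(disc.parameters(), lr=1e-4)
bce = nn.functional.binary_cross_entropy_with_logits

noisy, clean = torch.randn(32, feat_dim), torch.randn(32, feat_dim)  # toy batch
enhanced = mapper(noisy)

# Discriminator step: tell real clean features apart from enhanced ones.
d_loss = bce(disc(clean), torch.ones(32, 1)) + \
         bce(disc(enhanced.detach()), torch.zeros(32, 1))
opt_d.zero_grad(); d_loss.backward(); opt_d.step()

# Mapper step: feature-mapping loss plus the adversarial term.
fm_loss = nn.functional.mse_loss(enhanced, clean)
adv_loss = bce(disc(enhanced), torch.ones(32, 1))
(fm_loss + 0.1 * adv_loss).backward()
opt_m.step()
```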

ABox Abduction via Forgetting in ALC (Long Version)

Title ABox Abduction via Forgetting in ALC (Long Version)
Authors Warren Del-Pinto, Renate A. Schmidt
Abstract Abductive reasoning generates explanatory hypotheses for new observations using prior knowledge. This paper investigates the use of forgetting, also known as uniform interpolation, to perform ABox abduction in description logic (ALC) ontologies. Non-abducibles are specified by a forgetting signature which can contain concept, but not role, symbols. The resulting hypotheses are semantically minimal and each consist of a set of disjuncts. These disjuncts are each independent explanations, and are not redundant with respect to the background ontology or the other disjuncts, representing a form of hypothesis space. The observations and hypotheses handled by the method can contain both atomic and complex ALC concepts, excluding role assertions, and are not restricted to Horn clauses. Two approaches to redundancy elimination are explored for practical use: full and approximate. Using a prototype implementation, experiments were performed over a corpus of real-world ontologies to investigate the practicality of both approaches across several settings.
Tasks
Published 2018-11-13
URL http://arxiv.org/abs/1811.05420v1
PDF http://arxiv.org/pdf/1811.05420v1.pdf
PWC https://paperswithcode.com/paper/abox-abduction-via-forgetting-in-alc-long
Repo
Framework

3D Scene Parsing via Class-Wise Adaptation

Title 3D Scene Parsing via Class-Wise Adaptation
Authors Daichi Ono, Hiroyuki Yabe, Tsutomu Horikawa
Abstract We propose a method that uses only computer graphics datasets to parse real-world 3D scenes. 3D scene parsing based on semantic segmentation is required to implement categorical interaction in the virtual world. Convolutional Neural Networks (CNNs) have recently shown state-of-the-art performance on computer vision tasks including semantic segmentation. However, collecting and annotating a huge amount of data is needed to train CNNs. Especially in the case of semantic segmentation, annotating pixel by pixel takes a significant amount of time and is prone to mistakes. In contrast, computer graphics can generate a lot of accurately annotated data and easily scale up by changing camera positions, textures and lights. Despite these advantages, models trained on computer graphics datasets cannot perform well on real data, a problem known as domain shift. To address this issue, we first show that a depth modality and synthetic noise are effective in reducing the domain shift. Then, we develop class-wise adaptation, which obtains domain-invariant features in CNNs. To reduce the domain shift further, we create computer graphics rooms with many props and provide photo-realistic rendered images. We also demonstrate an application that combines semantic segmentation with Simultaneous Localization and Mapping (SLAM). Our application performs accurate 3D scene parsing in real time in an actual room.
Tasks Scene Parsing, Semantic Segmentation, Simultaneous Localization and Mapping
Published 2018-12-10
URL http://arxiv.org/abs/1812.03622v2
PDF http://arxiv.org/pdf/1812.03622v2.pdf
PWC https://paperswithcode.com/paper/3d-scene-parsing-via-class-wise-adaptation
Repo
Framework

Automatic Seismic Salt Interpretation with Deep Convolutional Neural Networks

Title Automatic Seismic Salt Interpretation with Deep Convolutional Neural Networks
Authors Yu Zeng, Kebei Jiang, Jie Chen
Abstract One of the most crucial tasks in seismic reflection imaging is to identify salt bodies with high precision. Traditionally, this is accomplished by visually picking the salt/sediment boundaries, which requires a great amount of manual work and may introduce systematic bias. With recent progress in deep learning algorithms and growing computational power, a great deal of effort has been made to replace human effort with machine power in salt body interpretation. Currently, Convolutional Neural Networks (CNNs) are revolutionizing the computer vision field and have become a hot topic in image analysis. In this paper, the benefits of CNN-based classification are demonstrated by using a state-of-the-art network structure, U-Net, along with the residual learning framework ResNet, to delineate salt bodies with high precision. Network adjustments, including the Exponential Linear Unit (ELU) activation function, the Lovász-Softmax loss function, and stratified K-fold cross-validation, have been deployed to further improve the prediction accuracy. Preliminary results using SEG Advanced Modeling (SEAM) data show good agreement between the predicted salt body and the manually interpreted salt body, especially in areas with weak reflections. This indicates the great potential of applying CNNs to salt-related interpretation.
Tasks
Published 2018-11-24
URL http://arxiv.org/abs/1812.01101v1
PDF http://arxiv.org/pdf/1812.01101v1.pdf
PWC https://paperswithcode.com/paper/automatic-seismic-salt-interpretation-with
Repo
Framework
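The abstract mentions stratified K-fold cross-validation; the sketch below shows one plausible way to stratify seismic patches by salt coverage so that each fold sees a similar mix of salt-rich and salt-poor images. The data is synthetic, and the U-Net/ResNet model itself (with ELU and Lovász-Softmax) is omitted.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

rng = np.random.default_rng(0)
masks = rng.random((100, 101, 101)) < rng.random((100, 1, 1))   # toy salt masks
coverage = masks.mean(axis=(1, 2))                               # fraction of salt pixels
# Bin images by coverage quartile so folds are balanced in salt content.
bins = np.digitize(coverage, np.quantile(coverage, [0.25, 0.5, 0.75]))

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for fold, (train_idx, val_idx) in enumerate(skf.split(masks, bins)):
    print(f"fold {fold}: train={len(train_idx)}  val={len(val_idx)}  "
          f"mean val coverage={coverage[val_idx].mean():.3f}")
```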

The Deep Kernelized Autoencoder

Title The Deep Kernelized Autoencoder
Authors Michael Kampffmeyer, Sigurd Løkse, Filippo M. Bianchi, Robert Jenssen, Lorenzo Livi
Abstract Autoencoders learn data representations (codes) in such a way that the input is reproduced at the output of the network. However, it is not always clear what kind of properties of the input data need to be captured by the codes. Kernel machines have experienced great success by operating via inner-products in a theoretically well-defined reproducing kernel Hilbert space, hence capturing topological properties of input data. In this paper, we enhance the autoencoder’s ability to learn effective data representations by aligning inner products between codes with respect to a kernel matrix. By doing so, the proposed kernelized autoencoder allows learning similarity-preserving embeddings of input data, where the notion of similarity is explicitly controlled by the user and encoded in a positive semi-definite kernel matrix. Experiments are performed for evaluating both reconstruction and kernel alignment performance in classification tasks and visualization of high-dimensional data. Additionally, we show that our method is capable of emulating kernel principal component analysis on a denoising task, obtaining competitive results at a much lower computational cost.
Tasks Denoising
Published 2018-07-19
URL http://arxiv.org/abs/1807.07868v2
PDF http://arxiv.org/pdf/1807.07868v2.pdf
PWC https://paperswithcode.com/paper/the-deep-kernelized-autoencoder
Repo
Framework
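A minimal sketch of the kernel-alignment idea under assumed details: an autoencoder is trained with a reconstruction loss plus a term that aligns the inner products of its codes with a user-specified positive semi-definite kernel matrix computed on the batch (an RBF kernel is used here purely as an example).

```python
import torch
import torch.nn as nn

enc = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 8))
dec = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 20))
opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-3)

x = torch.randn(32, 20)                                   # toy batch
# Prescribed kernel on the batch (RBF here; any PSD kernel encodes the user's
# notion of similarity).
d2 = torch.cdist(x, x).pow(2)
K = torch.exp(-d2 / (2 * d2.median()))

codes = enc(x)
recon_loss = nn.functional.mse_loss(dec(codes), x)
C = codes @ codes.t()                                     # code inner-product matrix
# Normalized alignment between the code inner products and the kernel matrix.
align = (C * K).sum() / (C.norm() * K.norm())
loss = recon_loss - 0.1 * align                           # maximize alignment
opt.zero_grad(); loss.backward(); opt.step()
```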

Intertemporal Connections Between Query Suggestions and Search Engine Results for Politics Related Queries

Title Intertemporal Connections Between Query Suggestions and Search Engine Results for Politics Related Queries
Authors Malte Bonart, Philipp Schaer
Abstract This short paper deals with the combination and comparison of two data sources: Search engine results and query suggestions for 16 terms related to political candidates and parties. The data was collected before the federal election in Germany in September 2017 for a period of two months. The rank biased overlap (RBO) statistic is used to measure the similarity of the top-weighted rankings. For each search term and for both the search results and query auto-completions we study the stability of the rankings over time.
Tasks
Published 2018-12-20
URL http://arxiv.org/abs/1812.08585v2
PDF http://arxiv.org/pdf/1812.08585v2.pdf
PWC https://paperswithcode.com/paper/intertemporal-connections-between-query
Repo
Framework
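Rank-biased overlap (RBO) is the similarity statistic used above; the function below is a simplified, truncated variant that shows how the top-weighted agreement between two rankings is accumulated. The paper presumably uses the full extrapolated RBO estimator of Webber et al. (2010), and the example lists are invented.

```python
def rbo_truncated(s, t, p=0.9):
    """Truncated RBO over the common prefix depth of two ranked lists."""
    depth = min(len(s), len(t))
    score, seen_s, seen_t = 0.0, set(), set()
    for d in range(1, depth + 1):
        seen_s.add(s[d - 1])
        seen_t.add(t[d - 1])
        overlap = len(seen_s & seen_t) / d      # agreement at depth d
        score += (p ** (d - 1)) * overlap       # geometric top-weighting
    return (1 - p) * score

# Hypothetical rankings for one query on two different days.
results_day1 = ["spd", "cdu", "afd", "fdp", "gruene"]
results_day2 = ["cdu", "spd", "fdp", "afd", "linke"]
print(round(rbo_truncated(results_day1, results_day2), 3))
```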

Volatility in the Issue Attention Economy

Title Volatility in the Issue Attention Economy
Authors Chico Q. Camargo, Scott A. Hale, Peter John, Helen Z. Margetts
Abstract Recent election surprises and regime changes have left the impression that politics has become more fast-moving and unstable. While modern politics does seem more volatile, there is little systematic evidence to support this claim. This paper seeks to address this gap in knowledge by reporting data over the last seventy years using public opinion polls and traditional media data from the UK and Germany. These countries are good cases to study because both have experienced considerable changes in electoral behaviour and have seen new political parties emerge during the time period studied. We measure volatility in public opinion and in media coverage using approaches from information theory, tracking the change in word-use patterns across over 700,000 articles. Our preliminary analysis suggests an increase in the number of opinion issues over time and a decline in the predictability of the media series from the 1970s onwards.
Tasks
Published 2018-08-27
URL http://arxiv.org/abs/1808.09037v1
PDF http://arxiv.org/pdf/1808.09037v1.pdf
PWC https://paperswithcode.com/paper/volatility-in-the-issue-attention-economy
Repo
Framework
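The abstract does not specify the estimator, so the sketch below shows one common information-theoretic way to quantify volatility in word-use patterns: the Jensen-Shannon divergence between word-frequency distributions of consecutive time windows. The toy windows and vocabulary are invented.

```python
import numpy as np
from collections import Counter

def js_divergence(p, q):
    """Jensen-Shannon divergence (in bits) between two count vectors."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    p, q = p / p.sum(), q / q.sum()
    m = 0.5 * (p + q)
    def kl(a, b):
        mask = a > 0
        return np.sum(a[mask] * np.log2(a[mask] / b[mask]))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

window_1 = "economy tax economy election coalition".split()
window_2 = "migration border election coalition migration".split()
vocab = sorted(set(window_1) | set(window_2))
c1, c2 = Counter(window_1), Counter(window_2)
p = [c1[w] for w in vocab]
q = [c2[w] for w in vocab]
print(f"volatility between windows: {js_divergence(p, q):.3f} bits")
```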

Model Adaptation with Synthetic and Real Data for Semantic Dense Foggy Scene Understanding

Title Model Adaptation with Synthetic and Real Data for Semantic Dense Foggy Scene Understanding
Authors Christos Sakaridis, Dengxin Dai, Simon Hecker, Luc Van Gool
Abstract This work addresses the problem of semantic scene understanding under dense fog. Although considerable progress has been made in semantic scene understanding, it is mainly related to clear-weather scenes. Extending recognition methods to adverse weather conditions such as fog is crucial for outdoor applications. In this paper, we propose a novel method, named Curriculum Model Adaptation (CMAda), which gradually adapts a semantic segmentation model from light synthetic fog to dense real fog in multiple steps, using both synthetic and real foggy data. In addition, we present three other main stand-alone contributions: 1) a novel method to add synthetic fog to real, clear-weather scenes using semantic input; 2) a new fog density estimator; 3) the Foggy Zurich dataset comprising 3,808 real foggy images, with pixel-level semantic annotations for 16 images with dense fog. Our experiments show that 1) our fog simulation slightly outperforms a state-of-the-art competing simulation with respect to the task of semantic foggy scene understanding (SFSU); 2) CMAda improves the performance of state-of-the-art models for SFSU significantly by leveraging unlabeled real foggy data. The datasets and code are publicly available.
Tasks Scene Understanding, Semantic Segmentation
Published 2018-08-03
URL http://arxiv.org/abs/1808.01265v1
PDF http://arxiv.org/pdf/1808.01265v1.pdf
PWC https://paperswithcode.com/paper/model-adaptation-with-synthetic-and-real-data
Repo
Framework
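The paper's fog simulation is semantics-aware and more involved; the sketch below only shows the standard atmospheric scattering model that such simulations typically build on, where a clear image J, depth d and attenuation coefficient beta give a foggy image I = J·t + A·(1−t) with transmission t = exp(−beta·d). The beta values and depths are illustrative.

```python
import numpy as np

def add_fog(clear_img, depth_m, beta=0.06, airlight=0.9):
    """clear_img: (H, W, 3) in [0, 1]; depth_m: (H, W) metric depth."""
    t = np.exp(-beta * depth_m)[..., None]        # transmission map
    return clear_img * t + airlight * (1.0 - t)   # attenuation + airlight

rng = np.random.default_rng(0)
clear = rng.random((4, 6, 3))                     # toy clear-weather image
depth = rng.uniform(5.0, 80.0, size=(4, 6))       # toy depth in metres
light_fog = add_fog(clear, depth, beta=0.02)      # curriculum: light fog first,
dense_fog = add_fog(clear, depth, beta=0.12)      # then denser fog
```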

Curse of Heterogeneity: Computational Barriers in Sparse Mixture Models and Phase Retrieval

Title Curse of Heterogeneity: Computational Barriers in Sparse Mixture Models and Phase Retrieval
Authors Jianqing Fan, Han Liu, Zhaoran Wang, Zhuoran Yang
Abstract We study the fundamental tradeoffs between statistical accuracy and computational tractability in the analysis of high dimensional heterogeneous data. As examples, we study the sparse Gaussian mixture model, mixtures of sparse linear regressions, and the sparse phase retrieval model. For these models, we exploit an oracle-based computational model to establish conjecture-free computationally feasible minimax lower bounds, which quantify the minimum signal strength required for the existence of any algorithm that is both computationally tractable and statistically accurate. Our analysis shows that there exist significant gaps between computationally feasible minimax risks and classical ones. These gaps quantify the statistical price we must pay to achieve computational tractability in the presence of data heterogeneity. Our results cover the problems of detection, estimation, support recovery, and clustering, and moreover, resolve several conjectures of Azizyan et al. (2013, 2015); Verzelen and Arias-Castro (2017); Cai et al. (2016). Interestingly, our results reveal a new but counter-intuitive phenomenon in heterogeneous data analysis: more data might lead to lower computational complexity.
Tasks
Published 2018-08-21
URL http://arxiv.org/abs/1808.06996v1
PDF http://arxiv.org/pdf/1808.06996v1.pdf
PWC https://paperswithcode.com/paper/curse-of-heterogeneity-computational-barriers
Repo
Framework

Premise selection with neural networks and distributed representation of features

Title Premise selection with neural networks and distributed representation of features
Authors Andrzej Stanisław Kucik, Konstantin Korovin
Abstract We present the problem of selecting relevant premises for a proof of a given statement. When stated as a binary classification task for pairs (conjecture, axiom), it can be efficiently solved using artificial neural networks. The key difference between our approach and previous ones is that we use only the functional signatures of premises. To further improve the performance of the model, we use a dimensionality reduction technique to replace long and sparse signature vectors with compact and dense embedded versions. These are obtained by first defining the concept of a context for each functor symbol, and then training a simple neural network to predict the distribution of other functor symbols in the context of this functor. After training the network, the output of its hidden layer is used to construct a lower-dimensional embedding of a functional signature (for each premise) with a distributed representation of features. This allows us to use 512-dimensional embeddings for conjecture-axiom pairs, containing enough information about the original statements to reach an accuracy of 76.45% on the premise selection task using only simple two-layer densely connected neural networks.
Tasks Dimensionality Reduction
Published 2018-07-26
URL http://arxiv.org/abs/1807.10268v1
PDF http://arxiv.org/pdf/1807.10268v1.pdf
PWC https://paperswithcode.com/paper/premise-selection-with-neural-networks-and
Repo
Framework
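A hedged sketch of the final classification stage described above: a simple two-layer densely connected network scores a 512-dimensional embedded (conjecture, axiom) pair as relevant or not. The functor-context embedding and dimensionality-reduction steps are omitted, and all sizes other than the 512-dimensional input are assumptions.

```python
import torch
import torch.nn as nn

classifier = nn.Sequential(
    nn.Linear(512, 256), nn.ReLU(),
    nn.Linear(256, 1),                     # logit: is the axiom a relevant premise?
)
opt = torch.optim.Adam(classifier.parameters(), lr=1e-3)

pair_embeddings = torch.randn(64, 512)     # stand-in for embedded conjecture-axiom pairs
labels = torch.randint(0, 2, (64, 1)).float()
logits = classifier(pair_embeddings)
loss = nn.functional.binary_cross_entropy_with_logits(logits, labels)
opt.zero_grad(); loss.backward(); opt.step()
```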