Paper Group ANR 1131
Unsupervised brain lesion segmentation from MRI using a convolutional autoencoder. MIDI-VAE: Modeling Dynamics and Instrumentation of Music with Applications to Style Transfer. GD-GAN: Generative Adversarial Networks for Trajectory Prediction and Group Detection in Crowds. Optimizing Channel Selection for Seizure Detection. Understanding and correcting pathologies in the training of learned optimizers. Adversarial Feature-Mapping for Speech Enhancement. ABox Abduction via Forgetting in ALC (Long Version). 3D Scene Parsing via Class-Wise Adaptation. Automatic Seismic Salt Interpretation with Deep Convolutional Neural Networks. The Deep Kernelized Autoencoder. Intertemporal Connections Between Query Suggestions and Search Engine Results for Politics Related Queries. Volatility in the Issue Attention Economy. Model Adaptation with Synthetic and Real Data for Semantic Dense Foggy Scene Understanding. Curse of Heterogeneity: Computational Barriers in Sparse Mixture Models and Phase Retrieval. Premise selection with neural networks and distributed representation of features.
Unsupervised brain lesion segmentation from MRI using a convolutional autoencoder
Title | Unsupervised brain lesion segmentation from MRI using a convolutional autoencoder |
Authors | Hans E. Atlason, Askell Love, Sigurdur Sigurdsson, Vilmundur Gudnason, Lotta M. Ellingsen |
Abstract | Lesions that appear hyperintense in both Fluid Attenuated Inversion Recovery (FLAIR) and T2-weighted magnetic resonance images (MRIs) of the human brain are common in the brains of the elderly population and may be caused by ischemia or demyelination. Lesions are biomarkers for various neurodegenerative diseases, making accurate quantification of them important for both disease diagnosis and progression. Automatic lesion detection using supervised learning requires manually annotated images, which can often be impractical to acquire. Unsupervised lesion detection, on the other hand, does not require any manual delineation; however, these methods can be challenging to construct due to the variability in lesion load, placement of lesions, and voxel intensities. Here we present a novel approach to address this problem using a convolutional autoencoder, which learns to segment brain lesions as well as the white matter, gray matter, and cerebrospinal fluid by reconstructing FLAIR images as conical combinations of softmax layer outputs generated from the corresponding T1, T2, and FLAIR images. Some of the advantages of this model are that it accurately learns to segment lesions regardless of lesion load, and it can be used to quickly and robustly segment new images that were not in the training set. Comparisons with state-of-the-art segmentation methods evaluated on ground truth manual labels indicate that the proposed method works well for generating accurate lesion segmentations without the need for manual annotations. |
Tasks | Brain Lesion Segmentation From MRI, Lesion Segmentation |
Published | 2018-11-23 |
URL | http://arxiv.org/abs/1811.09655v1 |
http://arxiv.org/pdf/1811.09655v1.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-brain-lesion-segmentation-from |
Repo | |
Framework | |
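A minimal sketch (not the authors' released code) of the core reconstruction idea from the abstract: the network emits softmax tissue maps (white matter, gray matter, CSF, lesion), and the FLAIR image is reconstructed as a conical combination, i.e. a non-negative weighted sum, of those maps. Layer sizes, patch shapes, and the softplus parameterization of the weights are assumptions for illustration.

```python
# Hypothetical sketch: reconstruct FLAIR as a conical (non-negative weighted)
# combination of softmax segmentation maps. Shapes and sizes are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConicalSegAE(nn.Module):
    def __init__(self, in_channels=3, n_classes=4):  # T1, T2, FLAIR -> WM, GM, CSF, lesion
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(in_channels, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, n_classes, 3, padding=1),
        )
        # Softplus keeps the mixing coefficients non-negative (conical combination).
        self.raw_weights = nn.Parameter(torch.zeros(n_classes))

    def forward(self, x):
        seg = F.softmax(self.encoder(x), dim=1)          # (B, K, H, W) tissue maps
        w = F.softplus(self.raw_weights)                 # K non-negative coefficients
        recon = (seg * w.view(1, -1, 1, 1)).sum(dim=1)   # reconstructed FLAIR (B, H, W)
        return seg, recon

model = ConicalSegAE()
x = torch.randn(2, 3, 64, 64)                            # stacked T1/T2/FLAIR patches
seg, recon = model(x)
loss = F.mse_loss(recon, x[:, 2])                        # match the FLAIR channel
```

Because reconstruction is the only training signal, no manual lesion labels are needed, which is the unsupervised property the abstract emphasizes.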
MIDI-VAE: Modeling Dynamics and Instrumentation of Music with Applications to Style Transfer
Title | MIDI-VAE: Modeling Dynamics and Instrumentation of Music with Applications to Style Transfer |
Authors | Gino Brunner, Andres Konrad, Yuyi Wang, Roger Wattenhofer |
Abstract | We introduce MIDI-VAE, a neural network model based on Variational Autoencoders that is capable of handling polyphonic music with multiple instrument tracks, as well as modeling the dynamics of music by incorporating note durations and velocities. We show that MIDI-VAE can perform style transfer on symbolic music by automatically changing pitches, dynamics and instruments of a music piece from, e.g., a Classical to a Jazz style. We evaluate the efficacy of the style transfer by training separate style validation classifiers. Our model can also interpolate between short pieces of music, produce medleys and create mixtures of entire songs. The interpolations smoothly change pitches, dynamics and instrumentation to create a harmonic bridge between two music pieces. To the best of our knowledge, this work represents the first successful attempt at applying neural style transfer to complete musical compositions. |
Tasks | Style Transfer |
Published | 2018-09-20 |
URL | http://arxiv.org/abs/1809.07600v1 |
http://arxiv.org/pdf/1809.07600v1.pdf | |
PWC | https://paperswithcode.com/paper/midi-vae-modeling-dynamics-and |
Repo | |
Framework | |
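A toy, heavily simplified skeleton in the spirit of the abstract: one shared latent code with separate reconstruction heads for pitch, velocity (dynamics), and instrument rolls. The real MIDI-VAE uses recurrent encoders/decoders and style labels; every size and layer here is an assumption.

```python
# Hypothetical mini-VAE with per-aspect decoder heads (pitch / velocity / instrument).
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyMusicVAE(nn.Module):
    def __init__(self, in_dim=3 * 128, z_dim=32):
        super().__init__()
        self.enc = nn.Linear(in_dim, 2 * z_dim)              # -> (mu, logvar)
        self.dec_pitch = nn.Linear(z_dim, 128)
        self.dec_vel = nn.Linear(z_dim, 128)
        self.dec_instr = nn.Linear(z_dim, 128)

    def forward(self, x):
        mu, logvar = self.enc(x).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterization trick
        return self.dec_pitch(z), self.dec_vel(z), self.dec_instr(z), mu, logvar

def vae_loss(x_pitch, x_vel, x_instr, out):
    p, v, i, mu, logvar = out
    rec = F.mse_loss(p, x_pitch) + F.mse_loss(v, x_vel) + F.mse_loss(i, x_instr)
    kld = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return rec + kld

pitch, vel, instr = torch.rand(4, 128), torch.rand(4, 128), torch.rand(4, 128)
loss = vae_loss(pitch, vel, instr, TinyMusicVAE()(torch.cat([pitch, vel, instr], dim=-1)))
```

Interpolating between the latent codes of two pieces is what yields the "harmonic bridge" effect the abstract describes.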
GD-GAN: Generative Adversarial Networks for Trajectory Prediction and Group Detection in Crowds
Title | GD-GAN: Generative Adversarial Networks for Trajectory Prediction and Group Detection in Crowds |
Authors | Tharindu Fernando, Simon Denman, Sridha Sridharan, Clinton Fookes |
Abstract | This paper presents a novel deep learning framework for human trajectory prediction and detecting social group membership in crowds. We introduce a generative adversarial pipeline which preserves the spatio-temporal structure of the pedestrian’s neighbourhood, enabling us to extract relevant attributes describing their social identity. We formulate the group detection task as an unsupervised learning problem, obviating the need for supervised learning of group memberships via hand labeled databases, allowing us to directly employ the proposed framework in different surveillance settings. We evaluate the proposed trajectory prediction and group detection frameworks on multiple public benchmarks, and for both tasks the proposed method demonstrates its capability to better anticipate human sociological behaviour compared to the existing state-of-the-art methods. |
Tasks | Group Detection In Crowds, Trajectory Prediction |
Published | 2018-12-18 |
URL | http://arxiv.org/abs/1812.07667v1 |
http://arxiv.org/pdf/1812.07667v1.pdf | |
PWC | https://paperswithcode.com/paper/gd-gan-generative-adversarial-networks-for |
Repo | |
Framework | |
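A hypothetical illustration of the unsupervised group-detection step only: cluster per-pedestrian trajectory features without any group labels. The paper derives such features from its trajectory-prediction GAN; here stand-in vectors and DBSCAN parameters are assumptions.

```python
# Cluster stand-in trajectory embeddings; pedestrians sharing a cluster id are
# predicted to form one social group. No supervised group labels are used.
import numpy as np
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(0)
features = np.vstack([
    rng.normal(loc=0.0, scale=0.1, size=(4, 16)),   # walking group A
    rng.normal(loc=1.0, scale=0.1, size=(5, 16)),   # walking group B
    rng.normal(loc=3.0, scale=0.1, size=(3, 16)),   # walking group C
])
labels = DBSCAN(eps=0.8, min_samples=2).fit_predict(features)
print(labels)  # e.g., [0 0 0 0 1 1 1 1 1 2 2 2]
```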
Optimizing Channel Selection for Seizure Detection
Title | Optimizing Channel Selection for Seizure Detection |
Authors | Vinit Shah, Meysam Golmohammadi, Saeedeh Ziyabari, Eva Von Weltin, Iyad Obeid, Joseph Picone |
Abstract | Interpretation of electroencephalogram (EEG) signals can be complicated by obfuscating artifacts. Artifact detection plays an important role in the observation and analysis of EEG signals. Spatial information contained in the placement of the electrodes can be exploited to accurately detect artifacts. However, when fewer electrodes are used, less spatial information is available, making it harder to detect artifacts. In this study, we investigate the performance of a deep learning algorithm, CNN-LSTM, on several channel configurations. Each configuration was designed to minimize the amount of spatial information lost compared to a standard 22-channel EEG. Systems using a reduced number of channels ranging from 8 to 20 achieved sensitivities between 33% and 37% with false alarms in the range of [38, 50] per 24 hours. False alarms increased dramatically (e.g., over 300 per 24 hours) when the number of channels was further reduced. Baseline performance of a system that used all 22 channels was 39% sensitivity with 23 false alarms. Since the 22-channel system was the only system that included referential channels, the rapid increase in the false alarm rate as the number of channels was reduced underscores the importance of retaining referential channels for artifact reduction. This cautionary result is important because one of the biggest differences between various types of EEGs administered is the type of referential channel used. |
Tasks | EEG, Seizure Detection |
Published | 2018-01-03 |
URL | http://arxiv.org/abs/1801.02472v1 |
http://arxiv.org/pdf/1801.02472v1.pdf | |
PWC | https://paperswithcode.com/paper/optimizing-channel-selection-for-seizure |
Repo | |
Framework | |
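An illustrative (not the authors') CNN-LSTM for multichannel EEG windows: a small convolutional front end followed by an LSTM over time, parameterized by channel count so the same code can be re-run with 8-, 16-, 20-, or 22-channel montages. Kernel sizes, hidden widths, and window lengths are assumptions.

```python
# Toy CNN-LSTM whose input channel count mirrors the EEG montage under test.
import torch
import torch.nn as nn

class CNNLSTM(nn.Module):
    def __init__(self, n_channels=22, hidden=64, n_classes=2):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(n_channels, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.MaxPool1d(2),
        )
        self.lstm = nn.LSTM(input_size=32, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x):                 # x: (batch, channels, samples)
        h = self.conv(x)                  # (batch, 32, samples / 2)
        h = h.transpose(1, 2)             # (batch, time, features) for the LSTM
        out, _ = self.lstm(h)
        return self.head(out[:, -1])      # classify from the last time step

for n_ch in (8, 16, 20, 22):              # compare reduced montages
    logits = CNNLSTM(n_channels=n_ch)(torch.randn(4, n_ch, 256))
```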
Understanding and correcting pathologies in the training of learned optimizers
Title | Understanding and correcting pathologies in the training of learned optimizers |
Authors | Luke Metz, Niru Maheswaranathan, Jeremy Nixon, C. Daniel Freeman, Jascha Sohl-Dickstein |
Abstract | Deep learning has shown that learned functions can dramatically outperform hand-designed functions on perceptual tasks. Analogously, this suggests that learned optimizers may similarly outperform current hand-designed optimizers, especially for specific problems. However, learned optimizers are notoriously difficult to train and have yet to demonstrate wall-clock speedups over hand-designed optimizers, and thus are rarely used in practice. Typically, learned optimizers are trained by truncated backpropagation through an unrolled optimization process resulting in gradients that are either strongly biased (for short truncations) or have exploding norm (for long truncations). In this work we propose a training scheme which overcomes both of these difficulties, by dynamically weighting two unbiased gradient estimators for a variational loss on optimizer performance, allowing us to train neural networks to perform optimization of a specific task faster than tuned first-order methods. We demonstrate these results on problems where our learned optimizer trains convolutional networks faster in wall-clock time compared to tuned first-order methods and with an improvement in test loss. |
Tasks | |
Published | 2018-10-24 |
URL | https://arxiv.org/abs/1810.10180v5 |
https://arxiv.org/pdf/1810.10180v5.pdf | |
PWC | https://paperswithcode.com/paper/understanding-and-correcting-pathologies-in |
Repo | |
Framework | |
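A sketch of the key variance-reduction idea: merge two unbiased gradient estimators (e.g., a reparameterization gradient and an evolution-strategies gradient) with inverse-variance weights, so the combination stays unbiased but has lower variance than either alone. The toy estimators below are stand-ins, not the paper's actual gradients.

```python
# Inverse-variance combination of two unbiased gradient estimators.
import numpy as np

def combine_unbiased(grads_a, grads_b, eps=1e-8):
    """grads_a, grads_b: (n_samples, dim) samples from two unbiased estimators."""
    var_a = grads_a.var(axis=0) + eps
    var_b = grads_b.var(axis=0) + eps
    w_a = (1.0 / var_a) / (1.0 / var_a + 1.0 / var_b)   # per-coordinate weights
    return w_a * grads_a.mean(axis=0) + (1.0 - w_a) * grads_b.mean(axis=0)

rng = np.random.default_rng(1)
true_grad = np.ones(4)
g_rp = true_grad + rng.normal(scale=0.1, size=(32, 4))   # low-variance estimator
g_es = true_grad + rng.normal(scale=1.0, size=(32, 4))   # high-variance estimator
print(combine_unbiased(g_rp, g_es))                      # close to [1, 1, 1, 1]
```

Because each estimator is unbiased, any convex combination of their means is unbiased too; weighting by inverse variance is what keeps the result well-behaved across both short and long unrolls.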
Adversarial Feature-Mapping for Speech Enhancement
Title | Adversarial Feature-Mapping for Speech Enhancement |
Authors | Zhong Meng, Jinyu Li, Yifan Gong, Biing-Hwang Juang |
Abstract | Feature-mapping with deep neural networks is commonly used for single-channel speech enhancement, in which a feature-mapping network directly transforms the noisy features to the corresponding enhanced ones and is trained to minimize the mean square errors between the enhanced and clean features. In this paper, we propose an adversarial feature-mapping (AFM) method for speech enhancement which advances the feature-mapping approach with adversarial learning. An additional discriminator network is introduced to distinguish the enhanced features from the real clean ones. The two networks are jointly optimized to minimize the feature-mapping loss and simultaneously mini-maximize the discrimination loss. The distribution of the enhanced features is further pushed towards that of the clean features through this adversarial multi-task training. To achieve better performance on the ASR task, senone-aware (SA) AFM is further proposed, in which an acoustic model network is jointly trained with the feature-mapping and discriminator networks to optimize the senone classification loss in addition to the AFM losses. Evaluated on the CHiME-3 dataset, the proposed AFM achieves 16.95% and 5.27% relative word error rate (WER) improvements over the real noisy data and the feature-mapping baseline, respectively, and the SA-AFM achieves a 9.85% relative WER improvement over the multi-conditional acoustic model. |
Tasks | Speech Enhancement |
Published | 2018-09-06 |
URL | http://arxiv.org/abs/1809.02251v2 |
http://arxiv.org/pdf/1809.02251v2.pdf | |
PWC | https://paperswithcode.com/paper/adversarial-feature-mapping-for-speech |
Repo | |
Framework | |
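A hedged sketch of the AFM objective as described in the abstract: a feature-mapping network minimizes an MSE loss toward clean features while also fooling a discriminator that separates enhanced from clean features. Network shapes, the feature dimension, and the trade-off weight are placeholders.

```python
# Construct the two AFM losses; in practice the mapping and discriminator
# steps are optimized alternately.
import torch
import torch.nn as nn
import torch.nn.functional as F

feat_dim = 40                                   # e.g., log-mel features (assumed)
mapper = nn.Sequential(nn.Linear(feat_dim, 128), nn.ReLU(), nn.Linear(128, feat_dim))
disc = nn.Sequential(nn.Linear(feat_dim, 64), nn.ReLU(), nn.Linear(64, 1))

noisy = torch.randn(8, feat_dim)
clean = torch.randn(8, feat_dim)
enhanced = mapper(noisy)

# Discriminator step: real = clean, fake = enhanced (detached).
d_loss = F.binary_cross_entropy_with_logits(disc(clean), torch.ones(8, 1)) + \
         F.binary_cross_entropy_with_logits(disc(enhanced.detach()), torch.zeros(8, 1))

# Mapping step: feature-mapping MSE plus an adversarial term that fools D.
lam = 0.1                                       # trade-off weight (assumed)
fm_loss = F.mse_loss(enhanced, clean) + \
          lam * F.binary_cross_entropy_with_logits(disc(enhanced), torch.ones(8, 1))
```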
ABox Abduction via Forgetting in ALC (Long Version)
Title | ABox Abduction via Forgetting in ALC (Long Version) |
Authors | Warren Del-Pinto, Renate A. Schmidt |
Abstract | Abductive reasoning generates explanatory hypotheses for new observations using prior knowledge. This paper investigates the use of forgetting, also known as uniform interpolation, to perform ABox abduction in description logic (ALC) ontologies. Non-abducibles are specified by a forgetting signature which can contain concept, but not role, symbols. The resulting hypotheses are semantically minimal and each consist of a set of disjuncts. These disjuncts are each independent explanations, and are not redundant with respect to the background ontology or the other disjuncts, representing a form of hypothesis space. The observations and hypotheses handled by the method can contain either atomic or complex ALC concepts, excluding role assertions, and are not restricted to Horn clauses. Two approaches to redundancy elimination are explored for practical use: full and approximate. Using a prototype implementation, experiments were performed over a corpus of real-world ontologies to investigate the practicality of both approaches across several settings. |
Tasks | |
Published | 2018-11-13 |
URL | http://arxiv.org/abs/1811.05420v1 |
http://arxiv.org/pdf/1811.05420v1.pdf | |
PWC | https://paperswithcode.com/paper/abox-abduction-via-forgetting-in-alc-long |
Repo | |
Framework | |
3D Scene Parsing via Class-Wise Adaptation
Title | 3D Scene Parsing via Class-Wise Adaptation |
Authors | Daichi Ono, Hiroyuki Yabe, Tsutomu Horikawa |
Abstract | We propose a method that uses only computer graphics datasets to parse real-world 3D scenes. 3D scene parsing based on semantic segmentation is required to implement categorical interaction in the virtual world. Convolutional Neural Networks (CNNs) have recently shown state-of-the-art performance on computer vision tasks, including semantic segmentation. However, training CNNs requires collecting and annotating a huge amount of data. Especially in the case of semantic segmentation, pixel-by-pixel annotation takes a significant amount of time and is error-prone. In contrast, computer graphics can generate large amounts of accurately annotated data and scale up easily by changing camera positions, textures, and lights. Despite these advantages, models trained on computer graphics datasets perform poorly on real data, a problem known as domain shift. To address this issue, we first show that a depth modality and synthetic noise are effective in reducing the domain shift. We then develop class-wise adaptation, which obtains domain-invariant CNN features. To further reduce the domain shift, we create computer graphics rooms with many props and provide photo-realistic rendered images. We also demonstrate an application that combines semantic segmentation with Simultaneous Localization and Mapping (SLAM) and performs accurate 3D scene parsing in real time in an actual room. |
Tasks | Scene Parsing, Semantic Segmentation, Simultaneous Localization and Mapping |
Published | 2018-12-10 |
URL | http://arxiv.org/abs/1812.03622v2 |
http://arxiv.org/pdf/1812.03622v2.pdf | |
PWC | https://paperswithcode.com/paper/3d-scene-parsing-via-class-wise-adaptation |
Repo | |
Framework | |
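One plausible, simplified form of class-wise alignment, not necessarily the authors' exact loss: match class-conditional feature statistics between the synthetic (CG) and real domains, using predicted class probabilities as soft masks so that no real-domain labels are required. Dimensions and the statistic (soft per-class means) are assumptions.

```python
# Soft class-conditional mean matching between synthetic and real features.
import torch

def classwise_alignment(feat_syn, prob_syn, feat_real, prob_real, eps=1e-6):
    """feat_*: (N, D) features; prob_*: (N, K) softmax class probabilities."""
    mean_syn = (prob_syn.T @ feat_syn) / (prob_syn.sum(0, keepdim=True).T + eps)
    mean_real = (prob_real.T @ feat_real) / (prob_real.sum(0, keepdim=True).T + eps)
    return ((mean_syn - mean_real) ** 2).mean()   # penalize per-class mismatch

loss = classwise_alignment(torch.randn(100, 64), torch.rand(100, 13).softmax(-1),
                           torch.randn(80, 64), torch.rand(80, 13).softmax(-1))
```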
Automatic Seismic Salt Interpretation with Deep Convolutional Neural Networks
Title | Automatic Seismic Salt Interpretation with Deep Convolutional Neural Networks |
Authors | Yu Zeng, Kebei Jiang, Jie Chen |
Abstract | One of the most crucial tasks in seismic reflection imaging is to identify salt bodies with high precision. Traditionally, this is accomplished by visually picking the salt/sediment boundaries, which requires a great amount of manual work and may introduce systematic bias. With recent progress in deep learning algorithms and growing computational power, a great deal of effort has been made to replace human effort with machine power in salt body interpretation. Currently, Convolutional Neural Networks (CNNs) are revolutionizing the computer vision field and have become a hot topic in image analysis. In this paper, the benefits of CNN-based classification are demonstrated by using a state-of-the-art network structure, U-Net, along with the residual learning framework ResNet, to delineate salt bodies with high precision. Network adjustments, including the Exponential Linear Unit (ELU) activation function, the Lovász-Softmax loss function, and stratified K-fold cross-validation, have been deployed to further improve prediction accuracy. Preliminary results using SEG Advanced Modeling (SEAM) data show good agreement between the predicted and manually interpreted salt bodies, especially in areas with weak reflections. This indicates the great potential of applying CNNs for salt-related interpretation. |
Tasks | |
Published | 2018-11-24 |
URL | http://arxiv.org/abs/1812.01101v1 |
http://arxiv.org/pdf/1812.01101v1.pdf | |
PWC | https://paperswithcode.com/paper/automatic-seismic-salt-interpretation-with |
Repo | |
Framework | |
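A sketch of one concrete detail from the abstract: stratified K-fold cross-validation, which keeps the salt/no-salt class balance similar across folds. The toy data, patch shape, and fold count are placeholders; the paper trains a U-Net/ResNet inside each fold.

```python
# Stratified 5-fold split over toy seismic patches labeled salt / no-salt.
import numpy as np
from sklearn.model_selection import StratifiedKFold

X = np.random.rand(200, 101 * 101)              # flattened patches (toy data)
y = (np.random.rand(200) > 0.5).astype(int)     # 1 = patch contains salt

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for fold, (train_idx, val_idx) in enumerate(skf.split(X, y)):
    # Train the segmentation model on X[train_idx]; validate on X[val_idx].
    print(f"fold {fold}: train={len(train_idx)} val={len(val_idx)} "
          f"salt ratio={y[val_idx].mean():.2f}")
```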
The Deep Kernelized Autoencoder
Title | The Deep Kernelized Autoencoder |
Authors | Michael Kampffmeyer, Sigurd Løkse, Filippo M. Bianchi, Robert Jenssen, Lorenzo Livi |
Abstract | Autoencoders learn data representations (codes) in such a way that the input is reproduced at the output of the network. However, it is not always clear what kind of properties of the input data need to be captured by the codes. Kernel machines have experienced great success by operating via inner-products in a theoretically well-defined reproducing kernel Hilbert space, hence capturing topological properties of input data. In this paper, we enhance the autoencoder’s ability to learn effective data representations by aligning inner products between codes with respect to a kernel matrix. By doing so, the proposed kernelized autoencoder allows learning similarity-preserving embeddings of input data, where the notion of similarity is explicitly controlled by the user and encoded in a positive semi-definite kernel matrix. Experiments are performed to evaluate both reconstruction and kernel alignment performance on classification tasks and on visualization of high-dimensional data. Additionally, we show that our method is capable of emulating kernel principal component analysis on a denoising task, obtaining competitive results at a much lower computational cost. |
Tasks | Denoising |
Published | 2018-07-19 |
URL | http://arxiv.org/abs/1807.07868v2 |
http://arxiv.org/pdf/1807.07868v2.pdf | |
PWC | https://paperswithcode.com/paper/the-deep-kernelized-autoencoder |
Repo | |
Framework | |
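A minimal sketch of the kernel-alignment idea: alongside the reconstruction loss, penalize the distance between the Gram matrix of the codes, CC^T, and a user-chosen positive semi-definite kernel matrix K over the same batch. The encoder/decoder shapes, the stand-in kernel, and the trade-off weight are assumptions; the paper's exact normalization may differ.

```python
# Autoencoder loss = reconstruction + alignment of code inner products with K.
import torch
import torch.nn as nn
import torch.nn.functional as F

enc = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 32))
dec = nn.Sequential(nn.Linear(32, 128), nn.ReLU(), nn.Linear(128, 784))

x = torch.randn(16, 784)
K = torch.eye(16)                       # stand-in prior kernel (e.g., from labels)

codes = enc(x)
gram = codes @ codes.T                  # inner products between codes
recon_loss = F.mse_loss(dec(codes), x)
align_loss = F.mse_loss(gram / gram.norm(), K / K.norm())   # normalized alignment
loss = recon_loss + 0.1 * align_loss    # trade-off weight is an assumption
```

Choosing K is where the user injects the desired notion of similarity; an identity kernel, as above, merely asks codes of different samples to be near-orthogonal.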
Intertemporal Connections Between Query Suggestions and Search Engine Results for Politics Related Queries
Title | Intertemporal Connections Between Query Suggestions and Search Engine Results for Politics Related Queries |
Authors | Malte Bonart, Philipp Schaer |
Abstract | This short paper deals with the combination and comparison of two data sources: Search engine results and query suggestions for 16 terms related to political candidates and parties. The data was collected before the federal election in Germany in September 2017 for a period of two months. The rank biased overlap (RBO) statistic is used to measure the similarity of the top-weighted rankings. For each search term and for both the search results and query auto-completions we study the stability of the rankings over time. |
Tasks | |
Published | 2018-12-20 |
URL | http://arxiv.org/abs/1812.08585v2 |
http://arxiv.org/pdf/1812.08585v2.pdf | |
PWC | https://paperswithcode.com/paper/intertemporal-connections-between-query |
Repo | |
Framework | |
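A compact implementation of truncated rank-biased overlap (RBO) as defined by Webber et al. (2010): RBO_k = (1 - p) * sum over d = 1..k of p^(d-1) * |S_:d ∩ T_:d| / d, where the persistence parameter p weights the top of the rankings most heavily. The paper's exact variant (e.g., extrapolation for finite lists) may differ; the example terms are invented.

```python
# Truncated RBO between two ranked lists; higher = more similar top ranks.
def rbo(s, t, p=0.9):
    k = min(len(s), len(t))
    seen_s, seen_t, score = set(), set(), 0.0
    for d in range(1, k + 1):
        seen_s.add(s[d - 1])
        seen_t.add(t[d - 1])
        overlap = len(seen_s & seen_t)            # agreement at depth d
        score += (p ** (d - 1)) * overlap / d
    return (1 - p) * score

print(rbo(["spd", "cdu", "afd"], ["cdu", "spd", "gruene"]))  # partial agreement
```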
Volatility in the Issue Attention Economy
Title | Volatility in the Issue Attention Economy |
Authors | Chico Q. Camargo, Scott A. Hale, Peter John, Helen Z. Margetts |
Abstract | Recent election surprises and regime changes have left the impression that politics has become more fast-moving and unstable. While modern politics does seem more volatile, there is little systematic evidence to support this claim. This paper seeks to address this gap in knowledge by analyzing seventy years of public opinion polls and traditional media data from the UK and Germany. These countries are good cases to study because both have experienced considerable changes in electoral behaviour and have seen new political parties emerge during the period studied. We measure volatility in public opinion and in media coverage using approaches from information theory, tracking the change in word-use patterns across over 700,000 articles. Our preliminary analysis suggests an increase in the number of opinion issues over time and a decline in the predictability of the media series from the 1970s onward. |
Tasks | |
Published | 2018-08-27 |
URL | http://arxiv.org/abs/1808.09037v1 |
http://arxiv.org/pdf/1808.09037v1.pdf | |
PWC | https://paperswithcode.com/paper/volatility-in-the-issue-attention-economy |
Repo | |
Framework | |
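A hypothetical illustration of measuring volatility information-theoretically: the Jensen-Shannon distance (the square root of the Jensen-Shannon divergence) between word-frequency distributions in consecutive time windows. The paper's precise estimator may differ; counts and vocabulary here are toy data.

```python
# Compare word-use distributions across two periods via Jensen-Shannon distance.
import numpy as np
from scipy.spatial.distance import jensenshannon

def word_dist(counts):
    counts = np.asarray(counts, dtype=float)
    return counts / counts.sum()

p_1970s = word_dist([120, 30, 5, 2])     # toy counts over a fixed vocabulary
p_1980s = word_dist([80, 50, 20, 10])
print(jensenshannon(p_1970s, p_1980s))   # higher value = bigger shift in coverage
```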
Model Adaptation with Synthetic and Real Data for Semantic Dense Foggy Scene Understanding
Title | Model Adaptation with Synthetic and Real Data for Semantic Dense Foggy Scene Understanding |
Authors | Christos Sakaridis, Dengxin Dai, Simon Hecker, Luc Van Gool |
Abstract | This work addresses the problem of semantic scene understanding under dense fog. Although considerable progress has been made in semantic scene understanding, it is mainly related to clear-weather scenes. Extending recognition methods to adverse weather conditions such as fog is crucial for outdoor applications. In this paper, we propose a novel method, named Curriculum Model Adaptation (CMAda), which gradually adapts a semantic segmentation model from light synthetic fog to dense real fog in multiple steps, using both synthetic and real foggy data. In addition, we present three other main stand-alone contributions: 1) a novel method to add synthetic fog to real, clear-weather scenes using semantic input; 2) a new fog density estimator; 3) the Foggy Zurich dataset comprising 3,808 real foggy images, with pixel-level semantic annotations for 16 images with dense fog. Our experiments show that 1) our fog simulation slightly outperforms a state-of-the-art competing simulation with respect to the task of semantic foggy scene understanding (SFSU); 2) CMAda improves the performance of state-of-the-art models for SFSU significantly by leveraging unlabeled real foggy data. The datasets and code are publicly available. |
Tasks | Scene Understanding, Semantic Segmentation |
Published | 2018-08-03 |
URL | http://arxiv.org/abs/1808.01265v1 |
http://arxiv.org/pdf/1808.01265v1.pdf | |
PWC | https://paperswithcode.com/paper/model-adaptation-with-synthetic-and-real-data |
Repo | |
Framework | |
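A hedged outline of the curriculum-adaptation loop described in the abstract: start from a model trained on lightly fogged synthetic data, then alternate between pseudo-labeling real foggy images and retraining on progressively denser fog. All helper functions are placeholders, not the released code.

```python
# Skeleton of a light-to-dense fog curriculum; `train` and `pseudo_label`
# are hypothetical callables supplied by the user.
def cmada(model, synthetic_by_density, real_foggy_by_density, train, pseudo_label):
    for step, density in enumerate(sorted(synthetic_by_density)):
        labeled = synthetic_by_density[density]          # synthetic fog + GT labels
        if step > 0:
            # Add real foggy images labeled by the previous-step model.
            labeled = labeled + pseudo_label(model, real_foggy_by_density[density])
        model = train(model, labeled)                    # adapt to denser fog
    return model
```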
Curse of Heterogeneity: Computational Barriers in Sparse Mixture Models and Phase Retrieval
Title | Curse of Heterogeneity: Computational Barriers in Sparse Mixture Models and Phase Retrieval |
Authors | Jianqing Fan, Han Liu, Zhaoran Wang, Zhuoran Yang |
Abstract | We study the fundamental tradeoffs between statistical accuracy and computational tractability in the analysis of high dimensional heterogeneous data. As examples, we study the sparse Gaussian mixture model, the mixture of sparse linear regressions, and the sparse phase retrieval model. For these models, we exploit an oracle-based computational model to establish conjecture-free computationally feasible minimax lower bounds, which quantify the minimum signal strength required for the existence of any algorithm that is both computationally tractable and statistically accurate. Our analysis shows that there exist significant gaps between computationally feasible minimax risks and classical ones. These gaps quantify the statistical price we must pay to achieve computational tractability in the presence of data heterogeneity. Our results cover the problems of detection, estimation, support recovery, and clustering, and moreover, resolve several conjectures of Azizyan et al. (2013, 2015); Verzelen and Arias-Castro (2017); Cai et al. (2016). Interestingly, our results reveal a new but counter-intuitive phenomenon in heterogeneous data analysis: more data might lead to lower computational complexity. |
Tasks | |
Published | 2018-08-21 |
URL | http://arxiv.org/abs/1808.06996v1 |
http://arxiv.org/pdf/1808.06996v1.pdf | |
PWC | https://paperswithcode.com/paper/curse-of-heterogeneity-computational-barriers |
Repo | |
Framework | |
Premise selection with neural networks and distributed representation of features
Title | Premise selection with neural networks and distributed representation of features |
Authors | Andrzej Stanisław Kucik, Konstantin Korovin |
Abstract | We present the problem of selecting relevant premises for a proof of a given statement. When stated as a binary classification task for pairs (conjecture, axiom), it can be efficiently solved using artificial neural networks. The key difference between our approach and previous ones is the use of just the functional signatures of premises. To further improve the performance of the model, we use a dimensionality reduction technique to replace long and sparse signature vectors with compact and dense embedded versions. These are obtained by first defining the concept of a context for each functor symbol, and then training a simple neural network to predict the distribution of other functor symbols in the context of this functor. After training the network, the output of its hidden layer is used to construct a lower-dimensional embedding of a functional signature (for each premise) with a distributed representation of features. This allows us to use 512-dimensional embeddings for conjecture-axiom pairs, containing enough information about the original statements to reach an accuracy of 76.45% on the premise selection task using only simple two-layer densely connected neural networks. |
Tasks | Dimensionality Reduction |
Published | 2018-07-26 |
URL | http://arxiv.org/abs/1807.10268v1 |
http://arxiv.org/pdf/1807.10268v1.pdf | |
PWC | https://paperswithcode.com/paper/premise-selection-with-neural-networks-and |
Repo | |
Framework | |
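A sketch matching the abstract's final setup: a simple two-layer densely connected network classifying 512-dimensional embeddings of (conjecture, axiom) pairs as relevant or irrelevant. The 512-dim input comes from the abstract; the hidden width and batch are assumptions.

```python
# Two-layer dense classifier over embedded conjecture-axiom pairs.
import torch
import torch.nn as nn

classifier = nn.Sequential(
    nn.Linear(512, 256), nn.ReLU(),      # hidden width is an assumption
    nn.Linear(256, 1),                   # logit: is this axiom a relevant premise?
)
pair_embedding = torch.randn(8, 512)     # embedded conjecture-axiom pairs
relevance = torch.sigmoid(classifier(pair_embedding))
```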