Paper Group ANR 1131
Unsupervised brain lesion segmentation from MRI using a convolutional autoencoder. MIDI-VAE: Modeling Dynamics and Instrumentation of Music with Applications to Style Transfer. GD-GAN: Generative Adversarial Networks for Trajectory Prediction and Group Detection in Crowds. Optimizing Channel Selection for Seizure Detection. Understanding and correcting pathologies in the training of learned optimizers. Adversarial Feature-Mapping for Speech Enhancement. ABox Abduction via Forgetting in ALC (Long Version). 3D Scene Parsing via Class-Wise Adaptation. Automatic Seismic Salt Interpretation with Deep Convolutional Neural Networks. The Deep Kernelized Autoencoder. Intertemporal Connections Between Query Suggestions and Search Engine Results for Politics Related Queries. Volatility in the Issue Attention Economy. Model Adaptation with Synthetic and Real Data for Semantic Dense Foggy Scene Understanding. Curse of Heterogeneity: Computational Barriers in Sparse Mixture Models and Phase Retrieval. Premise selection with neural networks and distributed representation of features.
Unsupervised brain lesion segmentation from MRI using a convolutional autoencoder
Title | Unsupervised brain lesion segmentation from MRI using a convolutional autoencoder |
Authors | Hans E. Atlason, Askell Love, Sigurdur Sigurdsson, Vilmundur Gudnason, Lotta M. Ellingsen |
Abstract | Lesions that appear hyperintense in both Fluid Attenuated Inversion Recovery (FLAIR) and T2-weighted magnetic resonance images (MRIs) of the human brain are common in the brains of the elderly population and may be caused by ischemia or demyelination. Lesions are biomarkers for various neurodegenerative diseases, making accurate quantification of them important for both disease diagnosis and progression. Automatic lesion detection using supervised learning requires manually annotated images, which can often be impractical to acquire. Unsupervised lesion detection, on the other hand, does not require any manual delineation; however, these methods can be challenging to construct due to the variability in lesion load, placement of lesions, and voxel intensities. Here we present a novel approach to address this problem using a convolutional autoencoder, which learns to segment brain lesions as well as the white matter, gray matter, and cerebrospinal fluid by reconstructing FLAIR images as conical combinations of softmax layer outputs generated from the corresponding T1, T2, and FLAIR images. Some of the advantages of this model are that it accurately learns to segment lesions regardless of lesion load, and it can be used to quickly and robustly segment new images that were not in the training set. Comparisons with state-of-the-art segmentation methods evaluated on ground truth manual labels indicate that the proposed method works well for generating accurate lesion segmentations without the need for manual annotations. |
Tasks | Brain Lesion Segmentation From MRI, Lesion Segmentation |
Published | 2018-11-23 |
URL | http://arxiv.org/abs/1811.09655v1 |
http://arxiv.org/pdf/1811.09655v1.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-brain-lesion-segmentation-from |
Repo | |
Framework | |
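A minimal sketch (not the authors' released code) of the core reconstruction idea from the abstract: the network emits softmax tissue maps (white matter, gray matter, CSF, lesion), and the FLAIR image is reconstructed as a conical combination, i.e. a non-negative weighted sum, of those maps. Layer sizes, patch shapes, and the softplus parameterization of the weights are assumptions for illustration.

```python
# Hypothetical sketch: reconstruct FLAIR as a conical (non-negative weighted)
# combination of softmax segmentation maps. Shapes and sizes are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConicalSegAE(nn.Module):
    def __init__(self, in_channels=3, n_classes=4):  # T1, T2, FLAIR -> WM, GM, CSF, lesion
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(in_channels, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, n_classes, 3, padding=1),
        )
        # Softplus keeps the mixing coefficients non-negative (conical combination).
        self.raw_weights = nn.Parameter(torch.zeros(n_classes))

    def forward(self, x):
        seg = F.softmax(self.encoder(x), dim=1)          # (B, K, H, W) tissue maps
        w = F.softplus(self.raw_weights)                 # K non-negative coefficients
        recon = (seg * w.view(1, -1, 1, 1)).sum(dim=1)   # reconstructed FLAIR (B, H, W)
        return seg, recon

model = ConicalSegAE()
x = torch.randn(2, 3, 64, 64)                            # stacked T1/T2/FLAIR patches
seg, recon = model(x)
loss = F.mse_loss(recon, x[:, 2])                        # match the FLAIR channel
```

Because reconstruction is the only training signal, no manual lesion labels are needed, which is the unsupervised property the abstract emphasizes.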
MIDI-VAE: Modeling Dynamics and Instrumentation of Music with Applications to Style Transfer
Title | MIDI-VAE: Modeling Dynamics and Instrumentation of Music with Applications to Style Transfer |
Authors | Gino Brunner, Andres Konrad, Yuyi Wang, Roger Wattenhofer |
Abstract | We introduce MIDI-VAE, a neural network model based on Variational Autoencoders that is capable of handling polyphonic music with multiple instrument tracks, as well as modeling the dynamics of music by incorporating note durations and velocities. We show that MIDI-VAE can perform style transfer on symbolic music by automatically changing pitches, dynamics and instruments of a music piece from, e.g., a Classical to a Jazz style. We evaluate the efficacy of the style transfer by training separate style validation classifiers. Our model can also interpolate between short pieces of music, produce medleys and create mixtures of entire songs. The interpolations smoothly change pitches, dynamics and instrumentation to create a harmonic bridge between two music pieces. To the best of our knowledge, this work represents the first successful attempt at applying neural style transfer to complete musical compositions. |
Tasks | Style Transfer |
Published | 2018-09-20 |
URL | http://arxiv.org/abs/1809.07600v1 |
http://arxiv.org/pdf/1809.07600v1.pdf | |
PWC | https://paperswithcode.com/paper/midi-vae-modeling-dynamics-and |
Repo | |
Framework | |
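A toy, heavily simplified skeleton in the spirit of the abstract: one shared latent code with separate reconstruction heads for pitch, velocity (dynamics), and instrument rolls. The real MIDI-VAE uses recurrent encoders/decoders and style labels; every size and layer here is an assumption.

```python
# Hypothetical mini-VAE with per-aspect decoder heads (pitch / velocity / instrument).
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyMusicVAE(nn.Module):
    def __init__(self, in_dim=3 * 128, z_dim=32):
        super().__init__()
        self.enc = nn.Linear(in_dim, 2 * z_dim)              # -> (mu, logvar)
        self.dec_pitch = nn.Linear(z_dim, 128)
        self.dec_vel = nn.Linear(z_dim, 128)
        self.dec_instr = nn.Linear(z_dim, 128)

    def forward(self, x):
        mu, logvar = self.enc(x).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterization trick
        return self.dec_pitch(z), self.dec_vel(z), self.dec_instr(z), mu, logvar

def vae_loss(x_pitch, x_vel, x_instr, out):
    p, v, i, mu, logvar = out
    rec = F.mse_loss(p, x_pitch) + F.mse_loss(v, x_vel) + F.mse_loss(i, x_instr)
    kld = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return rec + kld

pitch, vel, instr = torch.rand(4, 128), torch.rand(4, 128), torch.rand(4, 128)
loss = vae_loss(pitch, vel, instr, TinyMusicVAE()(torch.cat([pitch, vel, instr], dim=-1)))
```

Interpolating between the latent codes of two pieces is what yields the "harmonic bridge" effect the abstract describes.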
GD-GAN: Generative Adversarial Networks for Trajectory Prediction and Group Detection in Crowds
Title | GD-GAN: Generative Adversarial Networks for Trajectory Prediction and Group Detection in Crowds |
Authors | Tharindu Fernando, Simon Denman, Sridha Sridharan, Clinton Fookes |
Abstract | This paper presents a novel deep learning framework for human trajectory prediction and detecting social group membership in crowds. We introduce a generative adversarial pipeline which preserves the spatio-temporal structure of the pedestrian’s neighbourhood, enabling us to extract relevant attributes describing their social identity. We formulate the group detection task as an unsupervised learning problem, obviating the need for supervised learning of group memberships via hand labeled databases, allowing us to directly employ the proposed framework in different surveillance settings. We evaluate the proposed trajectory prediction and group detection frameworks on multiple public benchmarks, and for both tasks the proposed method demonstrates its capability to better anticipate human sociological behaviour compared to the existing state-of-the-art methods. |
Tasks | Group Detection In Crowds, Trajectory Prediction |
Published | 2018-12-18 |
URL | http://arxiv.org/abs/1812.07667v1 |
http://arxiv.org/pdf/1812.07667v1.pdf | |
PWC | https://paperswithcode.com/paper/gd-gan-generative-adversarial-networks-for |
Repo | |
Framework | |
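A hypothetical illustration of the unsupervised group-detection step only: cluster per-pedestrian trajectory features without any group labels. The paper derives such features from its trajectory-prediction GAN; here stand-in vectors and DBSCAN parameters are assumptions.

```python
# Cluster stand-in trajectory embeddings; pedestrians sharing a cluster id are
# predicted to form one social group. No supervised group labels are used.
import numpy as np
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(0)
features = np.vstack([
    rng.normal(loc=0.0, scale=0.1, size=(4, 16)),   # walking group A
    rng.normal(loc=1.0, scale=0.1, size=(5, 16)),   # walking group B
    rng.normal(loc=3.0, scale=0.1, size=(3, 16)),   # walking group C
])
labels = DBSCAN(eps=0.8, min_samples=2).fit_predict(features)
print(labels)  # e.g., [0 0 0 0 1 1 1 1 1 2 2 2]
```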
Optimizing Channel Selection for Seizure Detection
Title | Optimizing Channel Selection for Seizure Detection |
Authors | Vinit Shah, Meysam Golmohammadi, Saeedeh Ziyabari, Eva Von Weltin, Iyad Obeid, Joseph Picone |
Abstract | Interpretation of electroencephalogram (EEG) signals can be complicated by obfuscating artifacts. Artifact detection plays an important role in the observation and analysis of EEG signals. Spatial information contained in the placement of the electrodes can be exploited to accurately detect artifacts. However, when fewer electrodes are used, less spatial information is available, making it harder to detect artifacts. In this study, we investigate the performance of a deep learning algorithm, CNN-LSTM, on several channel configurations. Each configuration was designed to minimize the amount of spatial information lost compared to a standard 22-channel EEG. Systems using a reduced number of channels ranging from 8 to 20 achieved sensitivities between 33% and 37% with false alarms in the range of [38, 50] per 24 hours. False alarms increased dramatically (e.g., over 300 per 24 hours) when the number of channels was further reduced. Baseline performance of a system that used all 22 channels was 39% sensitivity with 23 false alarms. Since the 22-channel system was the only system that included referential channels, the rapid increase in the false alarm rate as the number of channels was reduced underscores the importance of retaining referential channels for artifact reduction. This cautionary result is important because one of the biggest differences between various types of EEGs administered is the type of referential channel used. |
Tasks | EEG, Seizure Detection |
Published | 2018-01-03 |
URL | http://arxiv.org/abs/1801.02472v1 |
http://arxiv.org/pdf/1801.02472v1.pdf | |
PWC | https://paperswithcode.com/paper/optimizing-channel-selection-for-seizure |
Repo | |
Framework | |
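An illustrative (not the authors') CNN-LSTM for multichannel EEG windows: a small convolutional front end followed by an LSTM over time, parameterized by channel count so the same code can be re-run with 8-, 16-, 20-, or 22-channel montages. Kernel sizes, hidden widths, and window lengths are assumptions.

```python
# Toy CNN-LSTM whose input channel count mirrors the EEG montage under test.
import torch
import torch.nn as nn

class CNNLSTM(nn.Module):
    def __init__(self, n_channels=22, hidden=64, n_classes=2):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(n_channels, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.MaxPool1d(2),
        )
        self.lstm = nn.LSTM(input_size=32, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x):                 # x: (batch, channels, samples)
        h = self.conv(x)                  # (batch, 32, samples / 2)
        h = h.transpose(1, 2)             # (batch, time, features) for the LSTM
        out, _ = self.lstm(h)
        return self.head(out[:, -1])      # classify from the last time step

for n_ch in (8, 16, 20, 22):              # compare reduced montages
    logits = CNNLSTM(n_channels=n_ch)(torch.randn(4, n_ch, 256))
```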
Understanding and correcting pathologies in the training of learned optimizers
Title | Understanding and correcting pathologies in the training of learned optimizers |
Authors | Luke Metz, Niru Maheswaranathan, Jeremy Nixon, C. Daniel Freeman, Jascha Sohl-Dickstein |
Abstract | Deep learning has shown that learned functions can dramatically outperform hand-designed functions on perceptual tasks. Analogously, this suggests that learned optimizers may similarly outperform current hand-designed optimizers, especially for specific problems. However, learned optimizers are notoriously difficult to train and have yet to demonstrate wall-clock speedups over hand-designed optimizers, and thus are rarely used in practice. Typically, learned optimizers are trained by truncated backpropagation through an unrolled optimization process resulting in gradients that are either strongly biased (for short truncations) or have exploding norm (for long truncations). In this work we propose a training scheme which overcomes both of these difficulties, by dynamically weighting two unbiased gradient estimators for a variational loss on optimizer performance, allowing us to train neural networks to perform optimization of a specific task faster than tuned first-order methods. We demonstrate these results on problems where our learned optimizer trains convolutional networks faster in wall-clock time compared to tuned first-order methods and with an improvement in test loss. |
Tasks | |
Published | 2018-10-24 |
URL | https://arxiv.org/abs/1810.10180v5 |
https://arxiv.org/pdf/1810.10180v5.pdf | |
PWC | https://paperswithcode.com/paper/understanding-and-correcting-pathologies-in |
Repo | |
Framework | |
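A sketch of the key variance-reduction idea: merge two unbiased gradient estimators (e.g., a reparameterization gradient and an evolution-strategies gradient) with inverse-variance weights, so the combination stays unbiased but has lower variance than either alone. The toy estimators below are stand-ins, not the paper's actual gradients.

```python
# Inverse-variance combination of two unbiased gradient estimators.
import numpy as np

def combine_unbiased(grads_a, grads_b, eps=1e-8):
    """grads_a, grads_b: (n_samples, dim) samples from two unbiased estimators."""
    var_a = grads_a.var(axis=0) + eps
    var_b = grads_b.var(axis=0) + eps
    w_a = (1.0 / var_a) / (1.0 / var_a + 1.0 / var_b)   # per-coordinate weights
    return w_a * grads_a.mean(axis=0) + (1.0 - w_a) * grads_b.mean(axis=0)

rng = np.random.default_rng(1)
true_grad = np.ones(4)
g_rp = true_grad + rng.normal(scale=0.1, size=(32, 4))   # low-variance estimator
g_es = true_grad + rng.normal(scale=1.0, size=(32, 4))   # high-variance estimator
print(combine_unbiased(g_rp, g_es))                      # close to [1, 1, 1, 1]
```

Because each estimator is unbiased, any convex combination of their means is unbiased too; weighting by inverse variance is what keeps the result well-behaved across both short and long unrolls.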
Adversarial Feature-Mapping for Speech Enhancement
Title | Adversarial Feature-Mapping for Speech Enhancement |
Authors | Zhong Meng, Jinyu Li, Yifan Gong, Biing-Hwang Juang |
Abstract | Feature-mapping with deep neural networks is commonly used for single-channel speech enhancement, in which a feature-mapping network directly transforms the noisy features to the corresponding enhanced ones and is trained to minimize the mean square errors between the enhanced and clean features. In this paper, we propose an adversarial feature-mapping (AFM) method for speech enhancement which advances the feature-mapping approach with adversarial learning. An additional discriminator network is introduced to distinguish the enhanced features from the real clean ones. The two networks are jointly optimized to minimize the feature-mapping loss and simultaneously mini-maximize the discrimination loss. The distribution of the enhanced features is further pushed towards that of the clean features through this adversarial multi-task training. To achieve better performance on the ASR task, senone-aware (SA) AFM is further proposed, in which an acoustic model network is jointly trained with the feature-mapping and discriminator networks to optimize the senone classification loss in addition to the AFM losses. Evaluated on the CHiME-3 dataset, the proposed AFM achieves 16.95% and 5.27% relative word error rate (WER) improvements over the real noisy data and the feature-mapping baseline, respectively, and the SA-AFM achieves a 9.85% relative WER improvement over the multi-conditional acoustic model. |
Tasks | Speech Enhancement |
Published | 2018-09-06 |
URL | http://arxiv.org/abs/1809.02251v2 |
http://arxiv.org/pdf/1809.02251v2.pdf | |
PWC | https://paperswithcode.com/paper/adversarial-feature-mapping-for-speech |
Repo | |
Framework | |
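A hedged sketch of the AFM objective as described in the abstract: a feature-mapping network minimizes an MSE loss toward clean features while also fooling a discriminator that separates enhanced from clean features. Network shapes, the feature dimension, and the trade-off weight are placeholders.

```python
# Construct the two AFM losses; in practice the mapping and discriminator
# steps are optimized alternately.
import torch
import torch.nn as nn
import torch.nn.functional as F

feat_dim = 40                                   # e.g., log-mel features (assumed)
mapper = nn.Sequential(nn.Linear(feat_dim, 128), nn.ReLU(), nn.Linear(128, feat_dim))
disc = nn.Sequential(nn.Linear(feat_dim, 64), nn.ReLU(), nn.Linear(64, 1))

noisy = torch.randn(8, feat_dim)
clean = torch.randn(8, feat_dim)
enhanced = mapper(noisy)

# Discriminator step: real = clean, fake = enhanced (detached).
d_loss = F.binary_cross_entropy_with_logits(disc(clean), torch.ones(8, 1)) + \
         F.binary_cross_entropy_with_logits(disc(enhanced.detach()), torch.zeros(8, 1))

# Mapping step: feature-mapping MSE plus an adversarial term that fools D.
lam = 0.1                                       # trade-off weight (assumed)
fm_loss = F.mse_loss(enhanced, clean) + \
          lam * F.binary_cross_entropy_with_logits(disc(enhanced), torch.ones(8, 1))
```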
ABox Abduction via Forgetting in ALC (Long Version)
Title | ABox Abduction via Forgetting in ALC (Long Version) |
Authors | Warren Del-Pinto, Renate A. Schmidt |
Abstract | Abductive reasoning generates explanatory hypotheses for new observations using prior knowledge. This paper investigates the use of forgetting, also known as uniform interpolation, to perform ABox abduction in description logic (ALC) ontologies. Non-abducibles are specified by a forgetting signature which can contain concept, but not role, symbols. The resulting hypotheses are semantically minimal and each consist of a set of disjuncts. These disjuncts are each independent explanations, and are not redundant with respect to the background ontology or the other disjuncts, representing a form of hypothesis space. The observations and hypotheses handled by the method can contain either atomic or complex ALC concepts, excluding role assertions, and are not restricted to Horn clauses. Two approaches to redundancy elimination are explored for practical use: full and approximate. Using a prototype implementation, experiments were performed over a corpus of real-world ontologies to investigate the practicality of both approaches across several settings. |
Tasks | |
Published | 2018-11-13 |
URL | http://arxiv.org/abs/1811.05420v1 |
http://arxiv.org/pdf/1811.05420v1.pdf | |
PWC | https://paperswithcode.com/paper/abox-abduction-via-forgetting-in-alc-long |
Repo | |
Framework | |
3D Scene Parsing via Class-Wise Adaptation
Title | 3D Scene Parsing via Class-Wise Adaptation |
Authors | Daichi Ono, Hiroyuki Yabe, Tsutomu Horikawa |
Abstract | We propose a method that uses only computer graphics datasets to parse real-world 3D scenes. 3D scene parsing based on semantic segmentation is required to implement categorical interaction in the virtual world. Convolutional Neural Networks (CNNs) have recently shown state-of-the-art performance on computer vision tasks, including semantic segmentation. However, training CNNs requires collecting and annotating a huge amount of data. Especially in the case of semantic segmentation, pixel-by-pixel annotation takes a significant amount of time and is error-prone. In contrast, computer graphics can generate large amounts of accurately annotated data and scale up easily by changing camera positions, textures, and lights. Despite these advantages, models trained on computer graphics datasets perform poorly on real data, a problem known as domain shift. To address this issue, we first show that a depth modality and synthetic noise are effective in reducing the domain shift. We then develop class-wise adaptation, which obtains domain-invariant CNN features. To further reduce the domain shift, we create computer graphics rooms with many props and provide photo-realistic rendered images. We also demonstrate an application that combines semantic segmentation with Simultaneous Localization and Mapping (SLAM) and performs accurate 3D scene parsing in real time in an actual room. |
Tasks | Scene Parsing, Semantic Segmentation, Simultaneous Localization and Mapping |
Published | 2018-12-10 |
URL | http://arxiv.org/abs/1812.03622v2 |
http://arxiv.org/pdf/1812.03622v2.pdf | |
PWC | https://paperswithcode.com/paper/3d-scene-parsing-via-class-wise-adaptation |
Repo | |
Framework | |
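One plausible, simplified form of class-wise alignment, not necessarily the authors' exact loss: match class-conditional feature statistics between the synthetic (CG) and real domains, using predicted class probabilities as soft masks so that no real-domain labels are required. Dimensions and the statistic (soft per-class means) are assumptions.

```python
# Soft class-conditional mean matching between synthetic and real features.
import torch

def classwise_alignment(feat_syn, prob_syn, feat_real, prob_real, eps=1e-6):
    """feat_*: (N, D) features; prob_*: (N, K) softmax class probabilities."""
    mean_syn = (prob_syn.T @ feat_syn) / (prob_syn.sum(0, keepdim=True).T + eps)
    mean_real = (prob_real.T @ feat_real) / (prob_real.sum(0, keepdim=True).T + eps)
    return ((mean_syn - mean_real) ** 2).mean()   # penalize per-class mismatch

loss = classwise_alignment(torch.randn(100, 64), torch.rand(100, 13).softmax(-1),
                           torch.randn(80, 64), torch.rand(80, 13).softmax(-1))
```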
Automatic Seismic Salt Interpretation with Deep Convolutional Neural Networks
Title | Automatic Seismic Salt Interpretation with Deep Convolutional Neural Networks |
Authors | Yu Zeng, Kebei Jiang, Jie Chen |
Abstract | One of the most crucial tasks in seismic reflection imaging is to identify salt bodies with high precision. Traditionally, this is accomplished by visually picking the salt/sediment boundaries, which requires a great amount of manual work and may introduce systematic bias. With recent progress in deep learning algorithms and growing computational power, a great deal of effort has been made to replace human effort with machine power in salt body interpretation. Currently, Convolutional Neural Networks (CNNs) are revolutionizing the computer vision field and have become a hot topic in image analysis. In this paper, the benefits of CNN-based classification are demonstrated by using a state-of-the-art network structure, U-Net, along with the residual learning framework ResNet, to delineate salt bodies with high precision. Network adjustments, including the Exponential Linear Unit (ELU) activation function, the Lovász-Softmax loss function, and stratified K-fold cross-validation, have been deployed to further improve prediction accuracy. Preliminary results using SEG Advanced Modeling (SEAM) data show good agreement between the predicted and manually interpreted salt bodies, especially in areas with weak reflections. This indicates the great potential of applying CNNs for salt-related interpretation. |
Tasks | |
Published | 2018-11-24 |
URL | http://arxiv.org/abs/1812.01101v1 |
http://arxiv.org/pdf/1812.01101v1.pdf | |
PWC | https://paperswithcode.com/paper/automatic-seismic-salt-interpretation-with |
Repo | |
Framework | |
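A sketch of one concrete detail from the abstract: stratified K-fold cross-validation, which keeps the salt/no-salt class balance similar across folds. The toy data, patch shape, and fold count are placeholders; the paper trains a U-Net/ResNet inside each fold.

```python
# Stratified 5-fold split over toy seismic patches labeled salt / no-salt.
import numpy as np
from sklearn.model_selection import StratifiedKFold

X = np.random.rand(200, 101 * 101)              # flattened patches (toy data)
y = (np.random.rand(200) > 0.5).astype(int)     # 1 = patch contains salt

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for fold, (train_idx, val_idx) in enumerate(skf.split(X, y)):
    # Train the segmentation model on X[train_idx]; validate on X[val_idx].
    print(f"fold {fold}: train={len(train_idx)} val={len(val_idx)} "
          f"salt ratio={y[val_idx].mean():.2f}")
```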
The Deep Kernelized Autoencoder
Title | The Deep Kernelized Autoencoder |
Authors | Michael Kampffmeyer, Sigurd Løkse, Filippo M. Bianchi, Robert Jenssen, Lorenzo Livi |
Abstract | Autoencoders learn data representations (codes) in such a way that the input is reproduced at the output of the network. However, it is not always clear what kind of properties of the input data need to be captured by the codes. Kernel machines have experienced great success by operating via inner-products in a theoretically well-defined reproducing kernel Hilbert space, hence capturing topological properties of input data. In this paper, we enhance the autoencoder’s ability to learn effective data representations by aligning inner products between codes with respect to a kernel matrix. By doing so, the proposed kernelized autoencoder allows learning similarity-preserving embeddings of input data, where the notion of similarity is explicitly controlled by the user and encoded in a positive semi-definite kernel matrix. Experiments are performed to evaluate both reconstruction and kernel alignment performance on classification tasks and on visualization of high-dimensional data. Additionally, we show that our method is capable of emulating kernel principal component analysis on a denoising task, obtaining competitive results at a much lower computational cost. |
Tasks | Denoising |
Published | 2018-07-19 |
URL | http://arxiv.org/abs/1807.07868v2 |
http://arxiv.org/pdf/1807.07868v2.pdf | |
PWC | https://paperswithcode.com/paper/the-deep-kernelized-autoencoder |
Repo | |
Framework | |
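A minimal sketch of the kernel-alignment idea: alongside the reconstruction loss, penalize the distance between the Gram matrix of the codes, CC^T, and a user-chosen positive semi-definite kernel matrix K over the same batch. The encoder/decoder shapes, the stand-in kernel, and the trade-off weight are assumptions; the paper's exact normalization may differ.

```python
# Autoencoder loss = reconstruction + alignment of code inner products with K.
import torch
import torch.nn as nn
import torch.nn.functional as F

enc = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 32))
dec = nn.Sequential(nn.Linear(32, 128), nn.ReLU(), nn.Linear(128, 784))

x = torch.randn(16, 784)
K = torch.eye(16)                       # stand-in prior kernel (e.g., from labels)

codes = enc(x)
gram = codes @ codes.T                  # inner products between codes
recon_loss = F.mse_loss(dec(codes), x)
align_loss = F.mse_loss(gram / gram.norm(), K / K.norm())   # normalized alignment
loss = recon_loss + 0.1 * align_loss    # trade-off weight is an assumption
```

Choosing K is where the user injects the desired notion of similarity; an identity kernel, as above, merely asks codes of different samples to be near-orthogonal.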
Intertemporal Connections Between Query Suggestions and Search Engine Results for Politics Related Queries
Title | Intertemporal Connections Between Query Suggestions and Search Engine Results for Politics Related Queries |
Authors | Malte Bonart, Philipp Schaer |
Abstract | This short paper deals with the combination and comparison of two data sources: Search engine results and query suggestions for 16 terms related to political candidates and parties. The data was collected before the federal election in Germany in September 2017 for a period of two months. The rank biased overlap (RBO) statistic is used to measure the similarity of the top-weighted rankings. For each search term and for both the search results and query auto-completions we study the stability of the rankings over time. |
Tasks | |
Published | 2018-12-20 |
URL | http://arxiv.org/abs/1812.08585v2 |
http://arxiv.org/pdf/1812.08585v2.pdf | |
PWC | https://paperswithcode.com/paper/intertemporal-connections-between-query |
Repo | |
Framework | |
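A compact implementation of truncated rank-biased overlap (RBO) as defined by Webber et al. (2010): RBO_k = (1 - p) * sum over d = 1..k of p^(d-1) * |S_:d ∩ T_:d| / d, where the persistence parameter p weights the top of the rankings most heavily. The paper's exact variant (e.g., extrapolation for finite lists) may differ; the example terms are invented.

```python
# Truncated RBO between two ranked lists; higher = more similar top ranks.
def rbo(s, t, p=0.9):
    k = min(len(s), len(t))
    seen_s, seen_t, score = set(), set(), 0.0
    for d in range(1, k + 1):
        seen_s.add(s[d - 1])
        seen_t.add(t[d - 1])
        overlap = len(seen_s & seen_t)            # agreement at depth d
        score += (p ** (d - 1)) * overlap / d
    return (1 - p) * score

print(rbo(["spd", "cdu", "afd"], ["cdu", "spd", "gruene"]))  # partial agreement
```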
Volatility in the Issue Attention Economy
Title | Volatility in the Issue Attention Economy |
Authors | Chico Q. Camargo, Scott A. Hale, Peter John, Helen Z. Margetts |
Abstract | Recent election surprises and regime changes have left the impression that politics has become more fast-moving and unstable. While modern politics does seem more volatile, there is little systematic evidence to support this claim. This paper seeks to address this gap in knowledge by analyzing seventy years of public opinion polls and traditional media data from the UK and Germany. These countries are good cases to study because both have experienced considerable changes in electoral behaviour and have seen new political parties emerge during the period studied. We measure volatility in public opinion and in media coverage using approaches from information theory, tracking the change in word-use patterns across over 700,000 articles. Our preliminary analysis suggests an increase in the number of opinion issues over time and a decline in the predictability of the media series from the 1970s onward. |
Tasks | |
Published | 2018-08-27 |
URL | http://arxiv.org/abs/1808.09037v1 |
http://arxiv.org/pdf/1808.09037v1.pdf | |
PWC | https://paperswithcode.com/paper/volatility-in-the-issue-attention-economy |
Repo | |
Framework | |
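A hypothetical illustration of measuring volatility information-theoretically: the Jensen-Shannon distance (the square root of the Jensen-Shannon divergence) between word-frequency distributions in consecutive time windows. The paper's precise estimator may differ; counts and vocabulary here are toy data.

```python
# Compare word-use distributions across two periods via Jensen-Shannon distance.
import numpy as np
from scipy.spatial.distance import jensenshannon

def word_dist(counts):
    counts = np.asarray(counts, dtype=float)
    return counts / counts.sum()

p_1970s = word_dist([120, 30, 5, 2])     # toy counts over a fixed vocabulary
p_1980s = word_dist([80, 50, 20, 10])
print(jensenshannon(p_1970s, p_1980s))   # higher value = bigger shift in coverage
```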
Model Adaptation with Synthetic and Real Data for Semantic Dense Foggy Scene Understanding
Title | Model Adaptation with Synthetic and Real Data for Semantic Dense Foggy Scene Understanding |
Authors | Christos Sakaridis, Dengxin Dai, Simon Hecker, Luc Van Gool |
Abstract | This work addresses the problem of semantic scene understanding under dense fog. Although considerable progress has been made in semantic scene understanding, it is mainly related to clear-weather scenes. Extending recognition methods to adverse weather conditions such as fog is crucial for outdoor applications. In this paper, we propose a novel method, named Curriculum Model Adaptation (CMAda), which gradually adapts a semantic segmentation model from light synthetic fog to dense real fog in multiple steps, using both synthetic and real foggy data. In addition, we present three other main stand-alone contributions: 1) a novel method to add synthetic fog to real, clear-weather scenes using semantic input; 2) a new fog density estimator; 3) the Foggy Zurich dataset comprising 3,808 real foggy images, with pixel-level semantic annotations for 16 images with dense fog. Our experiments show that 1) our fog simulation slightly outperforms a state-of-the-art competing simulation with respect to the task of semantic foggy scene understanding (SFSU); 2) CMAda improves the performance of state-of-the-art models for SFSU significantly by leveraging unlabeled real foggy data. The datasets and code are publicly available. |
Tasks | Scene Understanding, Semantic Segmentation |
Published | 2018-08-03 |
URL | http://arxiv.org/abs/1808.01265v1 |
http://arxiv.org/pdf/1808.01265v1.pdf | |
PWC | https://paperswithcode.com/paper/model-adaptation-with-synthetic-and-real-data |
Repo | |
Framework | |
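A hedged outline of the curriculum-adaptation loop described in the abstract: start from a model trained on lightly fogged synthetic data, then alternate between pseudo-labeling real foggy images and retraining on progressively denser fog. All helper functions are placeholders, not the released code.

```python
# Skeleton of a light-to-dense fog curriculum; `train` and `pseudo_label`
# are hypothetical callables supplied by the user.
def cmada(model, synthetic_by_density, real_foggy_by_density, train, pseudo_label):
    for step, density in enumerate(sorted(synthetic_by_density)):
        labeled = synthetic_by_density[density]          # synthetic fog + GT labels
        if step > 0:
            # Add real foggy images labeled by the previous-step model.
            labeled = labeled + pseudo_label(model, real_foggy_by_density[density])
        model = train(model, labeled)                    # adapt to denser fog
    return model
```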
Curse of Heterogeneity: Computational Barriers in Sparse Mixture Models and Phase Retrieval
Title | Curse of Heterogeneity: Computational Barriers in Sparse Mixture Models and Phase Retrieval |
Authors | Jianqing Fan, Han Liu, Zhaoran Wang, Zhuoran Yang |
Abstract | We study the fundamental tradeoffs between statistical accuracy and computational tractability in the analysis of high dimensional heterogeneous data. As examples, we study the sparse Gaussian mixture model, the mixture of sparse linear regressions, and the sparse phase retrieval model. For these models, we exploit an oracle-based computational model to establish conjecture-free computationally feasible minimax lower bounds, which quantify the minimum signal strength required for the existence of any algorithm that is both computationally tractable and statistically accurate. Our analysis shows that there exist significant gaps between computationally feasible minimax risks and classical ones. These gaps quantify the statistical price we must pay to achieve computational tractability in the presence of data heterogeneity. Our results cover the problems of detection, estimation, support recovery, and clustering, and moreover, resolve several conjectures of Azizyan et al. (2013, 2015); Verzelen and Arias-Castro (2017); Cai et al. (2016). Interestingly, our results reveal a new but counter-intuitive phenomenon in heterogeneous data analysis: more data might lead to lower computational complexity. |
Tasks | |
Published | 2018-08-21 |
URL | http://arxiv.org/abs/1808.06996v1 |
http://arxiv.org/pdf/1808.06996v1.pdf | |
PWC | https://paperswithcode.com/paper/curse-of-heterogeneity-computational-barriers |
Repo | |
Framework | |
Premise selection with neural networks and distributed representation of features
Title | Premise selection with neural networks and distributed representation of features |
Authors | Andrzej Stanisław Kucik, Konstantin Korovin |
Abstract | We present the problem of selecting relevant premises for a proof of a given statement. When stated as a binary classification task for pairs (conjecture, axiom), it can be efficiently solved using artificial neural networks. The key difference between our approach and previous ones is the use of just the functional signatures of premises. To further improve the performance of the model, we use a dimensionality reduction technique to replace long and sparse signature vectors with compact and dense embedded versions. These are obtained by first defining the concept of a context for each functor symbol, and then training a simple neural network to predict the distribution of other functor symbols in the context of this functor. After training the network, the output of its hidden layer is used to construct a lower-dimensional embedding of a functional signature (for each premise) with a distributed representation of features. This allows us to use 512-dimensional embeddings for conjecture-axiom pairs, containing enough information about the original statements to reach an accuracy of 76.45% on the premise selection task using only simple two-layer densely connected neural networks. |
Tasks | Dimensionality Reduction |
Published | 2018-07-26 |
URL | http://arxiv.org/abs/1807.10268v1 |
http://arxiv.org/pdf/1807.10268v1.pdf | |
PWC | https://paperswithcode.com/paper/premise-selection-with-neural-networks-and |
Repo | |
Framework | |
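A sketch matching the abstract's final setup: a simple two-layer densely connected network classifying 512-dimensional embeddings of (conjecture, axiom) pairs as relevant or irrelevant. The 512-dim input comes from the abstract; the hidden width and batch are assumptions.

```python
# Two-layer dense classifier over embedded conjecture-axiom pairs.
import torch
import torch.nn as nn

classifier = nn.Sequential(
    nn.Linear(512, 256), nn.ReLU(),      # hidden width is an assumption
    nn.Linear(256, 1),                   # logit: is this axiom a relevant premise?
)
pair_embedding = torch.randn(8, 512)     # embedded conjecture-axiom pairs
relevance = torch.sigmoid(classifier(pair_embedding))
```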