July 29, 2019


Paper Group AWR 193


Brain Extraction from Normal and Pathological Images: A Joint PCA/Image-Reconstruction Approach

Title Brain Extraction from Normal and Pathological Images: A Joint PCA/Image-Reconstruction Approach
Authors Xu Han, Roland Kwitt, Stephen Aylward, Spyridon Bakas, Bjoern Menze, Alexander Asturias, Paul Vespa, John Van Horn, Marc Niethammer
Abstract Brain extraction from images is a common pre-processing step. Many approaches exist, but they are frequently only designed to perform brain extraction from images without strong pathologies. Extracting the brain from images with strong pathologies, for example, the presence of a tumor or of a traumatic brain injury, is challenging. In such cases, tissue appearance may deviate from that of normal tissue and violate the algorithmic assumptions of these approaches; hence, the brain may not be correctly extracted. This paper proposes a brain extraction approach which can explicitly account for pathologies by jointly modeling normal tissue and pathologies. Specifically, our model uses a three-part image decomposition: (1) normal tissue appearance is captured by principal component analysis, (2) pathologies are captured via a total variation term, and (3) non-brain tissue is captured by a sparse term. Decomposition and image registration steps are alternated to allow statistical modeling in a fixed atlas space. As a beneficial side effect, the model allows for the identification of potential pathologies and the reconstruction of a quasi-normal image in atlas space. We demonstrate the effectiveness of our method on four datasets: the IBSR and LPBA40 datasets, which contain normal images; the BRATS dataset, which contains images with brain tumors; and a dataset of clinical TBI images. We compare the performance with other popular models: ROBEX, BEaST, MASS, BET, BSE and a recently proposed deep learning approach. Our model performs better than these competing methods on all four datasets. Specifically, our model achieves the best median (97.11) and mean (96.88) Dice scores over all datasets. The two best performing competitors, ROBEX and MASS, achieve scores of 96.23/95.62 and 96.67/94.25 respectively. Hence, our approach is an effective method for high quality brain extraction on a wide variety of images.
Tasks Image Reconstruction, Image Registration
Published 2017-11-15
URL http://arxiv.org/abs/1711.05702v2
PDF http://arxiv.org/pdf/1711.05702v2.pdf
PWC https://paperswithcode.com/paper/brain-extraction-from-normal-and-pathological
Repo https://github.com/uncbiag/pstrip
Framework none
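
The PCA component of the decomposition is easy to illustrate in isolation. Below is a minimal numpy sketch on synthetic data: model normal appearance with a PCA basis, reconstruct a quasi-normal image, and flag large residuals as pathology candidates. The paper's full model adds the total-variation and sparse terms and alternates with registration, all of which are omitted here.

```python
# Toy sketch of the PCA part of the decomposition: model normal tissue with a
# PCA basis, then flag large residuals as candidate pathology / non-brain.
# The full method also uses a total-variation term, a sparse term, and
# alternating registration; those are omitted here.
import numpy as np

rng = np.random.default_rng(0)
normals = rng.normal(size=(40, 1024))          # 40 "normal" images, flattened
mean = normals.mean(axis=0)
U, s, Vt = np.linalg.svd(normals - mean, full_matrices=False)
basis = Vt[:10]                                 # top-10 principal components

test = normals[0] + 3.0 * (rng.random(1024) > 0.99)   # inject a "lesion"
coeffs = basis @ (test - mean)
quasi_normal = mean + basis.T @ coeffs          # reconstruction in atlas space
residual = np.abs(test - quasi_normal)
pathology_mask = residual > 3 * residual.std()  # crude pathology candidates
print(pathology_mask.sum(), "voxels flagged")
```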

Deep Image Harmonization

Title Deep Image Harmonization
Authors Yi-Hsuan Tsai, Xiaohui Shen, Zhe Lin, Kalyan Sunkavalli, Xin Lu, Ming-Hsuan Yang
Abstract Compositing is one of the most common operations in photo editing. To generate realistic composites, the appearances of foreground and background need to be adjusted to make them compatible. Previous approaches to harmonize composites have focused on learning statistical relationships between hand-crafted appearance features of the foreground and background, which is unreliable especially when the contents in the two layers are vastly different. In this work, we propose an end-to-end deep convolutional neural network for image harmonization, which can capture both the context and semantic information of the composite images during harmonization. We also introduce an efficient way to collect large-scale and high-quality training data that can facilitate the training process. Experiments on the synthesized dataset and real composite images show that the proposed network outperforms previous state-of-the-art methods.
Tasks
Published 2017-02-28
URL http://arxiv.org/abs/1703.00069v1
PDF http://arxiv.org/pdf/1703.00069v1.pdf
PWC https://paperswithcode.com/paper/deep-image-harmonization
Repo https://github.com/wasidennis/DeepHarmonization
Framework caffe2
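
A minimal PyTorch sketch of the input conditioning described above: the network sees the composite image concatenated with the foreground mask (4 input channels) and regresses the harmonized image. The layer widths and depths are illustrative, not the paper's architecture (the released code is Caffe).

```python
# Illustrative encoder-decoder conditioned on the composite + foreground mask.
import torch
import torch.nn as nn

class HarmonizationNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(4, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1),
        )

    def forward(self, composite, mask):
        x = torch.cat([composite, mask], dim=1)   # condition on the mask
        return self.decoder(self.encoder(x))

net = HarmonizationNet()
out = net(torch.rand(1, 3, 64, 64), torch.rand(1, 1, 64, 64))
print(out.shape)  # torch.Size([1, 3, 64, 64])
```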

Semi-supervised sequence tagging with bidirectional language models

Title Semi-supervised sequence tagging with bidirectional language models
Authors Matthew E. Peters, Waleed Ammar, Chandra Bhagavatula, Russell Power
Abstract Pre-trained word embeddings learned from unlabeled text have become a standard component of neural network architectures for NLP tasks. However, in most cases, the recurrent network that operates on word-level representations to produce context-sensitive representations is trained on relatively little labeled data. In this paper, we demonstrate a general semi-supervised approach for adding pre-trained context embeddings from bidirectional language models to NLP systems and apply it to sequence labeling tasks. We evaluate our model on two standard datasets for named entity recognition (NER) and chunking, and in both cases achieve state-of-the-art results, surpassing previous systems that use other forms of transfer or joint learning with additional labeled data and task-specific gazetteers.
Tasks Chunking, Named Entity Recognition
Published 2017-04-29
URL http://arxiv.org/abs/1705.00108v1
PDF http://arxiv.org/pdf/1705.00108v1.pdf
PWC https://paperswithcode.com/paper/semi-supervised-sequence-tagging-with
Repo https://github.com/flairNLP/flair
Framework pytorch
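
The core idea (concatenating frozen bidirectional-LM hidden states with token embeddings before the supervised tagging RNN) can be sketched in a few lines of PyTorch. Everything below is a stand-in: the paper uses a large pretrained LM and a CRF output layer, neither of which is reproduced here.

```python
# Conceptual sketch of the paper's core idea: feed "context embeddings" from
# a frozen bidirectional LM, concatenated with token embeddings, into the
# supervised tagger. The tiny LSTM here is a stand-in for a pretrained biLM.
import torch
import torch.nn as nn

vocab, emb_dim, lm_dim, hid, n_tags = 1000, 50, 64, 100, 9

word_emb = nn.Embedding(vocab, emb_dim)
frozen_lm = nn.LSTM(emb_dim, lm_dim, bidirectional=True, batch_first=True)
for p in frozen_lm.parameters():      # LM is pretrained and not fine-tuned
    p.requires_grad = False
tagger_rnn = nn.LSTM(emb_dim + 2 * lm_dim, hid, bidirectional=True,
                     batch_first=True)
tag_proj = nn.Linear(2 * hid, n_tags)

tokens = torch.randint(0, vocab, (4, 12))          # batch of token ids
e = word_emb(tokens)
lm_states, _ = frozen_lm(e)                        # contextual LM embeddings
h, _ = tagger_rnn(torch.cat([e, lm_states], dim=-1))
tag_scores = tag_proj(h)                           # (4, 12, n_tags), e.g. NER
print(tag_scores.shape)
```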

Implicit Weight Uncertainty in Neural Networks

Title Implicit Weight Uncertainty in Neural Networks
Authors Nick Pawlowski, Andrew Brock, Matthew C. H. Lee, Martin Rajchl, Ben Glocker
Abstract Modern neural networks tend to be overconfident on unseen, noisy or incorrectly labelled data and do not produce meaningful uncertainty measures. Bayesian deep learning aims to address this shortcoming with variational approximations (such as Bayes by Backprop or Multiplicative Normalising Flows). However, current approaches have limitations regarding flexibility and scalability. We introduce Bayes by Hypernet (BbH), a new method of variational approximation that interprets hypernetworks as implicit distributions. It naturally uses neural networks to model arbitrarily complex distributions and scales to modern deep learning architectures. In our experiments, we demonstrate that our method achieves competitive accuracies and predictive uncertainties on MNIST and a CIFAR5 task, while being the most robust against adversarial attacks.
Tasks Normalising Flows
Published 2017-11-03
URL http://arxiv.org/abs/1711.01297v2
PDF http://arxiv.org/pdf/1711.01297v2.pdf
PWC https://paperswithcode.com/paper/implicit-weight-uncertainty-in-neural
Repo https://github.com/Aiqz/bayes-by-hypernet
Framework tf
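
A hedged PyTorch sketch of the central construction: a hypernetwork maps a noise sample to the weights of a target layer, so each forward pass draws from an implicit weight distribution and repeated passes give predictive uncertainty. All sizes are illustrative.

```python
# Bayes-by-Hypernet sketch: sample z, let a hypernetwork emit layer weights.
import torch
import torch.nn as nn

in_f, out_f, z_dim = 10, 5, 8
hypernet = nn.Sequential(nn.Linear(z_dim, 64), nn.ReLU(),
                         nn.Linear(64, out_f * in_f + out_f))

def sample_forward(x):
    z = torch.randn(z_dim)                 # one weight sample per forward pass
    theta = hypernet(z)
    W = theta[:out_f * in_f].view(out_f, in_f)
    b = theta[out_f * in_f:]
    return torch.nn.functional.linear(x, W, b)

x = torch.rand(3, in_f)
preds = torch.stack([sample_forward(x) for _ in range(20)])
print(preds.mean(0).shape, preds.std(0).mean())  # predictive mean, uncertainty
```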

Total-Text: A Comprehensive Dataset for Scene Text Detection and Recognition

Title Total-Text: A Comprehensive Dataset for Scene Text Detection and Recognition
Authors Chee Kheng Chng, Chee Seng Chan
Abstract Text in curve orientation, despite being one of the common text orientations in real-world environments, has close to zero presence in well-received scene text datasets such as ICDAR2013 and MSRA-TD500. The main motivation of Total-Text is to fill this gap and facilitate a new research direction for the scene text community. On top of conventional horizontal and multi-oriented text, it features curved text. Total-Text is highly diversified in orientations; more than half of its images have a combination of more than two orientations. Recently, a new breed of solutions that cast text detection as a segmentation problem has demonstrated its effectiveness against multi-oriented text. In order to evaluate its robustness against curved text, we fine-tuned DeconvNet and benchmarked it on Total-Text. Total-Text with its annotation is available at https://github.com/cs-chan/Total-Text-Dataset
Tasks Curved Text Detection, Scene Text Detection, Scene Text Recognition
Published 2017-10-28
URL http://arxiv.org/abs/1710.10400v1
PDF http://arxiv.org/pdf/1710.10400v1.pdf
PWC https://paperswithcode.com/paper/total-text-a-comprehensive-dataset-for-scene
Repo https://github.com/cs-chan/Total-Text-Dataset
Framework none

The Impact of Random Models on Clustering Similarity

Title The Impact of Random Models on Clustering Similarity
Authors Alexander J Gates, Yong-Yeol Ahn
Abstract Clustering is a central approach for unsupervised learning. After clustering is applied, the most fundamental analysis is to quantitatively compare clusterings. Such comparisons are crucial for the evaluation of clustering methods as well as other tasks such as consensus clustering. It is often argued that, in order to establish a baseline, clustering similarity should be assessed in the context of a random ensemble of clusterings. The prevailing assumption for the random clustering ensemble is the permutation model in which the number and sizes of clusters are fixed. However, this assumption does not necessarily hold in practice; for example, multiple runs of K-means clustering return clusterings with a fixed number of clusters, while the cluster size distribution varies greatly. Here, we derive corrected variants of two clustering similarity measures (the Rand index and Mutual Information) in the context of two random clustering ensembles in which the number and sizes of clusters vary. In addition, we study the impact of one-sided comparisons in the scenario with a reference clustering. The consequences of different random models are illustrated using synthetic examples, handwriting recognition, and gene expression data. We demonstrate that the choice of random model can have a drastic impact on the ranking of similar clustering pairs, and the evaluation of a clustering method with respect to a random baseline; thus, the choice of random clustering model should be carefully justified.
Tasks
Published 2017-01-23
URL http://arxiv.org/abs/1701.06508v2
PDF http://arxiv.org/pdf/1701.06508v2.pdf
PWC https://paperswithcode.com/paper/the-impact-of-random-models-on-clustering
Repo https://github.com/ajgates42/clusim
Framework none
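
The practical point is that the usual "adjusted" similarity scores bake in the permutation model. A quick scikit-learn illustration follows; the paper's corrected measures for ensembles with varying cluster numbers and sizes are implemented in the authors' clusim package linked above.

```python
# The adjusted Rand index corrects against the permutation-model baseline
# only; the paper derives corrections for other random clustering ensembles.
from sklearn.metrics import rand_score, adjusted_rand_score

a = [0, 0, 1, 1, 2, 2]
b = [0, 0, 1, 2, 2, 2]
print(rand_score(a, b))           # raw agreement, no random baseline
print(adjusted_rand_score(a, b))  # corrected under the permutation model
```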

The Monkeytyping Solution to the YouTube-8M Video Understanding Challenge

Title The Monkeytyping Solution to the YouTube-8M Video Understanding Challenge
Authors He-Da Wang, Teng Zhang, Ji Wu
Abstract This article describes the final solution of team monkeytyping, which finished in second place in the YouTube-8M video understanding challenge. The dataset used in this challenge is a large-scale benchmark for multi-label video classification. We extend the work in [1] and propose several improvements for frame sequence modeling. We propose a network structure called Chaining that can better capture the interactions between labels. Also, we report our approaches in dealing with multi-scale information and attention pooling. In addition, we find that using the output of model ensemble as a side target in training can boost single model performance. We report our experiments in bagging, boosting, cascade, and stacking, and propose a stacking algorithm called attention weighted stacking. Our final submission is an ensemble that consists of 74 sub models, all of which are listed in the appendix.
Tasks Video Classification, Video Understanding
Published 2017-06-16
URL http://arxiv.org/abs/1706.05150v1
PDF http://arxiv.org/pdf/1706.05150v1.pdf
PWC https://paperswithcode.com/paper/the-monkeytyping-solution-to-the-youtube-8m
Repo https://github.com/wangheda/youtube-8m
Framework tf
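
"Attention weighted stacking" is not specified in the abstract, so the following PyTorch sketch is only one plausible reading of the name: a small learned module assigns per-example weights to each sub-model's class predictions and mixes them. The paper's exact parameterization may differ.

```python
# One plausible reading of attention weighted stacking: learn per-example
# softmax weights over sub-model predictions and mix them.
import torch
import torch.nn as nn

n_models, n_classes = 4, 10
attn = nn.Linear(n_models * n_classes, n_models)

def stack(preds):                       # preds: (batch, n_models, n_classes)
    flat = preds.flatten(1)
    w = torch.softmax(attn(flat), dim=-1)        # per-example model weights
    return (w.unsqueeze(-1) * preds).sum(dim=1)  # weighted mixture

preds = torch.rand(8, n_models, n_classes)
print(stack(preds).shape)  # torch.Size([8, 10])
```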

Proximal Policy Optimization Algorithms

Title Proximal Policy Optimization Algorithms
Authors John Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford, Oleg Klimov
Abstract We propose a new family of policy gradient methods for reinforcement learning, which alternate between sampling data through interaction with the environment, and optimizing a “surrogate” objective function using stochastic gradient ascent. Whereas standard policy gradient methods perform one gradient update per data sample, we propose a novel objective function that enables multiple epochs of minibatch updates. The new methods, which we call proximal policy optimization (PPO), have some of the benefits of trust region policy optimization (TRPO), but they are much simpler to implement, more general, and have better sample complexity (empirically). Our experiments test PPO on a collection of benchmark tasks, including simulated robotic locomotion and Atari game playing, and we show that PPO outperforms other online policy gradient methods, and overall strikes a favorable balance between sample complexity, simplicity, and wall-time.
Tasks Dota 2, Policy Gradient Methods
Published 2017-07-20
URL http://arxiv.org/abs/1707.06347v2
PDF http://arxiv.org/pdf/1707.06347v2.pdf
PWC https://paperswithcode.com/paper/proximal-policy-optimization-algorithms
Repo https://github.com/clwainwright/proximal_policy_optimization
Framework tf
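
The heart of PPO is the clipped surrogate objective from the paper, which is what makes multiple minibatch epochs per batch of sampled data safe. A minimal PyTorch rendering:

```python
# PPO's clipped surrogate objective: limit how far the new policy's
# probability ratio can move the objective away from the old policy.
import torch

def ppo_clip_loss(logp_new, logp_old, advantages, eps=0.2):
    ratio = torch.exp(logp_new - logp_old)          # pi_new / pi_old
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1 - eps, 1 + eps) * advantages
    return -torch.min(unclipped, clipped).mean()    # maximize the surrogate

logp_new = torch.randn(32, requires_grad=True)
loss = ppo_clip_loss(logp_new, torch.randn(32), torch.randn(32))
loss.backward()
```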

Scalable Log Determinants for Gaussian Process Kernel Learning

Title Scalable Log Determinants for Gaussian Process Kernel Learning
Authors Kun Dong, David Eriksson, Hannes Nickisch, David Bindel, Andrew Gordon Wilson
Abstract For applications as varied as Bayesian neural networks, determinantal point processes, elliptical graphical models, and kernel learning for Gaussian processes (GPs), one must compute a log determinant of an $n \times n$ positive definite matrix, and its derivatives - leading to prohibitive $\mathcal{O}(n^3)$ computations. We propose novel $\mathcal{O}(n)$ approaches to estimating these quantities from only fast matrix vector multiplications (MVMs). These stochastic approximations are based on Chebyshev, Lanczos, and surrogate models, and converge quickly even for kernel matrices that have challenging spectra. We leverage these approximations to develop a scalable Gaussian process approach to kernel learning. We find that Lanczos is generally superior to Chebyshev for kernel learning, and that a surrogate approach can be highly efficient and accurate with popular kernels.
Tasks Gaussian Processes, Point Processes
Published 2017-11-09
URL http://arxiv.org/abs/1711.03481v1
PDF http://arxiv.org/pdf/1711.03481v1.pdf
PWC https://paperswithcode.com/paper/scalable-log-determinants-for-gaussian
Repo https://github.com/ericlee0803/GP_Derivatives
Framework none
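
One of the estimators the paper builds on (Hutchinson trace probes plus a Chebyshev approximation of log; the paper finds Lanczos generally better for kernel learning) can be sketched with numpy. The eigenvalue bounds lmin/lmax must be supplied by the caller, and the dense identity below is only for brevity; in a true MVM-only setting the shifted operator would be applied implicitly.

```python
# Stochastic Chebyshev estimator: log det(A) = tr log(A) ~ mean of
# Rademacher quadratic forms v^T p(B) v, where p approximates log on the
# rescaled spectrum and only matrix-vector products are needed.
import numpy as np

def logdet_chebyshev(A, lmin, lmax, deg=40, n_probes=50, seed=0):
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    # Chebyshev coefficients of log mapped back from [-1, 1] to [lmin, lmax]
    c = np.polynomial.chebyshev.chebinterpolate(
        lambda x: np.log((lmax - lmin) / 2 * x + (lmax + lmin) / 2), deg)
    B = (2 * A - (lmax + lmin) * np.eye(n)) / (lmax - lmin)
    total = 0.0
    for _ in range(n_probes):
        v = rng.choice([-1.0, 1.0], size=n)        # Rademacher probe
        t_prev, t_cur = v, B @ v                   # Chebyshev recurrence
        acc = c[0] * (v @ t_prev) + c[1] * (v @ t_cur)
        for k in range(2, deg + 1):
            t_prev, t_cur = t_cur, 2 * (B @ t_cur) - t_prev
            acc += c[k] * (v @ t_cur)
        total += acc
    return total / n_probes

# quick check on a small SPD matrix
M = np.random.default_rng(1).normal(size=(50, 50))
A = M @ M.T + 50 * np.eye(50)
w = np.linalg.eigvalsh(A)
print(logdet_chebyshev(A, w[0], w[-1]), np.linalg.slogdet(A)[1])
```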

Unsupervised Image-to-Image Translation Networks

Title Unsupervised Image-to-Image Translation Networks
Authors Ming-Yu Liu, Thomas Breuel, Jan Kautz
Abstract Unsupervised image-to-image translation aims at learning a joint distribution of images in different domains by using images from the marginal distributions in individual domains. Since there exists an infinite set of joint distributions that can give rise to the given marginal distributions, one could infer nothing about the joint distribution from the marginal distributions without additional assumptions. To address the problem, we make a shared-latent space assumption and propose an unsupervised image-to-image translation framework based on Coupled GANs. We compare the proposed framework with competing approaches and present high quality image translation results on various challenging unsupervised image translation tasks, including street scene image translation, animal image translation, and face image translation. We also apply the proposed framework to domain adaptation and achieve state-of-the-art performance on benchmark datasets. Code and additional results are available at https://github.com/mingyuliutw/unit .
Tasks Domain Adaptation, Image-to-Image Translation, Multimodal Unsupervised Image-To-Image Translation, Unsupervised Image-To-Image Translation
Published 2017-03-02
URL http://arxiv.org/abs/1703.00848v6
PDF http://arxiv.org/pdf/1703.00848v6.pdf
PWC https://paperswithcode.com/paper/unsupervised-image-to-image-translation
Repo https://github.com/taki0112/UNIT-Tensorflow
Framework tf
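
The shared-latent-space assumption reduces to: encode in one domain, decode in the other. A deliberately tiny PyTorch sketch of that data flow follows; the actual model couples VAEs and GANs with weight sharing, and all adversarial and reconstruction losses are omitted here.

```python
# Shared-latent translation sketch: x1 -> z (shared) -> x2.
import torch
import torch.nn as nn

z_dim = 16
E1 = nn.Linear(32, z_dim)   # encoder for domain 1
E2 = nn.Linear(32, z_dim)   # encoder for domain 2
G1 = nn.Linear(z_dim, 32)   # generator back into domain 1
G2 = nn.Linear(z_dim, 32)   # generator back into domain 2

x1 = torch.rand(4, 32)
x1_to_2 = G2(E1(x1))        # translate domain 1 -> 2 through shared z
x1_cycle = G1(E2(x1_to_2))  # map back; training encourages x1_cycle ~ x1
print(x1_to_2.shape, x1_cycle.shape)
```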

AutoEncoder Inspired Unsupervised Feature Selection

Title AutoEncoder Inspired Unsupervised Feature Selection
Authors Kai Han, Yunhe Wang, Chao Zhang, Chao Li, Chao Xu
Abstract High-dimensional data in many areas, such as computer vision and machine learning, brings computational and analytical difficulties. Feature selection, which selects a subset of the observed features, is a widely used approach for improving the performance and effectiveness of machine learning models with high-dimensional data. In this paper, we propose a novel AutoEncoder Feature Selector (AEFS) for unsupervised feature selection which combines autoencoder regression and group lasso tasks. Compared to traditional feature selection methods, AEFS can select the most important features by exploiting both linear and nonlinear information among features, which is more flexible than the conventional self-representation method for unsupervised feature selection with only linear assumptions. Experimental results on benchmark datasets show that the proposed method is superior to state-of-the-art methods.
Tasks Feature Selection
Published 2017-10-23
URL http://arxiv.org/abs/1710.08310v3
PDF http://arxiv.org/pdf/1710.08310v3.pdf
PWC https://paperswithcode.com/paper/autoencoder-inspired-unsupervised-feature
Repo https://github.com/NoahLuffy/AEFS
Framework none
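
The AEFS objective combines autoencoder reconstruction with a group-lasso (L2,1) penalty on the first encoder layer, so whole input features can be driven to zero and the surviving weight norms rank the features. A runnable PyTorch sketch on random data, with an illustrative penalty weight:

```python
# AEFS-style objective: reconstruction loss + L2,1 penalty over the columns
# of the first encoder weight (one column per input feature).
import torch
import torch.nn as nn

n_features, hidden = 20, 8
enc = nn.Linear(n_features, hidden)
dec = nn.Linear(hidden, n_features)
opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-2)

X = torch.rand(256, n_features)
for _ in range(200):
    recon = dec(torch.relu(enc(X)))
    group_lasso = enc.weight.norm(dim=0).sum()   # L2,1 over input features
    loss = nn.functional.mse_loss(recon, X) + 0.01 * group_lasso
    opt.zero_grad(); loss.backward(); opt.step()

scores = enc.weight.norm(dim=0)                  # importance per input feature
print(scores.topk(5).indices)                    # selected features
```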

DLTK: State of the Art Reference Implementations for Deep Learning on Medical Images

Title DLTK: State of the Art Reference Implementations for Deep Learning on Medical Images
Authors Nick Pawlowski, Sofia Ira Ktena, Matthew C. H. Lee, Bernhard Kainz, Daniel Rueckert, Ben Glocker, Martin Rajchl
Abstract We present DLTK, a toolkit providing baseline implementations for efficient experimentation with deep learning methods on biomedical images. It builds on top of TensorFlow and its high modularity and easy-to-use examples allow for a low-threshold access to state-of-the-art implementations for typical medical imaging problems. A comparison of DLTK’s reference implementations of popular network architectures for image segmentation demonstrates new top performance on the publicly available challenge data “Multi-Atlas Labeling Beyond the Cranial Vault”. The average test Dice similarity coefficient of $81.5$ exceeds the previously best performing CNN ($75.7$) and the accuracy of the challenge winning method ($79.0$).
Tasks Semantic Segmentation
Published 2017-11-18
URL http://arxiv.org/abs/1711.06853v1
PDF http://arxiv.org/pdf/1711.06853v1.pdf
PWC https://paperswithcode.com/paper/dltk-state-of-the-art-reference
Repo https://github.com/DLTK/DLTK
Framework tf

Lip2AudSpec: Speech reconstruction from silent lip movements video

Title Lip2AudSpec: Speech reconstruction from silent lip movements video
Authors Hassan Akbari, Himani Arora, Liangliang Cao, Nima Mesgarani
Abstract In this study, we propose a deep neural network for reconstructing intelligible speech from silent lip-movement videos. We use the auditory spectrogram as the spectral representation of speech, together with its corresponding sound-generation method, resulting in more natural-sounding reconstructed speech. Our proposed network consists of an autoencoder that extracts bottleneck features from the auditory spectrogram, which are then used as the target for our main lip-reading network comprising CNN, LSTM, and fully connected layers. Our experiments show that the autoencoder is able to reconstruct the original auditory spectrogram with 98% correlation and also improves the quality of speech reconstructed by the main lip-reading network. Our model, trained jointly on different speakers, is able to extract individual speaker characteristics and gives promising results for reconstructing intelligible speech with superior word recognition accuracy.
Tasks
Published 2017-10-26
URL http://arxiv.org/abs/1710.09798v1
PDF http://arxiv.org/pdf/1710.09798v1.pdf
PWC https://paperswithcode.com/paper/lip2audspec-speech-reconstruction-from-silent
Repo https://github.com/hassanhub/LipReading
Framework tf
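
The two-stage setup reads as: train an autoencoder on auditory spectrograms, then regress its bottleneck codes from video. The PyTorch sketch below compresses both networks to single-layer stand-ins; the paper's lip-reading network is a CNN+LSTM and its decoder reconstructs audible speech from the spectrogram.

```python
# Stand-in for the two-stage Lip2AudSpec setup: autoencoder bottleneck
# features become the regression target for the lip-reading network.
import torch
import torch.nn as nn

spec_dim, bottleneck = 128, 32
ae_enc = nn.Sequential(nn.Linear(spec_dim, bottleneck), nn.ReLU())
ae_dec = nn.Linear(bottleneck, spec_dim)

lip_net = nn.Sequential(                       # stand-in for the CNN+LSTM
    nn.Flatten(), nn.Linear(16 * 16, 64), nn.ReLU(), nn.Linear(64, bottleneck))

spec = torch.rand(8, spec_dim)                 # auditory spectrogram frames
frames = torch.rand(8, 1, 16, 16)              # matching mouth-region frames

target = ae_enc(spec).detach()                 # bottleneck features as target
loss = nn.functional.mse_loss(lip_net(frames), target)
speech = ae_dec(lip_net(frames))               # decode back to a spectrogram
print(loss.item(), speech.shape)
```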

Location Name Extraction from Targeted Text Streams using Gazetteer-based Statistical Language Models

Title Location Name Extraction from Targeted Text Streams using Gazetteer-based Statistical Language Models
Authors Hussein S. Al-Olimat, Krishnaprasad Thirunarayan, Valerie Shalin, Amit Sheth
Abstract Extracting location names from informal and unstructured social media data requires the identification of referent boundaries and partitioning compound names. Variability, particularly systematic variability in location names (Carroll, 1983), challenges the identification task. Some of this variability can be anticipated as operations within a statistical language model, in this case drawn from gazetteers such as OpenStreetMap (OSM), Geonames, and DBpedia. This permits evaluation of an observed n-gram in Twitter targeted text as a legitimate location name variant from the same location-context. Using n-gram statistics and location-related dictionaries, our Location Name Extraction tool (LNEx) handles abbreviations and automatically filters and augments the location names in gazetteers (handling name contractions and auxiliary contents) to help detect the boundaries of multi-word location names and thereby delimit them in texts. We evaluated our approach on 4,500 event-specific tweets from three targeted streams to compare the performance of LNEx against that of ten state-of-the-art taggers that rely on standard semantic, syntactic and/or orthographic features. LNEx improved the average F-Score by 33-179%, outperforming all taggers. Further, LNEx is capable of stream processing.
Tasks Language Modelling
Published 2017-08-10
URL http://arxiv.org/abs/1708.03105v2
PDF http://arxiv.org/pdf/1708.03105v2.pdf
PWC https://paperswithcode.com/paper/location-name-extraction-from-targeted-text
Repo https://github.com/halolimat/LNEx
Framework none
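
Stripped of the statistical modeling, the gazetteer-driven matching can be illustrated with a greedy longest-n-gram scan. The toy gazetteer below is hypothetical, and the real LNEx additionally filters and augments gazetteer entries and handles abbreviations and contractions:

```python
# Toy longest-match n-gram scan against a (hypothetical) gazetteer.
gazetteer = {"new york", "new york city", "brooklyn bridge", "houston"}

def extract_locations(text, max_n=3):
    tokens = text.lower().split()
    spans, i = [], 0
    while i < len(tokens):
        for n in range(min(max_n, len(tokens) - i), 0, -1):  # longest first
            cand = " ".join(tokens[i:i + n])
            if cand in gazetteer:
                spans.append(cand)
                i += n
                break
        else:
            i += 1
    return spans

print(extract_locations("flooding near brooklyn bridge and new york city"))
# ['brooklyn bridge', 'new york city']
```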

A Chinese Dataset with Negative Full Forms for General Abbreviation Prediction

Title A Chinese Dataset with Negative Full Forms for General Abbreviation Prediction
Authors Yi Zhang, Xu Sun
Abstract Abbreviation is a common phenomenon across languages, especially in Chinese. In most cases, if an expression can be abbreviated, its abbreviation is used more often than its fully expanded forms, since people tend to convey information in a most concise way. For various language processing tasks, abbreviation is an obstacle to improving the performance, as the textual form of an abbreviation does not express useful information, unless it’s expanded to the full form. Abbreviation prediction means associating the fully expanded forms with their abbreviations. However, due to the deficiency in the abbreviation corpora, such a task is limited in current studies, especially considering general abbreviation prediction should also include those full form expressions that do not have valid abbreviations, namely the negative full forms (NFFs). Corpora incorporating negative full forms for general abbreviation prediction are few in number. In order to promote the research in this area, we build a dataset for general Chinese abbreviation prediction, which needs a few preprocessing steps, and evaluate several different models on the built dataset. The dataset is available at https://github.com/lancopku/Chinese-abbreviation-dataset
Tasks
Published 2017-12-18
URL http://arxiv.org/abs/1712.06289v1
PDF http://arxiv.org/pdf/1712.06289v1.pdf
PWC https://paperswithcode.com/paper/a-chinese-dataset-with-negative-full-forms
Repo https://github.com/lancopku/Chinese-abbreviation-dataset
Framework none