October 17, 2019


Paper Group ANR 717

Unsupervised Image Decomposition in Vector Layers. Efficient Traffic-Sign Recognition with Scale-aware CNN. Features of word similarity. Stochastic model-based minimization of weakly convex functions. Hierarchical multi-class segmentation of glioma images using networks with multi-level activation function. Sequential Embedding Induced Text Cluster …

Unsupervised Image Decomposition in Vector Layers

Title Unsupervised Image Decomposition in Vector Layers
Authors Othman Sbai, Camille Couprie, Mathieu Aubry
Abstract Deep image generation is becoming a tool to enhance artists' and designers' creative potential. In this paper, we aim at making the generation process more structured and easier to interact with. Inspired by vector graphics systems, we propose a new deep image reconstruction paradigm where the outputs are composed from simple layers, defined by their color and a vector transparency mask. This presents a number of advantages compared to the commonly used convolutional network architectures. In particular, our layered decomposition allows simple user interaction, for example to update a given mask or change the color of a selected layer. From a compact code, our architecture also generates vector images with a virtually infinite resolution, the color at each point in an image being a parametric function of its coordinates. We validate the efficiency of our approach by comparing reconstructions with state-of-the-art baselines given similar memory resources on the CelebA and ImageNet datasets. Most importantly, we demonstrate several applications of our new image representation obtained in an unsupervised manner, including editing, vectorization and image search.
Tasks Image Generation, Image Reconstruction, Image Retrieval
Published 2018-12-13
URL https://arxiv.org/abs/1812.05484v2
PDF https://arxiv.org/pdf/1812.05484v2.pdf
PWC https://paperswithcode.com/paper/vector-image-generation-by-learning
Repo
Framework
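
The layered reconstruction described in this entry can be pictured with a toy compositing routine: each layer contributes a single RGB color and a transparency mask that is a parametric function of pixel coordinates, so the same layers can be rendered at any resolution and edited independently. The Gaussian-shaped mask below is an illustrative assumption, not the paper's actual mask model.

```python
import numpy as np

def gaussian_mask(h, w, cx, cy, sigma):
    """Toy parametric transparency mask: a value in [0, 1] at every (x, y).
    Because it is a function of continuous coordinates, it can be sampled
    at any resolution ("virtually infinite" resolution)."""
    ys, xs = np.mgrid[0:h, 0:w]
    xs = xs / (w - 1)   # normalize coordinates to [0, 1]
    ys = ys / (h - 1)
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * sigma ** 2))

def composite(layers, h, w):
    """Alpha-composite layers back-to-front onto a black canvas.
    Each layer is (rgb_color, mask_params)."""
    canvas = np.zeros((h, w, 3))
    for color, (cx, cy, sigma) in layers:
        alpha = gaussian_mask(h, w, cx, cy, sigma)[..., None]   # (h, w, 1)
        canvas = (1 - alpha) * canvas + alpha * np.asarray(color)
    return canvas

# Three layers, editable independently (change a color or a mask parameter).
layers = [
    ((0.9, 0.2, 0.2), (0.3, 0.3, 0.20)),
    ((0.2, 0.8, 0.3), (0.7, 0.5, 0.15)),
    ((0.2, 0.3, 0.9), (0.5, 0.8, 0.10)),
]
image_64 = composite(layers, 64, 64)     # low-resolution render
image_512 = composite(layers, 512, 512)  # same layers, higher resolution
```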

Efficient Traffic-Sign Recognition with Scale-aware CNN

Title Efficient Traffic-Sign Recognition with Scale-aware CNN
Authors Yuchen Yang, Shuo Liu, Wei Ma, Qiuyuan Wang, Zheng Liu
Abstract The paper presents a Traffic Sign Recognition (TSR) system, which can quickly and accurately recognize traffic signs of different sizes in images. The system consists of two well-designed Convolutional Neural Networks (CNNs), one for region proposals of traffic signs and one for classification of each region. In the proposal CNN, a Fully Convolutional Network (FCN) with a dual multi-scale architecture is proposed to achieve scale-invariant detection. In training the proposal network, a modified “Online Hard Example Mining” (OHEM) scheme is adopted to suppress false positives. The classification network fuses multi-scale features as its representation and adopts an “Inception” module for efficiency. We evaluate the proposed TSR system and its components with extensive experiments. Our method obtains 99.88% precision and 96.61% recall on the Swedish Traffic Signs Dataset (STSD), higher than state-of-the-art methods. Besides, our system is faster and more lightweight than state-of-the-art deep learning networks for traffic sign recognition.
Tasks Traffic Sign Recognition
Published 2018-05-31
URL http://arxiv.org/abs/1805.12289v1
PDF http://arxiv.org/pdf/1805.12289v1.pdf
PWC https://paperswithcode.com/paper/efficient-traffic-sign-recognition-with-scale
Repo
Framework
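
The hard-example mining idea mentioned in the abstract can be sketched independently of the detector: score all candidate negative proposals with the current network, then keep only the highest-loss ones for the backward pass. The loss values and the 3:1 negative-to-positive ratio below are illustrative assumptions, not the paper's exact settings.

```python
import numpy as np

def ohem_select(neg_losses, num_pos, neg_pos_ratio=3):
    """Online Hard Example Mining: keep only the hardest negatives.

    neg_losses    -- per-candidate loss of every negative region proposal
    num_pos       -- number of positive proposals in the mini-batch
    neg_pos_ratio -- negatives kept per positive (assumed 3:1 here)
    Returns indices of the selected hard negatives.
    """
    num_keep = min(len(neg_losses), neg_pos_ratio * num_pos)
    order = np.argsort(neg_losses)[::-1]      # highest loss first
    return order[:num_keep]

# Toy usage: 8 negative proposals, 2 positives -> keep the 6 hardest negatives.
rng = np.random.default_rng(0)
losses = rng.random(8)
hard_idx = ohem_select(losses, num_pos=2)
print(hard_idx, losses[hard_idx])
```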

Features of word similarity

Title Features of word similarity
Authors Arthur M. Jacobs, Annette Kinder
Abstract In this theoretical note we compare different types of computational models of word similarity and association in their ability to predict a set of about 900 rating data. Using regression and predictive modeling tools (neural net, decision tree), the performance of a total of 28 models using different combinations of both surface and semantic word features is evaluated. The results present evidence for the hypothesis that word similarity ratings are based on more than semantic relatedness alone. The limited cross-validated performance of the models calls for the development of psychological process models of the word similarity rating task.
Tasks
Published 2018-08-24
URL http://arxiv.org/abs/1808.07999v1
PDF http://arxiv.org/pdf/1808.07999v1.pdf
PWC https://paperswithcode.com/paper/features-of-word-similarity
Repo
Framework
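
As a rough illustration of combining surface and semantic word features to predict similarity ratings, the sketch below builds a small feature vector per word pair (orthographic letter overlap plus embedding cosine) and fits a ridge regression to the ratings. The two features, the toy embeddings and the ratings are assumptions for illustration; the paper evaluates 28 different feature combinations and model types.

```python
import numpy as np

def surface_overlap(w1, w2):
    """Simple orthographic feature: shared-letter overlap (Dice coefficient)."""
    s1, s2 = set(w1), set(w2)
    return 2 * len(s1 & s2) / (len(s1) + len(s2))

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def pair_features(w1, w2, emb):
    """Combine one surface and one semantic feature per word pair."""
    return np.array([surface_overlap(w1, w2), cosine(emb[w1], emb[w2])])

# Toy embeddings and ratings (placeholders for real vectors and human ratings).
rng = np.random.default_rng(0)
vocab = ["cat", "dog", "car", "cab", "canine"]
emb = {w: rng.normal(size=50) for w in vocab}
pairs = [("cat", "dog"), ("car", "cab"), ("dog", "canine"), ("cat", "car")]
ratings = np.array([4.5, 3.0, 4.8, 1.5])

X = np.stack([pair_features(a, b, emb) for a, b in pairs])
X = np.hstack([X, np.ones((len(X), 1))])          # bias term
lam = 0.1                                          # ridge penalty
w = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ ratings)
print("predicted ratings:", X @ w)
```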

Stochastic model-based minimization of weakly convex functions

Title Stochastic model-based minimization of weakly convex functions
Authors Damek Davis, Dmitriy Drusvyatskiy
Abstract We consider a family of algorithms that successively sample and minimize simple stochastic models of the objective function. We show that under reasonable conditions on approximation quality and regularity of the models, any such algorithm drives a natural stationarity measure to zero at the rate $O(k^{-1/4})$. As a consequence, we obtain the first complexity guarantees for the stochastic proximal point, proximal subgradient, and regularized Gauss-Newton methods for minimizing compositions of convex functions with smooth maps. The guiding principle, underlying the complexity guarantees, is that all algorithms under consideration can be interpreted as approximate descent methods on an implicit smoothing of the problem, given by the Moreau envelope. Specializing to classical circumstances, we obtain the long-sought convergence rate of the stochastic projected gradient method, without batching, for minimizing a smooth function on a closed convex set.
Tasks
Published 2018-03-17
URL http://arxiv.org/abs/1803.06523v3
PDF http://arxiv.org/pdf/1803.06523v3.pdf
PWC https://paperswithcode.com/paper/stochastic-model-based-minimization-of-weakly
Repo
Framework
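
One member of the algorithm family covered by this analysis is the projected stochastic subgradient method: at each step, minimize a linear model of the sampled loss plus a proximal quadratic, which reduces to a subgradient step followed by projection onto the constraint set. The sketch below instantiates it for least squares over the unit ball; the problem and the diminishing step-size schedule are illustrative choices, not the paper's.

```python
import numpy as np

def project_unit_ball(x):
    """Euclidean projection onto the closed convex set {x : ||x|| <= 1}."""
    n = np.linalg.norm(x)
    return x if n <= 1 else x / n

def stochastic_projected_subgradient(A, b, steps=2000, seed=0):
    """Minimize (1/2m)*||Ax - b||^2 over the unit ball, one sampled row per step.

    Each iteration minimizes the linear model
        f(x_k) + <g_k, x - x_k> + (1/(2*lr)) * ||x - x_k||^2
    over the constraint set, i.e. a projected subgradient step.
    """
    rng = np.random.default_rng(seed)
    m, d = A.shape
    x = np.zeros(d)
    for k in range(1, steps + 1):
        i = rng.integers(m)                      # sample one data point
        g = (A[i] @ x - b[i]) * A[i]             # stochastic (sub)gradient
        lr = 0.5 / np.sqrt(k)                    # diminishing step size
        x = project_unit_ball(x - lr * g)        # model-minimization step
    return x

rng = np.random.default_rng(1)
A = rng.normal(size=(200, 5))
b = A @ rng.normal(size=5) * 0.3 + 0.05 * rng.normal(size=200)
x_hat = stochastic_projected_subgradient(A, b)
print(x_hat, np.linalg.norm(x_hat))
```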

Hierarchical multi-class segmentation of glioma images using networks with multi-level activation function

Title Hierarchical multi-class segmentation of glioma images using networks with multi-level activation function
Authors Xiaobin Hu, Hongwei Li, Yu Zhao, Chao Dong, Bjoern H. Menze, Marie Piraud
Abstract For many segmentation tasks, especially in biomedical imaging, the topological prior is vital information that is useful to exploit. Containment/nesting is a typical inter-class geometric relationship. In the MICCAI brain tumor segmentation challenge, with its three hierarchically nested classes ‘whole tumor’, ‘tumor core’ and ‘active tumor’, this nesting relationship is introduced into a 3D residual U-Net architecture. The network comprises a context aggregation pathway, which encodes increasingly abstract representations of the input at greater depth, and a localization pathway, which recombines these representations with shallower features to precisely localize the region of interest. The nested-class prior is incorporated by proposing a multi-class activation function and its corresponding loss function. The model is trained on the BraTS 2018 training dataset, with 20% of the data held out as a validation set to determine parameters; once the parameters are fixed, the model is retrained on the whole training dataset. The performance achieved on the validation leaderboard is 86%, 77% and 72% Dice score for the whole tumor, enhancing tumor and tumor core classes, without relying on ensembles or complicated post-processing steps. Based on the same state-of-the-art network architecture, the accuracy on the nested class (enhancing tumor) improves from 69% to 72% compared with the traditional softmax-based method, which is blind to the topological prior.
Tasks Brain Tumor Segmentation
Published 2018-10-22
URL http://arxiv.org/abs/1810.09488v2
PDF http://arxiv.org/pdf/1810.09488v2.pdf
PWC https://paperswithcode.com/paper/hierarchical-multi-class-segmentation-of
Repo
Framework
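
The nested-class prior can be illustrated with a multi-level activation: a single output channel is squashed by a sum of shifted sigmoids, so the network predicts an ordered "level" (background, whole tumor, tumor core, enhancing tumor) and the nesting enhancing ⊂ core ⊂ whole holds by construction. The thresholds and slope below are assumptions for illustration; the paper defines its own activation and matching loss.

```python
import numpy as np

def multi_level_activation(z, thresholds=(-2.0, 0.0, 2.0), slope=4.0):
    """Map a single logit to an ordered level in [0, 3] via summed sigmoids.

    Level 0 = background, 1 = whole tumor, 2 = tumor core, 3 = enhancing tumor.
    Thresholds and slope are illustrative choices.
    """
    z = np.asarray(z, dtype=float)
    out = np.zeros_like(z)
    for t in thresholds:
        out += 1.0 / (1.0 + np.exp(-slope * (z - t)))
    return out

def to_nested_masks(levels):
    """Threshold the continuous level map back into the three nested regions."""
    whole = levels >= 0.5
    core = levels >= 1.5
    enhancing = levels >= 2.5
    return whole, core, enhancing

logits = np.array([[-3.0, -1.0], [1.0, 3.0]])     # toy 2x2 "feature map"
levels = multi_level_activation(logits)
whole, core, enh = to_nested_masks(levels)
assert np.all(~core | whole) and np.all(~enh | core)   # nesting holds by design
```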

Sequential Embedding Induced Text Clustering, a Non-parametric Bayesian Approach

Title Sequential Embedding Induced Text Clustering, a Non-parametric Bayesian Approach
Authors Tiehang Duan, Qi Lou, Sargur N. Srihari, Xiaohui Xie
Abstract Current state-of-the-art nonparametric Bayesian text clustering methods model documents through a multinomial distribution over bags of words. Although these methods can effectively utilize the word-burstiness representation of documents and achieve decent performance, they do not explore the sequential information of text or the relationships among synonyms. In this paper, documents are modeled as the joint of bags of words, sequential features and word embeddings. We propose the Sequential Embedding induced Dirichlet Process Mixture Model (SiDPMM) to effectively exploit this joint document representation in text clustering. The sequential features are extracted by an encoder-decoder component. Word embeddings produced by the continuous-bag-of-words (CBOW) model are introduced to handle synonyms. Experimental results demonstrate the benefits of our model in two major aspects: 1) improved performance across multiple diverse text datasets in terms of normalized mutual information (NMI); 2) more accurate inference of ground-truth cluster numbers, with a regularization effect on tiny outlier clusters.
Tasks Text Clustering, Word Embeddings
Published 2018-11-29
URL http://arxiv.org/abs/1811.12500v1
PDF http://arxiv.org/pdf/1811.12500v1.pdf
PWC https://paperswithcode.com/paper/sequential-embedding-induced-text-clustering
Repo
Framework
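
The joint document representation described above can be sketched in its simplest form: concatenate a bag-of-words count vector with an averaged CBOW-style embedding, so that both burstiness and synonym similarity enter the clustering model. The encoder-decoder sequential features are omitted here, and the vocabulary and random embeddings are stand-ins for illustration.

```python
import numpy as np

def bag_of_words(tokens, vocab_index):
    """Sparse word-count representation of a document."""
    v = np.zeros(len(vocab_index))
    for t in tokens:
        if t in vocab_index:
            v[vocab_index[t]] += 1
    return v

def mean_embedding(tokens, emb):
    """Average CBOW-style word vectors; synonyms land close together here."""
    vecs = [emb[t] for t in tokens if t in emb]
    return np.mean(vecs, axis=0) if vecs else np.zeros_like(next(iter(emb.values())))

def joint_representation(tokens, vocab_index, emb):
    """Concatenate count features with embedding features for clustering."""
    return np.concatenate([bag_of_words(tokens, vocab_index),
                           mean_embedding(tokens, emb)])

# Toy vocabulary, random stand-in embeddings, and two short documents.
vocab = ["car", "automobile", "engine", "poem", "verse"]
vocab_index = {w: i for i, w in enumerate(vocab)}
rng = np.random.default_rng(0)
emb = {w: rng.normal(size=20) for w in vocab}
doc_a = joint_representation("car engine car".split(), vocab_index, emb)
doc_b = joint_representation("poem verse".split(), vocab_index, emb)
print(doc_a.shape, doc_b.shape)   # (25,) each: 5 counts + 20 embedding dims
```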

Combining Heterogeneously Labeled Datasets For Training Segmentation Networks

Title Combining Heterogeneously Labeled Datasets For Training Segmentation Networks
Authors Jana Kemnitz, Christian F. Baumgartner, Wolfgang Wirth, Felix Eckstein, Sebastian K. Eder, Ender Konukoglu
Abstract Accurate segmentation of medical images is an important step towards analyzing and tracking disease-related morphological alterations in the anatomy. Convolutional neural networks (CNNs) have recently emerged as a powerful tool for many segmentation tasks in medical imaging. The performance of CNNs strongly depends on the size of the training data, and combining data from different sources is an effective strategy for obtaining larger training datasets. However, this is often challenged by heterogeneous labeling of the datasets. For instance, one of the datasets may be missing labels, or a number of labels may have been combined into a super label. In this work we propose a cost function which allows integration of multiple datasets with heterogeneous label subsets into joint training. We evaluated the performance of this strategy on a thigh MR and a cardiac MR dataset in which we artificially merged labels for half of the data. We found that the proposed cost function substantially outperforms a naive masking approach, obtaining results very close to using the full annotations.
Tasks
Published 2018-07-24
URL http://arxiv.org/abs/1807.08935v1
PDF http://arxiv.org/pdf/1807.08935v1.pdf
PWC https://paperswithcode.com/paper/combining-heterogeneously-labeled-datasets
Repo
Framework
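
One way to picture the idea behind such a cost function: when a training image comes from a dataset in which several classes were merged into a super label, sum the predicted probabilities of the merged classes before taking the log, instead of simply masking those pixels out. The class grouping and toy numbers below are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def merged_label_cross_entropy(probs, labels, label_groups):
    """Cross-entropy when some ground-truth labels are super labels.

    probs        -- (n_pixels, n_classes) softmax outputs
    labels       -- (n_pixels,) indices into label_groups
    label_groups -- list: entry g is the list of fine classes that label g covers
    For a super label, the probability mass of all its member classes is summed,
    so the network is not penalized for any particular choice among them.
    """
    eps = 1e-12
    loss = 0.0
    for p, y in zip(probs, labels):
        loss -= np.log(p[label_groups[y]].sum() + eps)
    return loss / len(labels)

# 4 fine classes; a second dataset merged classes 1 and 2 into one super label.
label_groups = [[0], [1], [2], [3], [1, 2]]          # index 4 = super label
probs = np.array([[0.7, 0.1, 0.1, 0.1],
                  [0.1, 0.4, 0.4, 0.1]])
labels = np.array([0, 4])                            # second pixel: super label
print(merged_label_cross_entropy(probs, labels, label_groups))
```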

Particle Swarm Optimization: A survey of historical and recent developments with hybridization perspectives

Title Particle Swarm Optimization: A survey of historical and recent developments with hybridization perspectives
Authors Saptarshi Sengupta, Sanchita Basak, Richard Alan Peters II
Abstract Particle Swarm Optimization (PSO) is a metaheuristic global optimization paradigm that has gained prominence in the last two decades due to its ease of application in unsupervised, complex multidimensional problems which cannot be solved using traditional deterministic algorithms. The canonical particle swarm optimizer is based on the flocking behavior and social co-operation of birds and fish schools and draws heavily from the evolutionary behavior of these organisms. This paper serves to provide a thorough survey of the PSO algorithm with special emphasis on the development, deployment and improvements of its most basic as well as some of the state-of-the-art implementations. Concepts and directions on choosing the inertia weight, constriction factor, cognition and social weights and perspectives on convergence, parallelization, elitism, niching and discrete optimization as well as neighborhood topologies are outlined. Hybridization attempts with other evolutionary and swarm paradigms in selected applications are covered and an up-to-date review is put forward for the interested reader.
Tasks
Published 2018-04-15
URL http://arxiv.org/abs/1804.05319v2
PDF http://arxiv.org/pdf/1804.05319v2.pdf
PWC https://paperswithcode.com/paper/particle-swarm-optimization-a-survey-of
Repo
Framework
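
The canonical particle swarm update surveyed here fits in a few lines: each particle keeps a velocity that blends its previous direction (inertia weight w), attraction toward its personal best (cognitive weight c1), and attraction toward the swarm's global best (social weight c2). The sketch below minimizes the sphere function with commonly quoted parameter values; it omits velocity clamping, constriction and neighborhood topologies discussed in the survey.

```python
import numpy as np

def pso(f, dim, n_particles=30, iters=200, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Canonical (inertia-weight) particle swarm optimizer for minimization."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-5, 5, size=(n_particles, dim))   # positions
    v = np.zeros_like(x)                              # velocities
    pbest, pbest_val = x.copy(), np.apply_along_axis(f, 1, x)
    g = pbest[np.argmin(pbest_val)].copy()            # global best position
    for _ in range(iters):
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        x = x + v
        vals = np.apply_along_axis(f, 1, x)
        improved = vals < pbest_val
        pbest[improved], pbest_val[improved] = x[improved], vals[improved]
        g = pbest[np.argmin(pbest_val)].copy()
    return g, f(g)

best_x, best_f = pso(lambda z: float(np.sum(z ** 2)), dim=5)
print(best_x, best_f)
```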

Automatic Brain Tumor Segmentation using Convolutional Neural Networks with Test-Time Augmentation

Title Automatic Brain Tumor Segmentation using Convolutional Neural Networks with Test-Time Augmentation
Authors Guotai Wang, Wenqi Li, Sebastien Ourselin, Tom Vercauteren
Abstract Automatic brain tumor segmentation plays an important role in the diagnosis, surgical planning and treatment assessment of brain tumors. Deep convolutional neural networks (CNNs) have been widely used for this task. Due to the relatively small data sets available for training, data augmentation at training time has been commonly used to improve the performance of CNNs. Recent works also demonstrated the usefulness of augmentation at test time, in addition to training time, for achieving more robust predictions. We investigate how test-time augmentation can improve CNNs’ performance for brain tumor segmentation. We used different underpinning network structures and augmented the images by 3D rotation, flipping, scaling and adding random noise at both training and test time. Experiments with the BraTS 2018 training and validation sets show that test-time augmentation helps to improve brain tumor segmentation accuracy and to obtain uncertainty estimates of the segmentation results.
Tasks Brain Tumor Segmentation, Data Augmentation
Published 2018-10-18
URL http://arxiv.org/abs/1810.07884v2
PDF http://arxiv.org/pdf/1810.07884v2.pdf
PWC https://paperswithcode.com/paper/automatic-brain-tumor-segmentation-using
Repo
Framework
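
Test-time augmentation itself is a simple loop: transform the input, run the trained network on each transformed copy, undo the transform on each prediction, then average for the final segmentation and use the spread as an uncertainty map. In the sketch below, 2D flips stand in for the 3D rotation/flip/scale/noise augmentations used in the paper, and `model` is a placeholder for any segmentation function.

```python
import numpy as np

def tta_segment(model, image):
    """Flip-based test-time augmentation for a (H, W) image.

    model -- callable returning a per-pixel foreground probability map (H, W)
    Returns (mean prediction, per-pixel standard deviation as uncertainty).
    """
    # (transform, inverse transform) pairs; flips are their own inverses here.
    transforms = [
        (lambda x: x,             lambda x: x),
        (lambda x: x[::-1, :],    lambda x: x[::-1, :]),
        (lambda x: x[:, ::-1],    lambda x: x[:, ::-1]),
        (lambda x: x[::-1, ::-1], lambda x: x[::-1, ::-1]),
    ]
    preds = []
    for fwd, inv in transforms:
        preds.append(inv(model(fwd(image))))
    preds = np.stack(preds)
    return preds.mean(axis=0), preds.std(axis=0)

# Toy "model": thresholds intensity; a trained CNN would take its place.
model = lambda img: (img > 0.5).astype(float)
rng = np.random.default_rng(0)
mean_seg, uncertainty = tta_segment(model, rng.random((64, 64)))
```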

Texture Classification in Extreme Scale Variations using GANet

Title Texture Classification in Extreme Scale Variations using GANet
Authors Li Liu, Jie Chen, Guoying Zhao, Paul Fieguth, Xilin Chen, Matti Pietikäinen
Abstract Research in texture recognition often concentrates on recognizing textures with intraclass variations such as illumination, rotation, viewpoint and small scale changes. In contrast, in real-world applications a change in scale can have a dramatic impact on texture appearance, to the point of changing completely from one texture category to another. As a result, texture variations due to changes in scale are amongst the hardest to handle. In this work we conduct the first study of classifying textures with extreme variations in scale. To address this issue, we first propose and then reduce scale proposals on the basis of dominant texture patterns. Motivated by the challenges posed by this problem, we propose a new GANet network where we use a Genetic Algorithm to change the units in the hidden layers during network training, in order to promote the learning of more informative semantic texture patterns. Finally, we adopt a FVCNN (Fisher Vector pooling of a Convolutional Neural Network filter bank) feature encoder for global texture representation. Because extreme scale variations are not necessarily present in most standard texture databases, to support the proposed extreme-scale aspects of texture understanding we are developing a new dataset, the Extreme Scale Variation Textures (ESVaT), to test the performance of our framework. It is demonstrated that the proposed framework significantly outperforms gold-standard texture features by more than 10% on ESVaT. We also test the performance of our proposed approach on the KTHTIPS2b and OS datasets and a further dataset synthetically derived from Forrest, showing superior performance compared to the state of the art.
Tasks Texture Classification
Published 2018-02-13
URL http://arxiv.org/abs/1802.04441v1
PDF http://arxiv.org/pdf/1802.04441v1.pdf
PWC https://paperswithcode.com/paper/texture-classification-in-extreme-scale
Repo
Framework
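
The genetic-algorithm component of GANet can be pictured as evolving a binary mask over a hidden layer's units: candidate masks are scored by a fitness function (e.g., validation accuracy with those units active), and the best masks are recombined and mutated. The fitness function below is a toy stand-in, and the selection/crossover/mutation settings are generic GA defaults, not the paper's.

```python
import numpy as np

def evolve_unit_mask(fitness, n_units, pop_size=20, generations=30,
                     mutation_rate=0.05, seed=0):
    """Simple genetic algorithm over binary masks of hidden units."""
    rng = np.random.default_rng(seed)
    pop = rng.integers(0, 2, size=(pop_size, n_units))
    for _ in range(generations):
        scores = np.array([fitness(ind) for ind in pop])
        order = np.argsort(scores)[::-1]
        parents = pop[order[: pop_size // 2]]           # truncation selection
        children = []
        while len(children) < pop_size - len(parents):
            a, b = parents[rng.integers(len(parents), size=2)]
            cut = rng.integers(1, n_units)               # one-point crossover
            child = np.concatenate([a[:cut], b[cut:]])
            flip = rng.random(n_units) < mutation_rate   # mutation
            child[flip] = 1 - child[flip]
            children.append(child)
        pop = np.vstack([parents, children])
    scores = np.array([fitness(ind) for ind in pop])
    return pop[np.argmax(scores)]

# Toy fitness: prefer masks that keep the first half of the units active.
target = np.array([1] * 16 + [0] * 16)
best = evolve_unit_mask(lambda m: -np.sum(np.abs(m - target)), n_units=32)
print(best)
```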

Topic Modeling Based Multi-modal Depression Detection

Title Topic Modeling Based Multi-modal Depression Detection
Authors Yuan Gong, Christian Poellabauer
Abstract Major depressive disorder is a common mental disorder that affects almost 7% of the adult U.S. population. The 2017 Audio/Visual Emotion Challenge (AVEC) asks participants to build a model to predict depression levels based on the audio, video, and text of an interview ranging from 7 to 33 minutes. Since averaging features over the entire interview loses most temporal information, how to discover, capture, and preserve useful temporal details in such a long interview is a significant challenge. Therefore, we propose a novel topic-modeling-based approach to perform context-aware analysis of the recording. Our experiments show that the proposed approach outperforms context-unaware methods and the challenge baselines on all metrics.
Tasks
Published 2018-03-28
URL http://arxiv.org/abs/1803.10384v1
PDF http://arxiv.org/pdf/1803.10384v1.pdf
PWC https://paperswithcode.com/paper/topic-modeling-based-multi-modal-depression
Repo
Framework

Fully Implicit Online Learning

Title Fully Implicit Online Learning
Authors Chaobing Song, Ji Liu, Han Liu, Yong Jiang, Tong Zhang
Abstract Regularized online learning is widely used in machine learning applications. In online learning, performing exact minimization (i.e., an implicit update) is known to be beneficial to the numerical stability and the structure of the solution. In this paper we study a class of regularized online algorithms that linearize neither the loss function nor the regularizer, which we call fully implicit online learning (FIOL). We show that for arbitrary Bregman divergences, FIOL achieves $O(\sqrt{T})$ regret in the general convex setting and $O(\log T)$ regret in the strongly convex setting, and that the regret enjoys a one-step improvement effect because it avoids the approximation error of linearization. We then propose efficient algorithms to solve the subproblem of FIOL. We show that even if the solution of the subproblem has no closed form, it can be solved with complexity comparable to that of linearized online algorithms. Experiments validate the proposed approaches.
Tasks
Published 2018-09-25
URL http://arxiv.org/abs/1809.09350v3
PDF http://arxiv.org/pdf/1809.09350v3.pdf
PWC https://paperswithcode.com/paper/fully-implicit-online-learning
Repo
Framework
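
For a linear model with squared loss and Euclidean divergence, the exact (implicit) update has a closed form, which makes the contrast with the linearized step easy to see: the implicit step damps the effective step size by 1 + eta*||x||^2, so it stays stable even for large eta. This special case is chosen purely because it is solvable in one line; the paper's algorithms handle general losses and Bregman divergences.

```python
import numpy as np

def implicit_step(w, x, y, eta):
    """Exact (implicit) online update for squared loss l(w) = 0.5*(w.x - y)^2.

    Solves  w_next = argmin_v 0.5*(v.x - y)^2 + ||v - w||^2 / (2*eta)
    in closed form: the explicit gradient step, damped by 1 + eta*||x||^2.
    """
    residual = (w @ x - y) / (1.0 + eta * (x @ x))
    return w - eta * residual * x

def explicit_step(w, x, y, eta):
    """Standard linearized online gradient step, for comparison."""
    return w - eta * (w @ x - y) * x

rng = np.random.default_rng(0)
w_true = rng.normal(size=5)
w_imp = np.zeros(5)
w_exp = np.zeros(5)
eta = 10.0                      # deliberately large step size
for _ in range(50):
    x = rng.normal(size=5)
    y = w_true @ x
    w_imp = implicit_step(w_imp, x, y, eta)
    w_exp = explicit_step(w_exp, x, y, eta)
print("implicit error:", np.linalg.norm(w_imp - w_true))
print("explicit error:", np.linalg.norm(w_exp - w_true))   # typically blows up
```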

Scene-LSTM: A Model for Human Trajectory Prediction

Title Scene-LSTM: A Model for Human Trajectory Prediction
Authors Huynh Manh, Gita Alaghband
Abstract We develop a human movement trajectory prediction system that incorporates scene information (Scene-LSTM) as well as human movement trajectories (Pedestrian-movement LSTM) in the prediction process within static crowded scenes. We superimpose a two-level grid structure (the scene is divided into grid cells, each modeled by a Scene-LSTM, which are further divided into smaller sub-grids for finer spatial granularity) and explore common human trajectories occurring in each grid cell (e.g., making a right or left turn onto sidewalks coming out of an alley, or standing still at bus/train stops). Two coupled LSTM networks, the Pedestrian-movement LSTMs (one per target) and the corresponding Scene-LSTMs (one per grid cell), are trained simultaneously to predict the next movements. We show that such common path information greatly influences the prediction of future movement. We further design a scene data filter that holds important non-linear movement information. The scene data filter allows us to select the relevant parts of the information from the grid cell’s memory relative to a target’s state. We evaluate and compare two versions of our method with a linear baseline and several existing LSTM-based methods on five crowded video sequences from the UCY [1] and ETH [2] datasets. The results show that our method reduces the location displacement errors compared to related methods, with about an 80% reduction compared to social-interaction methods.
Tasks Trajectory Prediction
Published 2018-08-12
URL http://arxiv.org/abs/1808.04018v2
PDF http://arxiv.org/pdf/1808.04018v2.pdf
PWC https://paperswithcode.com/paper/scene-lstm-a-model-for-human-trajectory
Repo
Framework
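
The two-level grid structure can be sketched independently of the LSTMs: each observed (x, y) position is mapped to a coarse grid cell (which selects the Scene-LSTM memory responsible for that region) and to a finer sub-cell index inside it. The grid sizes below are arbitrary choices for illustration, not the paper's settings.

```python
import numpy as np

def grid_indices(pos, scene_size, n_cells=8, n_sub=4):
    """Map a position to its (coarse cell, fine sub-cell) in a two-level grid.

    pos        -- (x, y) in scene coordinates
    scene_size -- (width, height) of the scene in the same units
    n_cells    -- coarse cells per side (one Scene-LSTM per coarse cell)
    n_sub      -- sub-cells per side inside each coarse cell
    """
    frac = np.clip(np.asarray(pos, float) / np.asarray(scene_size, float),
                   0, 1 - 1e-9)
    coarse = np.floor(frac * n_cells).astype(int)            # which grid cell
    within = frac * n_cells - coarse                         # position inside it
    fine = np.floor(within * n_sub).astype(int)              # which sub-cell
    return tuple(coarse), tuple(fine)

# A short trajectory: each step selects the grid-cell memory it should read.
trajectory = [(12.0, 40.0), (14.5, 41.2), (17.0, 42.5)]
for p in trajectory:
    print(p, "->", grid_indices(p, scene_size=(100.0, 100.0)))
```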

Deep Recurrent Level Set for Segmenting Brain Tumors

Title Deep Recurrent Level Set for Segmenting Brain Tumors
Authors T. Hoang Ngan Le, Raajitha Gummadi, Marios Savvides
Abstract Variational Level Set (VLS) has been a widely used method in medical segmentation. However, the segmentation accuracy of VLS methods decreases dramatically when dealing with intervening factors such as lighting, shadows, colors, etc. Additionally, results are quite sensitive to the initial settings and highly dependent on the number of iterations. In order to address these limitations, the proposed method incorporates VLS into deep learning by defining a novel end-to-end trainable model called Deep Recurrent Level Set (DRLS). The proposed DRLS consists of three layer types, i.e., convolutional layers, deconvolutional layers with skip connections, and LevelSet layers. Brain tumor segmentation is taken as an instance to illustrate the performance of the proposed DRLS. The convolutional layers learn visual representations of brain tumors at different scales. Since brain tumors occupy a small portion of the image, the deconvolutional layers are designed with skip connections to obtain a high-quality feature map. The LevelSet layer drives the contour towards the brain tumor. In each step, the convolutional layer is fed with the LevelSet map to obtain a brain tumor feature map; this in turn serves as input for the LevelSet layer in the next step. Experimental results have been obtained on the BRATS2013, BRATS2015 and BRATS2017 datasets. The proposed DRLS model improves both computational time and segmentation accuracy when compared to the classic VLS-based method. Additionally, as a fully end-to-end system, DRLS achieves state-of-the-art segmentation of brain tumors.
Tasks Brain Tumor Segmentation
Published 2018-10-10
URL http://arxiv.org/abs/1810.04752v1
PDF http://arxiv.org/pdf/1810.04752v1.pdf
PWC https://paperswithcode.com/paper/deep-recurrent-level-set-for-segmenting-brain
Repo
Framework
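
A LevelSet layer can be pictured as a contour-evolution step: given an image (or feature map) and the current level-set function phi, the region inside phi > 0 is pulled toward pixels resembling the current foreground mean and pushed away from the background mean. The update below is a plain Chan-Vese-style step without the curvature term, written only to illustrate the mechanism; the DRLS layer has its own learned formulation.

```python
import numpy as np

def level_set_step(phi, image, dt=0.5, eps=1.0):
    """One simplified variational level-set (Chan-Vese style) update step."""
    inside = phi > 0
    c1 = image[inside].mean() if inside.any() else 0.0      # foreground mean
    c2 = image[~inside].mean() if (~inside).any() else 0.0  # background mean
    # Smoothed Dirac delta: concentrates the update near the current contour.
    delta = (eps / np.pi) / (eps ** 2 + phi ** 2)
    force = -(image - c1) ** 2 + (image - c2) ** 2
    return phi + dt * delta * force

# Toy image: a bright square on a dark background; phi starts as a centered disk.
img = np.zeros((64, 64)); img[20:44, 20:44] = 1.0
ys, xs = np.mgrid[0:64, 0:64]
phi = 10.0 - np.sqrt((xs - 32) ** 2 + (ys - 32) ** 2)       # signed-distance-like
for _ in range(100):
    phi = level_set_step(phi, img)
segmentation = phi > 0    # contour has expanded toward the bright region
```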

Multi Modal Convolutional Neural Networks for Brain Tumor Segmentation

Title Multi Modal Convolutional Neural Networks for Brain Tumor Segmentation
Authors Mehmet Aygün, Yusuf Hüseyin Şahin, Gözde Ünal
Abstract In this work, we propose a multi-modal Convolutional Neural Network (CNN) approach for brain tumor segmentation. We investigate how to combine different modalities efficiently in the CNN framework. We adapt various fusion methods, previously employed for video recognition problems, to the brain tumor segmentation problem, and we investigate their efficiency in terms of memory and performance. Our experiments, performed on the BRATS dataset, lead us to the conclusion that learning separate representations for each modality and combining them for brain tumor segmentation can increase the performance of CNN systems.
Tasks Brain Tumor Segmentation, Video Recognition
Published 2018-09-17
URL http://arxiv.org/abs/1809.06191v2
PDF http://arxiv.org/pdf/1809.06191v2.pdf
PWC https://paperswithcode.com/paper/multi-modal-convolutional-neural-networks-for
Repo
Framework
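
The fusion strategies compared in this kind of work can be contrasted with two minimal examples: early fusion stacks the MR modalities into one multi-channel input for a single network, while late fusion runs a separate feature extractor per modality and combines the resulting representations. The `extract` function below is a patch-averaging placeholder for a CNN branch, not the paper's architecture.

```python
import numpy as np

def early_fusion(modalities):
    """Stack modalities (e.g. T1, T1c, T2, FLAIR) into one multi-channel input."""
    return np.stack(modalities, axis=0)                 # (n_modalities, H, W)

def late_fusion(modalities, extract):
    """Run a per-modality feature extractor, then concatenate representations."""
    feats = [extract(m) for m in modalities]
    return np.concatenate(feats, axis=0)

# Placeholder extractor: mean-pools 8x8 patches (stand-in for a CNN branch).
def extract(volume):
    h, w = volume.shape
    patches = volume[: h // 8 * 8, : w // 8 * 8].reshape(h // 8, 8, w // 8, 8)
    return patches.mean(axis=(1, 3)).ravel()

rng = np.random.default_rng(0)
t1, t1c, t2, flair = (rng.random((64, 64)) for _ in range(4))
x_early = early_fusion([t1, t1c, t2, flair])            # (4, 64, 64) input tensor
x_late = late_fusion([t1, t1c, t2, flair], extract)     # concatenated features
print(x_early.shape, x_late.shape)
```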