October 19, 2019

Paper Group ANR 288

Real-time Monocular Visual Odometry for Turbid and Dynamic Underwater Environments


Title	Real-time Monocular Visual Odometry for Turbid and Dynamic Underwater Environments
Authors	Maxime Ferrera, Julien Moras, Pauline Trouvé-Peloux, Vincent Creuze
Abstract	In the context of robotic underwater operations, the visual degradations induced by the medium properties make it difficult to rely exclusively on cameras for localization. Hence, most localization methods are based on expensive navigational sensors associated with acoustic positioning. On the other hand, visual odometry and visual SLAM have been exhaustively studied for aerial or terrestrial applications, but state-of-the-art algorithms fail underwater. In this paper we tackle the problem of using a simple low-cost camera for underwater localization and propose a new monocular visual odometry method dedicated to the underwater environment. We evaluate different tracking methods and show that optical-flow-based tracking is better suited to underwater images than classical approaches based on descriptors. We also propose a keyframe-based visual odometry approach that relies heavily on nonlinear optimization. The proposed algorithm has been assessed on both simulated and real underwater datasets and outperforms state-of-the-art visual SLAM methods under many of the most challenging conditions. The main application of this work is the localization of Remotely Operated Vehicles (ROVs) used for underwater archaeological missions, but the developed system can be used in any other application as long as visual information is available.
Tasks	Monocular Visual Odometry, Optical Flow Estimation, Visual Odometry
Published	2018-06-15
URL	https://arxiv.org/abs/1806.05842v3
PDF	https://arxiv.org/pdf/1806.05842v3.pdf
PWC	https://paperswithcode.com/paper/real-time-monocular-visual-odometry-for
Repo
Framework

Adversarial Label Learning


Title	Adversarial Label Learning
Authors	Chidubem Arachie, Bert Huang
Abstract	We consider the task of training classifiers without labels. We propose a weakly supervised method—adversarial label learning—that trains classifiers to perform well against an adversary that chooses labels for training data. The weak supervision constrains what labels the adversary can choose. The method therefore minimizes an upper bound of the classifier’s error rate using projected primal-dual subgradient descent. Minimizing this bound protects against bias and dependencies in the weak supervision. Experiments on three real datasets show that our method can train without labels and outperforms other approaches for weakly supervised learning.
Tasks
Published	2018-05-22
URL	http://arxiv.org/abs/1805.08877v3
PDF	http://arxiv.org/pdf/1805.08877v3.pdf
PWC	https://paperswithcode.com/paper/adversarial-label-learning
Repo
Framework

A Proposal of Interactive Growing Hierarchical SOM


Title	A Proposal of Interactive Growing Hierarchical SOM
Authors	Takumi Ichimura, Takashi Yamaguchi
Abstract	A Self-Organizing Map (SOM) is trained using unsupervised learning to produce a two-dimensional discretized representation of the input space of the training cases. The Growing Hierarchical SOM (GHSOM) is an architecture that grows both hierarchically, representing the structure of the data distribution, and horizontally, representing the size of each individual map. This paper proposes a method for controlling the degree of growth of the GHSOM by pruning redundant branches of the SOM hierarchy. Moreover, an interface tool for the proposed method, called interactive GHSOM, is developed. We discuss the computational results obtained on the Iris data using the developed tool.
Tasks
Published	2018-04-08
URL	http://arxiv.org/abs/1804.02620v1
PDF	http://arxiv.org/pdf/1804.02620v1.pdf
PWC	https://paperswithcode.com/paper/a-proposal-of-interactive-growing
Repo
Framework
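
The SOM training that the abstract above builds on can be sketched in a few lines of Python. This is a minimal illustration of a plain fixed-size SOM only: the grid size, the linearly decaying learning rate and neighbourhood width, and the names train_som and best_matching_unit are assumptions made for this sketch, and neither the hierarchical growth nor the pruning proposed in the paper is reproduced.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid position whose weight vector is closest to x."""
    return min(
        weights,
        key=lambda pos: sum((w - v) ** 2 for w, v in zip(weights[pos], x)),
    )


def train_som(data, rows, cols, epochs=20, lr0=0.5, sigma0=1.5, seed=0):
    """Train a rows x cols SOM and return {grid position: weight vector}."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in data:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)               # learning rate decays to 0
            sigma = sigma0 * (1.0 - frac) + 0.1   # neighbourhood shrinks
            bmu = best_matching_unit(weights, x)
            for (r, c), w in weights.items():
                grid_dist2 = (r - bmu[0]) ** 2 + (c - bmu[1]) ** 2
                h = math.exp(-grid_dist2 / (2.0 * sigma ** 2))
                # Pull each unit toward x, more strongly near the BMU.
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
            step += 1
    return weights


# Toy data in the unit square (illustrative values, not the Iris data).
data = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [1.0, 1.0], [0.9, 1.0], [1.0, 0.9]]
som = train_som(data, rows=3, cols=3)
```

Calling best_matching_unit(som, x) afterwards gives the map cell that a sample x falls into, which is the discretized two-dimensional representation the abstract refers to.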

End-to-End Sound Source Separation Conditioned On Instrument Labels


Title	End-to-End Sound Source Separation Conditioned On Instrument Labels
Authors	Olga Slizovskaia, Leo Kim, Gloria Haro, Emilia Gomez
Abstract	Can we perform an end-to-end music source separation with a variable number of sources using a deep learning model? We present an extension of the Wave-U-Net model which allows end-to-end monaural source separation with a non-fixed number of sources. Furthermore, we propose multiplicative conditioning with instrument labels at the bottleneck of the Wave-U-Net and show its effect on the separation results. This approach leads to other types of conditioning such as audio-visual source separation and score-informed source separation.
Tasks	Music Source Separation
Published	2018-11-05
URL	https://arxiv.org/abs/1811.01850v2
PDF	https://arxiv.org/pdf/1811.01850v2.pdf
PWC	https://paperswithcode.com/paper/end-to-end-sound-source-separation
Repo
Framework
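
The idea of multiplicative conditioning at a bottleneck can be illustrated with a small Python sketch. The abstract does not specify the layer, so the linear projection from instrument labels to per-channel gains, the toy shapes, and the name condition_bottleneck are all assumptions made here; in a trained model the projection would be learned rather than fixed by hand.

```python
def condition_bottleneck(features, labels, projection):
    """Scale each bottleneck channel by a gain derived from instrument labels.

    features:   list of channels, each a list of values over time
    labels:     multi-hot vector, 1.0 where an instrument is present
    projection: one row per channel, one column per instrument (hypothetical)
    """
    gains = [sum(p * l for p, l in zip(row, labels)) for row in projection]
    return [[g * v for v in channel] for g, channel in zip(gains, features)]


# Two channels, two time steps, three possible instruments (toy values).
features = [[1.0, 2.0], [3.0, 4.0]]
projection = [[1.0, 0.0, 0.5], [0.0, 2.0, 0.5]]
labels = [1.0, 0.0, 0.0]  # only the first instrument is present
conditioned = condition_bottleneck(features, labels, projection)
```

With only the first instrument active, the first channel passes through unchanged and the second is suppressed, which is how the label vector steers what the decoder sees.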

Video Smoke Detection Based on Deep Saliency Network


Title	Video Smoke Detection Based on Deep Saliency Network
Authors	Gao Xu, Yongming Zhang, Qixing Zhang, Gaohua Lin, Zhong Wang, Yang Jia, Jinjun Wang
Abstract	Video smoke detection is a promising fire detection method, especially in open or large spaces and outdoor environments. Traditional video smoke detection methods usually consist of candidate region extraction and classification, but lack powerful characterization for smoke. In this paper, we propose a novel video smoke detection method based on a deep saliency network. Visual saliency detection aims to highlight the most important object regions in an image. The pixel-level and object-level salient convolutional neural networks are combined to extract the informative smoke saliency map. An end-to-end framework for salient smoke detection and existence prediction of smoke is proposed for application in video smoke detection. The deep feature map is combined with the saliency map to predict the existence of smoke in an image. Initial and augmented datasets are built to measure the performance of frameworks with different design strategies. Qualitative and quantitative analyses at frame level and pixel level demonstrate the excellent performance of the ultimate framework.
Tasks	Saliency Detection
Published	2018-09-08
URL	http://arxiv.org/abs/1809.02802v2
PDF	http://arxiv.org/pdf/1809.02802v2.pdf
PWC	https://paperswithcode.com/paper/video-smoke-detection-based-on-deep-saliency
Repo
Framework

Neural Image Compression for Gigapixel Histopathology Image Analysis


Title	Neural Image Compression for Gigapixel Histopathology Image Analysis
Authors	David Tellez, Geert Litjens, Jeroen van der Laak, Francesco Ciompi
Abstract	We present Neural Image Compression (NIC), a method to reduce the size of gigapixel images by mapping them to a compact latent space using neural networks. We show that this compression allows us to train convolutional neural networks on histopathology whole-slide images end-to-end using weak image-level labels.
Tasks	Image Compression
Published	2018-11-07
URL	http://arxiv.org/abs/1811.02840v1
PDF	http://arxiv.org/pdf/1811.02840v1.pdf
PWC	https://paperswithcode.com/paper/neural-image-compression-for-gigapixel
Repo
Framework

Aesthetic-based Clothing Recommendation


Title	Aesthetic-based Clothing Recommendation
Authors	Wenhui Yu, Huidi Zhang, Xiangnan He, Xu Chen, Li Xiong, Zheng Qin
Abstract	Recently, product images have gained increasing attention in clothing recommendation since the visual appearance of clothing products has a significant impact on consumers’ decisions. Most existing methods rely on conventional features to represent an image, such as the visual features extracted by convolutional neural networks (CNN features), the scale-invariant feature transform algorithm (SIFT features), color histograms, and so on. Nevertheless, one important type of feature, the aesthetic feature, is seldom considered. It plays a vital role in clothing recommendation since a user’s decision depends largely on whether the clothing is in line with her aesthetics; however, the conventional image features cannot portray this directly. To bridge this gap, we propose to introduce the aesthetic information, which is highly relevant to user preference, into clothing recommender systems. To achieve this, we first present the aesthetic features extracted by a pre-trained neural network, which is a brain-inspired deep structure trained for the aesthetic assessment task. Considering that the aesthetic preference varies significantly from user to user and over time, we then propose a new tensor factorization model to incorporate the aesthetic features in a personalized manner. We conduct extensive experiments on real-world datasets, which demonstrate that our approach can capture the aesthetic preference of users and significantly outperform several state-of-the-art recommendation methods.
Tasks	Recommendation Systems
Published	2018-09-16
URL	http://arxiv.org/abs/1809.05822v1
PDF	http://arxiv.org/pdf/1809.05822v1.pdf
PWC	https://paperswithcode.com/paper/aesthetic-based-clothing-recommendation
Repo
Framework

Deep neural networks algorithms for stochastic control problems on finite horizon: numerical applications


Title	Deep neural networks algorithms for stochastic control problems on finite horizon: numerical applications
Authors	Achref Bachouch, Côme Huré, Nicolas Langrené, Huyen Pham
Abstract	This paper presents several numerical applications of deep learning-based algorithms that have been introduced in [HPBL18]. Numerical and comparative tests using TensorFlow illustrate the performance of our different algorithms, namely control learning by performance iteration (algorithms NNcontPI and ClassifPI), control learning by hybrid iteration (algorithms Hybrid-Now and Hybrid-LaterQ), on the 100-dimensional nonlinear PDEs examples from [EHJ17] and on quadratic backward stochastic differential equations as in [CR16]. We also performed tests on low-dimension control problems such as an option hedging problem in finance, as well as energy storage problems arising in the valuation of gas storage and in microgrid management. Numerical results and comparisons to quantization-type algorithms Qknn, as an efficient algorithm to numerically solve low-dimensional control problems, are also provided; and some corresponding codes are available on https://github.com/comeh/.
Tasks	Quantization
Published	2018-12-13
URL	https://arxiv.org/abs/1812.05916v3
PDF	https://arxiv.org/pdf/1812.05916v3.pdf
PWC	https://paperswithcode.com/paper/deep-neural-networks-algorithms-for
Repo
Framework

The Loss Surface of XOR Artificial Neural Networks


Title	The Loss Surface of XOR Artificial Neural Networks
Authors	Dhagash Mehta, Xiaojun Zhao, Edgar A. Bernal, David J. Wales
Abstract	Training an artificial neural network involves an optimization process over the landscape defined by the cost (loss) as a function of the network parameters. We explore these landscapes using optimisation tools developed for potential energy landscapes in molecular science. The number of local minima and transition states (saddle points of index one), as well as the ratio of transition states to minima, grow rapidly with the number of nodes in the network. There is also a strong dependence on the regularisation parameter, with the landscape becoming more convex (fewer minima) as the regularisation term increases. We demonstrate that in our formulation, stationary points for networks with $N_h$ hidden nodes, including the minimal network required to fit the XOR data, are also stationary points for networks with $N_{h} +1$ hidden nodes when all the weights involving the additional nodes are zero. Hence, smaller networks optimized to train the XOR data are embedded in the landscapes of larger networks. Our results clarify certain aspects of the classification and sensitivity (to perturbations in the input data) of minima and saddle points for this system, and may provide insight into dropout and network compression.
Tasks
Published	2018-04-06
URL	http://arxiv.org/abs/1804.02411v1
PDF	http://arxiv.org/pdf/1804.02411v1.pdf
PWC	https://paperswithcode.com/paper/the-loss-surface-of-xor-artificial-neural
Repo
Framework

Provable Convex Co-clustering of Tensors


Title	Provable Convex Co-clustering of Tensors
Authors	Eric C. Chi, Brian R. Gaines, Will Wei Sun, Hua Zhou, Jian Yang
Abstract	Cluster analysis is a fundamental tool for pattern discovery of complex heterogeneous data. Prevalent clustering methods mainly focus on vector or matrix-variate data and are not applicable to general-order tensors, which arise frequently in modern scientific and business applications. Moreover, there is a gap between statistical guarantees and computational efficiency for existing tensor clustering solutions due to the nature of their non-convex formulations. In this work, we bridge this gap by developing a provable convex formulation of tensor co-clustering. Our convex co-clustering (CoCo) estimator enjoys stability guarantees and is both computationally and storage efficient. We further establish a non-asymptotic error bound for the CoCo estimator, which reveals a surprising “blessing of dimensionality” phenomenon that does not exist in vector or matrix-variate cluster analysis. Our theoretical findings are supported by extensive simulated studies. Finally, we apply the CoCo estimator to the cluster analysis of advertisement click tensor data from a major online company. Our clustering results provide meaningful business insights to improve advertising effectiveness.
Tasks
Published	2018-03-17
URL	http://arxiv.org/abs/1803.06518v1
PDF	http://arxiv.org/pdf/1803.06518v1.pdf
PWC	https://paperswithcode.com/paper/provable-convex-co-clustering-of-tensors
Repo
Framework

Submodularity-Inspired Data Selection for Goal-Oriented Chatbot Training Based on Sentence Embeddings


Title	Submodularity-Inspired Data Selection for Goal-Oriented Chatbot Training Based on Sentence Embeddings
Authors	Mladen Dimovski, Claudiu Musat, Vladimir Ilievski, Andreea Hossmann, Michael Baeriswyl
Abstract	Spoken language understanding (SLU) systems, such as goal-oriented chatbots or personal assistants, rely on an initial natural language understanding (NLU) module to determine the intent and to extract the relevant information from the user queries they take as input. SLU systems usually help users to solve problems in relatively narrow domains and require a large amount of in-domain training data. This leads to significant data availability issues that inhibit the development of successful systems. To alleviate this problem, we propose a technique of data selection in the low-data regime that enables us to train with fewer labeled sentences and thus lower labelling costs. We propose a submodularity-inspired data ranking function, the ratio-penalty marginal gain, for selecting data points to label based only on the information extracted from the textual embedding space. We show that the distances in the embedding space are a viable source of information that can be used for data selection. Our method outperforms two known active learning techniques and enables cost-efficient training of the NLU unit. Moreover, our proposed selection technique does not need the model to be retrained in between the selection steps, making it time efficient as well.
Tasks	Active Learning, Chatbot, Sentence Embeddings, Spoken Language Understanding
Published	2018-02-02
URL	http://arxiv.org/abs/1802.00757v2
PDF	http://arxiv.org/pdf/1802.00757v2.pdf
PWC	https://paperswithcode.com/paper/submodularity-inspired-data-selection-for
Repo
Framework

Recognition of Offline Handwritten Devanagari Numerals using Regional Weighted Run Length Features


Title	Recognition of Offline Handwritten Devanagari Numerals using Regional Weighted Run Length Features
Authors	Pawan Kumar Singh, Supratim Das, Ram Sarkar, Mita Nasipuri
Abstract	Recognition of handwritten Roman characters and numerals has been extensively studied in the last few decades and its accuracy has reached a satisfactory state. But the same cannot be said of the Devanagari script, which is one of the most popular scripts in India. This paper proposes an efficient digit recognition system for handwritten Devanagari script. The system uses a novel set of 196-element Mask Oriented Directional (MOD) features for recognition. The methodology is tested using five conventional classifiers on 6000 handwritten digit samples. Using a 3-fold cross-validation scheme, the proposed system yields its highest recognition accuracy of 95.02% with a Support Vector Machine (SVM) classifier.
Tasks
Published	2018-06-29
URL	http://arxiv.org/abs/1806.11517v1
PDF	http://arxiv.org/pdf/1806.11517v1.pdf
PWC	https://paperswithcode.com/paper/recognition-of-offline-handwritten-devanagari
Repo
Framework
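
The 3-fold cross-validation scheme mentioned in the abstract above can be sketched as follows. The abstract does not say how the folds were drawn, so the random shuffle, the fixed seed, and the name k_fold_indices are assumptions for illustration; only the sample count (6000) and the number of folds (3) come from the text.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of (nearly) equal size.
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# 6000 samples split into 3 folds: 4000 for training, 2000 for testing each time.
splits = list(k_fold_indices(6000, 3))
```

Each sample lands in exactly one test fold, so averaging the per-fold accuracies uses every sample for evaluation once.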

Fine-Grained Classification of Cervical Cells Using Morphological and Appearance Based Convolutional Neural Networks


Title	Fine-Grained Classification of Cervical Cells Using Morphological and Appearance Based Convolutional Neural Networks
Authors	Haoming Lin, Yuyang Hu, Siping Chen, Jianhua Yao, Ling Zhang
Abstract	Fine-grained classification of cervical cells into different abnormality levels is of great clinical importance but remains very challenging. Contrary to traditional classification methods that rely on hand-crafted or engineered features, convolutional neural networks (CNNs) can classify cervical cells based on automatically learned deep features. However, CNNs in previous studies do not incorporate cell morphological information, and it is unknown whether morphological features can be directly modeled by CNNs to classify cervical cells. This paper presents a CNN-based method that combines cell image appearance with cell morphology for classification of cervical cells in Pap smears. The training cervical cell dataset consists of adaptively re-sampled image patches coarsely centered on the nuclei. Several CNN models (AlexNet, GoogleNet, ResNet and DenseNet) pre-trained on the ImageNet dataset were fine-tuned on the cervical dataset for comparison. The proposed method is evaluated on the Herlev cervical dataset by five-fold cross-validation with patient-level splitting. Results show that by adding cytoplasm and nucleus masks as raw morphological information into appearance-based CNN learning, higher classification accuracies can be achieved in general. Among the four CNN models, GoogleNet fed with both morphological and appearance information obtains the highest classification accuracies of 94.5% for the 2-class classification task and 64.5% for the 7-class classification task. Our method demonstrates that combining cervical cell morphology with appearance information can provide improved classification performance, which is clinically important for early diagnosis of cervical dysplastic changes.
Tasks
Published	2018-10-14
URL	http://arxiv.org/abs/1810.06058v1
PDF	http://arxiv.org/pdf/1810.06058v1.pdf
PWC	https://paperswithcode.com/paper/fine-grained-classification-of-cervical-cells
Repo
Framework

The linear hidden subset problem for the (1+1) EA with scheduled and adaptive mutation rates


Title	The linear hidden subset problem for the (1+1) EA with scheduled and adaptive mutation rates
Authors	Hafsteinn Einarsson, Marcelo Matheus Gauy, Johannes Lengler, Florian Meier, Asier Mujika, Angelika Steger, Felix Weissenberger
Abstract	We study unbiased $(1+1)$ evolutionary algorithms on linear functions with an unknown number $n$ of bits with non-zero weight. Static algorithms achieve an optimal runtime of $O(n (\ln n)^{2+\epsilon})$, however, it remained unclear whether more dynamic parameter policies could yield better runtime guarantees. We consider two setups: one where the mutation rate follows a fixed schedule, and one where it may be adapted depending on the history of the run. For the first setup, we give a schedule that achieves a runtime of $(1\pm o(1))\beta n \ln n$, where $\beta \approx 3.552$, which is an asymptotic improvement over the runtime of the static setup. Moreover, we show that no schedule admits a better runtime guarantee and that the optimal schedule is essentially unique. For the second setup, we show that the runtime can be further improved to $(1\pm o(1)) e n \ln n$, which matches the performance of algorithms that know $n$ in advance. Finally, we study the related model of initial segment uncertainty with static position-dependent mutation rates, and derive asymptotically optimal lower bounds. This answers a question by Doerr, Doerr, and Kötzing.
Tasks
Published	2018-08-16
URL	http://arxiv.org/abs/1808.05566v1
PDF	http://arxiv.org/pdf/1808.05566v1.pdf
PWC	https://paperswithcode.com/paper/the-linear-hidden-subset-problem-for-the-11
Repo
Framework
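
For readers unfamiliar with the setting in the abstract above, here is a minimal Python sketch of a (1+1) EA with a static mutation rate on a linear function whose relevant bits form a hidden subset. The function name, the toy weights, and the rate 1/20 (which presumes the number of relevant bits is known) are illustrative assumptions; the scheduled and adaptive mutation rates analysed in the paper are not reproduced here.

```python
import random


def one_plus_one_ea(weights, rate, max_iters=200_000, seed=0):
    """Maximise f(x) = sum of weights[i] over set bits with a (1+1) EA.

    Every iteration flips each bit independently with probability `rate`
    and keeps the offspring if it is at least as fit as the parent.
    Integer weights are assumed so that fitness comparisons are exact.
    Returns (final bit string, number of iterations used).
    """
    rng = random.Random(seed)

    def fitness(bits):
        return sum(w for w, b in zip(weights, bits) if b)

    target = sum(w for w in weights if w > 0)  # best achievable fitness
    parent = [rng.randint(0, 1) for _ in weights]
    parent_fit = fitness(parent)
    for step in range(max_iters):
        if parent_fit == target:
            return parent, step
        child = [b ^ (rng.random() < rate) for b in parent]
        child_fit = fitness(child)
        if child_fit >= parent_fit:
            parent, parent_fit = child, child_fit
    return parent, max_iters


# Hidden subset: only 20 of the 60 positions carry non-zero weight.
weights = [3, 0, 0] * 20
solution, steps = one_plus_one_ea(weights, rate=1 / 20)
```

The algorithm itself never inspects which positions are relevant; only the caller's choice of rate uses that knowledge, which is exactly the information the paper assumes is unavailable.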

Computing Word Classes Using Spectral Clustering


Title	Computing Word Classes Using Spectral Clustering
Authors	Effi Levi, Saggy Herman, Ari Rappoport
Abstract	Clustering a lexicon of words is a well-studied problem in natural language processing (NLP). Word clusters are used to deal with sparse data in statistical language processing, as well as features for solving various NLP tasks (text categorization, question answering, named entity recognition and others). Spectral clustering is a widely used technique in the field of image processing and speech recognition. However, it has scarcely been explored in the context of NLP; specifically, the method used in this paper (Meila and Shi, 2001) has never been used to cluster a general word lexicon. We apply spectral clustering to a lexicon of words, evaluating the resulting clusters by using them as features for solving two classical NLP tasks: semantic role labeling and dependency parsing. We compare performance with Brown clustering, a widely-used technique for word clustering, as well as with other clustering methods. We show that spectral clusters produce similar results to Brown clusters, and outperform other clustering methods. In addition, we quantify the overlap between spectral and Brown clusters, showing that each model captures some information which is uncaptured by the other.
Tasks	Dependency Parsing, Named Entity Recognition, Question Answering, Semantic Role Labeling, Speech Recognition, Text Categorization
Published	2018-08-16
URL	http://arxiv.org/abs/1808.05374v1
PDF	http://arxiv.org/pdf/1808.05374v1.pdf
PWC	https://paperswithcode.com/paper/computing-word-classes-using-spectral
Repo
Framework
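
To make the spectral step concrete, the sketch below splits a small weighted graph in two using the second eigenvector of the random-walk matrix D^-1 W, computed by power iteration. It is a simplified two-way illustration under stated assumptions: the toy similarity matrix, the lazy-walk trick, and the name spectral_bipartition are choices made here, and the paper's full pipeline for a word lexicon (how similarities are built and how many eigenvectors and clusters are used) is not reproduced.

```python
import random


def spectral_bipartition(W, iters=200, seed=0):
    """Split graph nodes in two by the sign of the second eigenvector of D^-1 W.

    W is a symmetric similarity matrix with positive row sums.
    """
    n = len(W)
    degree = [sum(row) for row in W]
    total = sum(degree)
    rng = random.Random(seed)
    v = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    for _ in range(iters):
        # Project out the trivial constant eigenvector (degree-weighted mean).
        mean = sum(d * x for d, x in zip(degree, v)) / total
        v = [x - mean for x in v]
        # Lazy random-walk step 0.5 * (I + D^-1 W): all eigenvalues are >= 0,
        # so power iteration converges to the second-largest one.
        v = [
            0.5 * (v[i] + sum(W[i][j] * v[j] for j in range(n)) / degree[i])
            for i in range(n)
        ]
        scale = max(abs(x) for x in v)
        v = [x / scale for x in v]
    return [0 if x < 0 else 1 for x in v]


# Two tightly connected triangles joined by one weak edge (toy similarities).
W = [
    [0.0, 1.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.1, 0.0, 0.0],
    [0.0, 0.0, 0.1, 0.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 1.0, 0.0],
]
clusters = spectral_bipartition(W)
```

In a word-clustering setting the nodes would be words and the weights some measure of distributional similarity; which cluster receives label 0 or 1 is arbitrary.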