January 29, 2020

Paper Group ANR 613

Automatic Analysis of Sewer Pipes Based on Unrolled Monocular Fisheye Images


Title	Automatic Analysis of Sewer Pipes Based on Unrolled Monocular Fisheye Images
Authors	Johannes Künzel, Thomas Werner, Ronja Möller, Peter Eisert, Jan Waschnewski, Ralf Hilpert
Abstract	The task of detecting and classifying damage in sewer pipes offers an important application area for computer vision algorithms. This paper describes a system that accomplishes this task based solely on low-quality, severely compressed fisheye images from a pipe inspection robot. Relying on robust image features, we estimate camera poses, model the image lighting, and exploit this information to generate high-quality cylindrical unwraps of the pipes’ surfaces. Based on the generated images, we apply semantic labeling based on deep convolutional neural networks to detect and classify defects as well as structural elements.
Tasks
Published	2019-12-11
URL	https://arxiv.org/abs/1912.05222v1
PDF	https://arxiv.org/pdf/1912.05222v1.pdf
PWC	https://paperswithcode.com/paper/automatic-analysis-of-sewer-pipes-based-on
Repo
Framework
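
The abstract does not spell out how the cylindrical unwrap is computed. As a rough illustration only, the sketch below unrolls a ring around an assumed image centre into a rectangular strip using nearest-neighbour polar sampling. It ignores the camera pose estimation and lighting model the paper relies on, and the function name and all parameters are hypothetical.

```python
import math


def unroll_ring(image, center, r_min, r_max, n_angles, n_radii):
    """Sample an annulus of `image` (a list of rows) into an n_radii x n_angles strip."""
    cx, cy = center
    height, width = len(image), len(image[0])
    strip = []
    for i in range(n_radii):
        # The sampling radius grows linearly from r_min to r_max across strip rows.
        r = r_min + (r_max - r_min) * i / max(n_radii - 1, 1)
        row = []
        for j in range(n_angles):
            theta = 2.0 * math.pi * j / n_angles
            # Nearest-neighbour lookup of the source pixel on the circle.
            x = int(round(cx + r * math.cos(theta)))
            y = int(round(cy + r * math.sin(theta)))
            inside = 0 <= x < width and 0 <= y < height
            # Samples that fall outside the image are filled with 0.
            row.append(image[y][x] if inside else 0)
        strip.append(row)
    return strip
```

Each strip row corresponds to one radius and each column to one viewing angle, so a circular pipe joint in the fisheye image would appear as a roughly horizontal line in the strip.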

Spatio-Temporal Convolutional LSTMs for Tumor Growth Prediction by Learning 4D Longitudinal Patient Data


Title	Spatio-Temporal Convolutional LSTMs for Tumor Growth Prediction by Learning 4D Longitudinal Patient Data
Authors	Ling Zhang, Le Lu, Xiaosong Wang, Robert M. Zhu, Mohammadhadi Bagheri, Ronald M. Summers, Jianhua Yao
Abstract	Prognostic tumor growth modeling via volumetric medical imaging observations can potentially lead to better outcomes of tumor treatment and surgical planning. Recent advances in convolutional networks have demonstrated higher accuracy than traditional mathematical models in predicting future tumor volumes. This indicates that deep learning-based techniques may have great potential for addressing such problems. However, current 2D patch-based modeling approaches cannot make full use of the spatio-temporal imaging context of the tumor’s longitudinal 4D (3D + time) data. Moreover, they are incapable of predicting clinically relevant tumor properties other than volumes. In this paper, we formulate the tumor growth process through a convolutional Long Short-Term Memory (ConvLSTM) that extracts the tumor’s static imaging appearances and captures its temporal dynamic changes within a single network. We extend ConvLSTM into the spatio-temporal domain (ST-ConvLSTM) by jointly learning the inter-slice 3D contexts and the longitudinal or temporal dynamics from multiple patient studies. Our approach can incorporate other non-imaging patient information in an end-to-end trainable manner. Experiments are conducted on the largest 4D longitudinal tumor dataset of 33 patients to date. Results validate that the ST-ConvLSTM produces a Dice score of 83.2% ± 5.1% and an RVD of 11.2% ± 10.8%, both significantly outperforming (p<0.05) the other compared methods (linear model, ConvLSTM, and generative adversarial network (GAN)) under the metric of predicting future tumor volumes. Additionally, our new method enables the prediction of both cell density and CT intensity numbers. Last, we demonstrate the generalizability of ST-ConvLSTM by employing it in a 4D medical image segmentation task, where it achieves an averaged Dice score of 86.3 ± 1.2% for left-ventricle segmentation in 4D ultrasound at 3 seconds per patient.
Tasks	Medical Image Segmentation, Semantic Segmentation
Published	2019-02-23
URL	https://arxiv.org/abs/1902.08716v2
PDF	https://arxiv.org/pdf/1902.08716v2.pdf
PWC	https://paperswithcode.com/paper/spatial-temporal-convolutional-lstms-for
Repo
Framework
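
The abstract reports results as a Dice score and an RVD (relative volume difference) but does not define them. The sketch below uses the commonly used definitions on binary masks flattened to lists; whether the paper computes them in exactly this way is an assumption, and the function names are hypothetical.

```python
def dice_score(pred, truth):
    """Dice overlap 2|A∩B| / (|A| + |B|) between two equally sized binary masks."""
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Two empty masks are treated as a perfect match.
    return 2.0 * intersection / total if total else 1.0


def relative_volume_difference(pred, truth):
    """Absolute difference in voxel counts, relative to the ground-truth count."""
    pred_volume = sum(1 for p in pred if p)
    true_volume = sum(1 for t in truth if t)
    return abs(pred_volume - true_volume) / true_volume
```

Dice rewards spatial overlap, while RVD only compares total volumes, so a prediction can have a low RVD and still a poor Dice score if it is the right size in the wrong place.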

Automatic Detection and Compression for Passive Acoustic Monitoring of the African Forest Elephant


Title	Automatic Detection and Compression for Passive Acoustic Monitoring of the African Forest Elephant
Authors	Johan Bjorck, Brendan H. Rappazzo, Di Chen, Richard Bernstein, Peter H. Wrege, Carla P. Gomes
Abstract	In this work, we consider applying machine learning to the analysis and compression of audio signals in the context of monitoring elephants in sub-Saharan Africa. Earth’s biodiversity is increasingly under threat from sources of anthropogenic change (e.g. resource extraction, land use change, and climate change), and surveying animal populations is critical for developing conservation strategies. However, manually monitoring tropical forests or deep oceans is intractable. For species that communicate acoustically, researchers have argued for placing audio recorders in the habitats as a cost-effective and non-invasive method, a strategy known as passive acoustic monitoring (PAM). In collaboration with conservation efforts, we construct a large labeled dataset of passive acoustic recordings of the African Forest Elephant via crowdsourcing, comprising thousands of hours of recordings in the wild. Using state-of-the-art techniques in artificial intelligence, we improve upon previously proposed methods for passive acoustic monitoring for classification and segmentation. In real-time detection of elephant calls, network bandwidth quickly becomes a bottleneck and efficient ways to compress the data are needed. Most audio compression schemes are aimed at human listeners and are unsuitable for low-frequency elephant calls. To remedy this, we provide a novel end-to-end differentiable method for compression of audio signals that can be adapted to acoustic monitoring of any species and dramatically improves over naive coding strategies.
Tasks
Published	2019-02-25
URL	http://arxiv.org/abs/1902.09069v1
PDF	http://arxiv.org/pdf/1902.09069v1.pdf
PWC	https://paperswithcode.com/paper/automatic-detection-and-compression-for
Repo
Framework

Evolution of Ant Colony Optimization Algorithm – A Brief Literature Review


Title	Evolution of Ant Colony Optimization Algorithm – A Brief Literature Review
Authors	Aleem Akhtar
Abstract	Ant Colony Optimization (ACO) is a metaheuristic proposed by Marco Dorigo in 1991 based on the behavior of biological ants. Pheromone laying and the selection of the shortest route with the help of pheromone inspired the development of the first ACO algorithm. Since the presentation of the first such algorithm, many researchers have worked and published their research in this field. Although initial results were not so promising, recent developments have made this metaheuristic a significant algorithm in Swarm Intelligence. This research presents a brief overview of recent developments in ACO algorithms in terms of both applications and algorithmic developments. For application developments, multi-objective optimization, continuous optimization and time-varying NP-hard problems are presented. For algorithmic developments, hybridization and parallel architectures are investigated.
Tasks
Published	2019-08-15
URL	https://arxiv.org/abs/1908.08007v2
PDF	https://arxiv.org/pdf/1908.08007v2.pdf
PWC	https://paperswithcode.com/paper/190808007
Repo
Framework
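
To make the pheromone mechanism mentioned in the abstract concrete, here is a minimal sketch of a basic Ant System applied to a small travelling-salesman instance. It is not taken from the review: the parameter names (`alpha`, `beta`, `evaporation`) and their default values are illustrative assumptions, and the points are assumed to be distinct.

```python
import math
import random


def tour_length(tour, dist):
    """Length of the closed tour that returns to its starting city."""
    return sum(dist[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))


def ant_system(points, n_ants=10, n_iters=20, alpha=1.0, beta=2.0,
               evaporation=0.5, seed=0):
    """Return (best_tour, best_length) found by a basic Ant System."""
    rng = random.Random(seed)
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    pheromone = [[1.0] * n for _ in range(n)]
    best_tour, best_len = None, float("inf")
    for _ in range(n_iters):
        tours = []
        for _ in range(n_ants):
            tour = [rng.randrange(n)]
            unvisited = set(range(n)) - {tour[0]}
            while unvisited:
                current = tour[-1]
                candidates = sorted(unvisited)
                # Edge attractiveness: pheromone^alpha * (1 / distance)^beta.
                weights = [pheromone[current][j] ** alpha
                           * (1.0 / dist[current][j]) ** beta
                           for j in candidates]
                nxt = rng.choices(candidates, weights=weights)[0]
                tour.append(nxt)
                unvisited.remove(nxt)
            tours.append(tour)
        # Evaporation: every trail decays by the same factor.
        for row in pheromone:
            for j in range(n):
                row[j] *= 1.0 - evaporation
        # Deposit: shorter tours add more pheromone to the edges they used.
        for tour in tours:
            length = tour_length(tour, dist)
            if length < best_len:
                best_tour, best_len = tour, length
            for i in range(n):
                a, b = tour[i], tour[(i + 1) % n]
                pheromone[a][b] += 1.0 / length
                pheromone[b][a] += 1.0 / length
    return best_tour, best_len
```

Calling `ant_system([(0, 0), (1, 0), (1, 1), (0, 1)])` searches for the shortest closed tour over the four corners of a unit square. The hybrid and parallel variants surveyed in the paper change how tours are built or how pheromone is shared, but keep this evaporate-and-deposit loop at their core.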

Autoencoding with a Learning Classifier System: Initial Results


Title	Autoencoding with a Learning Classifier System: Initial Results
Authors	Larry Bull
Abstract	Autoencoders enable data dimensionality reduction and are a key component of many (deep) learning systems. This short paper introduces a form of Holland’s Learning Classifier System (LCS) to perform autoencoding, building upon a previously presented form of LCS that utilises unsupervised learning for clustering. Initial results using a neural network representation suggest it is an effective approach to reduction.
Tasks	Dimensionality Reduction
Published	2019-07-26
URL	https://arxiv.org/abs/1907.11554v2
PDF	https://arxiv.org/pdf/1907.11554v2.pdf
PWC	https://paperswithcode.com/paper/autoencoding-with-a-learning-classifier
Repo
Framework

Boundary-weighted Domain Adaptive Neural Network for Prostate MR Image Segmentation


Title	Boundary-weighted Domain Adaptive Neural Network for Prostate MR Image Segmentation
Authors	Qikui Zhu, Bo Du, Pingkun Yan
Abstract	Accurate segmentation of the prostate from magnetic resonance (MR) images provides useful information for prostate cancer diagnosis and treatment. However, automated prostate segmentation from 3D MR images still faces several challenges. For instance, the lack of a clear edge between the prostate and other anatomical structures makes it challenging to accurately extract the boundaries. The complex background texture and large variation in size, shape and intensity distribution of the prostate itself complicate segmentation even further. With deep learning, especially convolutional neural networks (CNNs), emerging as commonly used methods for medical image segmentation, the difficulty in obtaining a large number of annotated medical images for training CNNs has become much more pronounced than ever before. Since a large-scale dataset is one of the critical components for the success of deep learning, the lack of sufficient training data makes it difficult to fully train complex CNNs. To tackle the above challenges, in this paper, we propose a boundary-weighted domain adaptive neural network (BOWDA-Net). To make the network more sensitive to the boundaries during segmentation, a boundary-weighted segmentation loss (BWL) is proposed. Furthermore, an advanced boundary-weighted transfer learning approach is introduced to address the problem of small medical imaging datasets. We evaluate our proposed model on the publicly available MICCAI 2012 Prostate MR Image Segmentation (PROMISE12) challenge dataset. Our experimental results demonstrate that the proposed model is more sensitive to boundary information and outperforms other state-of-the-art methods.
Tasks	Medical Image Segmentation, Semantic Segmentation
Published	2019-02-21
URL	https://arxiv.org/abs/1902.08128v2
PDF	https://arxiv.org/pdf/1902.08128v2.pdf
PWC	https://paperswithcode.com/paper/boundary-weighted-domain-adaptive-neural
Repo
Framework

High Dimensional Bayesian Optimization via Supervised Dimension Reduction


Title	High Dimensional Bayesian Optimization via Supervised Dimension Reduction
Authors	Miao Zhang, Huiqi Li, Steven Su
Abstract	Bayesian optimization (BO) has been broadly applied to computationally expensive problems, but it is still challenging to extend BO to high dimensions. Existing works usually rely on the strict assumption of an additive or a linear embedding structure for objective functions. This paper directly introduces a supervised dimension reduction method, Sliced Inverse Regression (SIR), to high dimensional Bayesian optimization, which can effectively learn the intrinsic sub-structure of the objective function during the optimization. Furthermore, a kernel trick is developed to reduce computational complexity and learn a nonlinear subset of the unknown function when applying SIR to extremely high dimensional BO. We present several computational benefits and derive theoretical regret bounds of our algorithm. Extensive experiments on synthetic examples and two real applications demonstrate the superiority of our algorithms for high dimensional Bayesian optimization.
Tasks	Dimensionality Reduction
Published	2019-07-21
URL	https://arxiv.org/abs/1907.08953v1
PDF	https://arxiv.org/pdf/1907.08953v1.pdf
PWC	https://paperswithcode.com/paper/high-dimensional-bayesian-optimization-via-1
Repo
Framework

Faster width-dependent algorithm for mixed packing and covering LPs


Title	Faster width-dependent algorithm for mixed packing and covering LPs
Authors	Digvijay Boob, Saurabh Sawlani, Di Wang
Abstract	In this paper, we give a faster width-dependent algorithm for mixed packing-covering LPs. Mixed packing-covering LPs are fundamental to combinatorial optimization in computer science and operations research. Our algorithm finds a $(1+\epsilon)$-approximate solution in time $O(Nw/\epsilon)$, where $N$ is the number of nonzero entries in the constraint matrix and $w$ is the maximum number of nonzeros in any constraint. This run-time is better than Nesterov’s smoothing algorithm, which requires $O(N\sqrt{n}w/\epsilon)$, where $n$ is the dimension of the problem. Our work utilizes the framework of area convexity introduced in [Sherman-FOCS’17] to obtain the best dependence on $\epsilon$ while breaking the infamous $\ell_{\infty}$ barrier to eliminate the factor of $\sqrt{n}$. The current best width-independent algorithm for this problem runs in time $O(N/\epsilon^2)$ [Young-arXiv-14] and hence has a worse running time dependence on $\epsilon$. Many real-life instances of mixed packing-covering problems exhibit small width, and for such cases our algorithm can report higher-precision results than width-independent algorithms. As a special case of our result, we report a $(1+\epsilon)$-approximation algorithm for the densest subgraph problem which runs in time $O(md/\epsilon)$, where $m$ is the number of edges in the graph and $d$ is the maximum graph degree.
Tasks	Combinatorial Optimization
Published	2019-09-26
URL	https://arxiv.org/abs/1909.12387v1
PDF	https://arxiv.org/pdf/1909.12387v1.pdf
PWC	https://paperswithcode.com/paper/faster-width-dependent-algorithm-for-mixed
Repo
Framework

Autonomous Industrial Management via Reinforcement Learning: Self-Learning Agents for Decision-Making – A Review


Title	Autonomous Industrial Management via Reinforcement Learning: Self-Learning Agents for Decision-Making – A Review
Authors	Leonardo A. Espinosa Leal, Magnus Westerlund, Anthony Chapman
Abstract	Industry has always been in the pursuit of becoming more economically efficient, and the current focus has been to reduce human labour using modern technologies. Even with cutting-edge technologies, which range from packaging robots to AI for fault detection, there is still some ambiguity about the aims of some new systems, namely, whether they are automated or autonomous. In this paper we indicate the distinctions between automated and autonomous systems, review the current literature, and identify the core challenges for creating learning mechanisms of autonomous agents. We discuss using different types of extended realities, such as digital twins, to train reinforcement learning agents to learn specific tasks through generalization. Once generalization is achieved, we discuss how these can be used to develop self-learning agents. We then introduce self-play scenarios and how they can be used to teach self-learning agents through a supportive environment which focuses on how the agents can adapt to different real-world environments.
Tasks	Decision Making, Fault Detection
Published	2019-10-20
URL	https://arxiv.org/abs/1910.08942v1
PDF	https://arxiv.org/pdf/1910.08942v1.pdf
PWC	https://paperswithcode.com/paper/autonomous-industrial-management-via
Repo
Framework

Compressed Subspace Learning Based on Canonical Angle Preserving Property


Title	Compressed Subspace Learning Based on Canonical Angle Preserving Property
Authors	Yuchen Jiao, Gen Li, Yuantao Gu
Abstract	Union of Subspaces (UoS) is a popular model to describe the underlying low-dimensional structure of data. The fine details of UoS structure can be described in terms of canonical angles (also known as principal angles) between subspaces, which are a well-known characterization of relative subspace positions. In this paper, we prove that random projection with the so-called Johnson-Lindenstrauss (JL) property approximately preserves canonical angles between subspaces with overwhelming probability. This result indicates that random projection approximately preserves the UoS structure. Inspired by this result, we propose a framework of Compressed Subspace Learning (CSL), which enables the extraction of useful information from the UoS structure of data in a greatly reduced dimension. We demonstrate the effectiveness of CSL in various subspace-related tasks such as subspace visualization, active subspace detection, and subspace clustering.
Tasks	Dimensionality Reduction
Published	2019-07-14
URL	https://arxiv.org/abs/1907.06166v2
PDF	https://arxiv.org/pdf/1907.06166v2.pdf
PWC	https://paperswithcode.com/paper/compressed-subspace-learning-based-on
Repo
Framework

MetaPix: Few-Shot Video Retargeting


Title	MetaPix: Few-Shot Video Retargeting
Authors	Jessica Lee, Deva Ramanan, Rohit Girdhar
Abstract	We address the task of unsupervised retargeting of human actions from one video to another. We consider the challenging setting where only a few frames of the target are available. The core of our approach is a conditional generative model that can transcode input skeletal poses (automatically extracted with an off-the-shelf pose estimator) to output target frames. However, it is challenging to build a universal transcoder because humans can appear wildly different due to clothing and background scene geometry. Instead, we learn to adapt - or personalize - a universal generator to the particular human and background in the target. To do so, we make use of meta-learning to discover effective strategies for on-the-fly personalization. One significant benefit of meta-learning is that the personalized transcoder naturally enforces temporal coherence across its generated frames; all frames contain consistent clothing and background geometry of the target. We experiment on in-the-wild internet videos and images and show our approach improves over widely-used baselines for the task.
Tasks	Meta-Learning
Published	2019-10-10
URL	https://arxiv.org/abs/1910.04742v2
PDF	https://arxiv.org/pdf/1910.04742v2.pdf
PWC	https://paperswithcode.com/paper/metapix-few-shot-video-retargeting
Repo
Framework

Gradient-based training of Gaussian Mixture Models in High-Dimensional Spaces


Title	Gradient-based training of Gaussian Mixture Models in High-Dimensional Spaces
Authors	Alexander Gepperth, Benedikt Pfülb
Abstract	We present an approach for efficiently training Gaussian Mixture Models (GMMs) with Stochastic Gradient Descent (SGD) on large amounts of high-dimensional data (e.g., images). In such a scenario, SGD is strongly superior in terms of execution time and memory usage, although it is conceptually more complex than the traditional Expectation-Maximization (EM) algorithm. To enable SGD training, we propose three novel ideas. First, we show that minimizing an upper bound to the GMM log likelihood instead of the full one is feasible and numerically much more stable in high-dimensional spaces. Second, we propose a new annealing procedure that prevents SGD from converging to pathological local minima. Third, we propose an SGD-compatible simplification to the full GMM model based on local principal directions, which avoids excessive memory use in high-dimensional spaces due to quadratic growth of covariance matrices. Experiments on several standard image datasets show the validity of our approach, and we provide a publicly available TensorFlow implementation.
Tasks
Published	2019-12-18
URL	https://arxiv.org/abs/1912.09379v1
PDF	https://arxiv.org/pdf/1912.09379v1.pdf
PWC	https://paperswithcode.com/paper/gradient-based-training-of-gaussian-mixture-1
Repo
Framework

Fast Video Retargeting Based on Seam Carving with Parental Labeling


Title	Fast Video Retargeting Based on Seam Carving with Parental Labeling
Authors	Zhu Chuning
Abstract	Seam carving is a state-of-the-art content-aware image resizing technique that effectively preserves the salient areas of an image. However, when applied to video retargeting, not only is it time intensive, but it also creates highly visible frame-wise discontinuities. In this paper, we propose a novel video retargeting method based on seam carving. First, for a single frame, we locate and remove several seams at once instead of a single seam. Second, we use a dynamic spatiotemporal buffer of energy maps and a standard deviation operator to carve out the same seams in a temporal cube of frames with low variation in energy. Finally, we employ an improved energy function that considers motion detected through a difference method. During testing, these enhancements result in a 93 percent reduction in processing time and a higher frame-wise consistency, thus showing superior performance compared to existing video retargeting methods.
Tasks
Published	2019-03-07
URL	http://arxiv.org/abs/1903.03180v1
PDF	http://arxiv.org/pdf/1903.03180v1.pdf
PWC	https://paperswithcode.com/paper/fast-video-retargeting-based-on-seam-carving
Repo
Framework
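
For readers unfamiliar with the baseline this paper builds on, the sketch below shows classic single-seam carving on one frame: a dynamic program finds the connected top-to-bottom path of least total energy, and that path is then removed. It does not implement the paper's multi-seam removal, spatiotemporal buffer, or motion-aware energy, and the function names are hypothetical.

```python
def find_vertical_seam(energy):
    """Return one column index per row for the minimum-energy connected seam."""
    rows, cols = len(energy), len(energy[0])
    # cost[r][c] is the cheapest cumulative energy of any seam ending at (r, c).
    cost = [list(energy[0])]
    for r in range(1, rows):
        prev = cost[-1]
        row = []
        for c in range(cols):
            lo, hi = max(c - 1, 0), min(c + 1, cols - 1)
            row.append(energy[r][c] + min(prev[lo:hi + 1]))
        cost.append(row)
    # Backtrack from the cheapest bottom pixel, moving at most one column per row.
    col = min(range(cols), key=lambda c: cost[-1][c])
    seam = [col]
    for r in range(rows - 1, 0, -1):
        lo, hi = max(col - 1, 0), min(col + 1, cols - 1)
        col = min(range(lo, hi + 1), key=lambda c: cost[r - 1][c])
        seam.append(col)
    seam.reverse()
    return seam


def remove_seam(image, seam):
    """Delete the seam pixel from every row, shrinking the width by one."""
    return [row[:c] + row[c + 1:] for row, c in zip(image, seam)]
```

Running this once per removed column and per frame is what makes naive video seam carving slow, and independent per-frame seams are one source of the frame-wise discontinuities the abstract mentions.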

3G structure for image caption generation


Title	3G structure for image caption generation
Authors	Aihong Yuan, Xuelong Li, Xiaoqiang Lu
Abstract	It is a big challenge in computer vision to make a machine automatically describe the content of an image with a natural language sentence. Previous works have made great progress on this task, but they only use either the global or the local image feature, which may lose some important subtle or global information of an image. In this paper, we propose a 3-gated model which fuses the global and local image features together for the task of image caption generation. The model mainly has three gated structures. 1) A gate for the global image feature, which can adaptively decide when and how much of the global image feature should be imported into the sentence generator. 2) A gated recurrent neural network (RNN), which is used as the sentence generator. 3) A gated feedback method for stacking RNNs, which is employed to increase the capability of nonlinearity fitting. More specifically, the global and local image features are combined in this paper, which makes full use of the image information. The global image feature is controlled by the first gate and the local image feature is selected by the attention mechanism. With the latter two gates, the relationship between image and text can be well explored, which improves the performance of the language part as well as the multi-modal embedding part. Experimental results show that our proposed method outperforms the state-of-the-art for image caption generation.
Tasks
Published	2019-04-21
URL	http://arxiv.org/abs/1904.09544v1
PDF	http://arxiv.org/pdf/1904.09544v1.pdf
PWC	https://paperswithcode.com/paper/3g-structure-for-image-caption-generation
Repo
Framework

Fix Your Features: Stationary and Maximally Discriminative Embeddings using Regular Polytope (Fixed Classifier) Networks


Title	Fix Your Features: Stationary and Maximally Discriminative Embeddings using Regular Polytope (Fixed Classifier) Networks
Authors	Federico Pernici, Matteo Bruni, Claudio Baecchi, Alberto Del Bimbo
Abstract	Neural networks are widely used as a model for classification in a large variety of tasks. Typically, a learnable transformation (i.e. the classifier) is placed at the end of such models, returning a value for each class used for classification. This transformation plays an important role in determining how the generated features change during the learning process. In this work we argue that this transformation not only can be fixed (i.e. set as non-trainable) with no loss of accuracy, but can also be used to learn stationary and maximally discriminative embeddings. We show that the stationarity of the embedding and its maximal discriminative representation can be theoretically justified by setting the weights of the fixed classifier to values taken from the coordinate vertices of three regular polytopes available in $\mathbb{R}^d$, namely: the $d$-Simplex, the $d$-Cube and the $d$-Orthoplex. These regular polytopes have the maximal amount of symmetry that can be exploited to generate stationary features angularly centered around their corresponding fixed weights. Our approach improves and broadens the concept of a fixed classifier, recently proposed by Hoffer et al. (2018), to a larger class of fixed classifier models. Experimental results confirm both the theoretical analysis and the generalization capability of the proposed method.
Tasks
Published	2019-02-27
URL	http://arxiv.org/abs/1902.10441v2
PDF	http://arxiv.org/pdf/1902.10441v2.pdf
PWC	https://paperswithcode.com/paper/fix-your-features-stationary-and-maximally
Repo
Framework
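
As an illustration of what fixing the classifier to polytope vertices can look like, the sketch below generates vertex coordinates for the three polytopes named in the abstract and uses them as non-trainable class weight vectors. The specific coordinates (standard basis vectors plus one extra point for the simplex, left uncentered) are a textbook construction assumed here, not necessarily the exact one used in the paper, and all function names are hypothetical.

```python
import itertools
import math


def orthoplex_vertices(d):
    """The 2d vertices +e_i and -e_i of the d-orthoplex."""
    vertices = []
    for i in range(d):
        for sign in (1.0, -1.0):
            v = [0.0] * d
            v[i] = sign
            vertices.append(v)
    return vertices


def cube_vertices(d):
    """The 2^d vertices of the d-cube with coordinates in {-1, +1}."""
    return [list(v) for v in itertools.product((-1.0, 1.0), repeat=d)]


def simplex_vertices(d):
    """d + 1 mutually equidistant points: e_1..e_d plus alpha * (1, ..., 1)."""
    # alpha solves d*alpha^2 - 2*alpha - 1 = 0, so every pair is sqrt(2) apart.
    alpha = (1.0 - math.sqrt(d + 1.0)) / d
    vertices = [[1.0 if j == i else 0.0 for j in range(d)] for i in range(d)]
    vertices.append([alpha] * d)
    return vertices


def fixed_logits(feature, class_weights):
    """Class scores as dot products with the fixed (non-trainable) weights."""
    return [sum(w * x for w, x in zip(weights, feature))
            for weights in class_weights]
```

With these weights frozen, only the feature extractor is trained, and the number of classes a given feature dimension `d` can host is `d + 1` for the simplex, `2d` for the orthoplex, and `2^d` for the cube.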