January 27, 2020

Paper Group ANR 1235

Flexible Conditional Image Generation of Missing Data with Learned Mental Maps. Business Process Variant Analysis based on Mutual Fingerprints of Event Logs. How Effectively Can Indoor Wireless Positioning Relieve Visual Tracking Pains: A Camera-Rao Bound Viewpoint. SoftGAN: Learning generative models efficiently with application to CycleGAN Voice …

Flexible Conditional Image Generation of Missing Data with Learned Mental Maps


Title	Flexible Conditional Image Generation of Missing Data with Learned Mental Maps
Authors	Benjamin Hou, Athanasios Vlontzos, Amir Alansary, Daniel Rueckert, Bernhard Kainz
Abstract	Real-world settings often do not allow acquisition of high-resolution volumetric images for accurate morphological assessment and diagnosis. In clinical practice it is common to acquire only sparse data (e.g. individual slices) for initial diagnostic decision making. Physicians therefore rely on their prior knowledge (or mental maps) of the human anatomy to extrapolate the underlying 3D information. Accurate mental maps require years of anatomy training, which in the first instance relies on normative learning, i.e. excluding pathology. In this paper, we leverage Bayesian Deep Learning and environment mapping to generate full volumetric anatomy representations from none to a small, sparse set of slices. We evaluate proof-of-concept implementations based on Generative Query Networks (GQN) and Conditional BRUNO using abdominal CT and brain MRI as well as in a clinical application involving sparse, motion-corrupted MR acquisition for fetal imaging. Our approach allows reconstruction of 3D volumes from 1 to 4 tomographic slices, with an SSIM of 0.7+ and cross-correlation of 0.8+ compared to the 3D ground truth.
Tasks	Conditional Image Generation, Decision Making, Image Generation
Published	2019-08-29
URL	https://arxiv.org/abs/1908.11312v1
PDF	https://arxiv.org/pdf/1908.11312v1.pdf
PWC	https://paperswithcode.com/paper/flexible-conditional-image-generation-of
Repo
Framework
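
The abstract above reports a cross-correlation score against the 3D ground truth without saying which variant is used. As a rough illustration only, the sketch below computes the zero-mean normalized cross-correlation between two small volumes stored as nested lists; the choice of this variant and the helper names `flatten` and `normalized_cross_correlation` are assumptions, not taken from the paper.

```python
import math


def flatten(volume):
    """Flatten a volume given as nested lists [slice][row][col] into one list."""
    return [value for slice_ in volume for row in slice_ for value in row]


def normalized_cross_correlation(a, b):
    """Zero-mean normalized cross-correlation of two equal-length sequences.

    Returns a value in [-1, 1]; 1 means the intensities are perfectly
    linearly related with positive slope. Assumed variant, not the paper's.
    """
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    numerator = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    denominator = math.sqrt(
        sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b)
    )
    if denominator == 0:
        return 0.0  # a constant volume carries no correlation information
    return numerator / denominator


# Two tiny 2x2x2 "volumes": the second is a brightened copy of the first.
ground_truth = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
reconstruction = [[[2, 4], [6, 8]], [[10, 12], [14, 16]]]
score = normalized_cross_correlation(flatten(ground_truth), flatten(reconstruction))
```

Because the score is invariant to a global intensity scale and offset, it rewards structural agreement rather than exact intensity match, which is one plausible reason to report it alongside SSIM.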

Business Process Variant Analysis based on Mutual Fingerprints of Event Logs


Title	Business Process Variant Analysis based on Mutual Fingerprints of Event Logs
Authors	Farbod Taymouri, Marcello La Rosa, Josep Carmona
Abstract	Comparing business process variants using event logs is a common use case in process mining. Existing techniques for process variant analysis detect statistically-significant differences between variants at the level of individual entities (such as process activities) and their relationships (e.g. directly-follows relations between activities). This may lead to a proliferation of differences due to the low level of granularity at which such differences are captured. This paper presents a novel approach to detect statistically-significant differences between variants at the level of entire process traces (i.e. sequences of directly-follows relations). The cornerstone of this approach is a technique to learn a directly-follows graph, called a mutual fingerprint, from the event logs of the two variants. A mutual fingerprint is a lossless encoding of a set of traces and their durations using discrete wavelet transformation. This structure facilitates the understanding of statistical differences along the control-flow and performance dimensions. The approach has been evaluated using real-life event logs against two baselines. The results show that at a trace level, the baselines cannot always reveal the differences discovered by our approach, or can detect spurious differences.
Tasks
Published	2019-12-23
URL	https://arxiv.org/abs/1912.10598v2
PDF	https://arxiv.org/pdf/1912.10598v2.pdf
PWC	https://paperswithcode.com/paper/business-process-variant-analysis-based-on
Repo
Framework
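
The abstract above describes the mutual fingerprint as a lossless encoding built with a discrete wavelet transformation. To make the "lossless" property concrete, here is a minimal sketch of an unnormalized Haar transform (pairwise averages and differences) and its inverse. The choice of the Haar wavelet, the function names, and the idea of feeding it a short numeric sequence are all assumptions for illustration; the paper's actual encoding of traces is not reproduced here.

```python
def haar_forward(signal):
    """Full unnormalized Haar decomposition.

    The input length must be a power of two. The result is
    [overall average, coarsest detail, ..., finest details].
    """
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a positive power of two")
    approx = list(signal)
    details = []
    while len(approx) > 1:
        pairs = [(approx[i], approx[i + 1]) for i in range(0, len(approx), 2)]
        # Finer details are kept to the right of coarser ones.
        details = [(a - b) / 2 for a, b in pairs] + details
        approx = [(a + b) / 2 for a, b in pairs]
    return approx + details


def haar_inverse(coeffs):
    """Invert haar_forward, recovering the original sequence exactly."""
    approx = list(coeffs[:1])
    pos = 1
    while pos < len(coeffs):
        detail = coeffs[pos:pos + len(approx)]
        pos += len(approx)
        rebuilt = []
        for a, d in zip(approx, detail):
            rebuilt.extend([a + d, a - d])
        approx = rebuilt
    return approx


# Hypothetical example: per-step durations of one trace, in minutes.
durations = [4, 2, 5, 5]
encoded = haar_forward(durations)
decoded = haar_inverse(encoded)
```

The round trip returns the input unchanged, which is the property the abstract relies on when it calls the encoding lossless.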

How Effectively Can Indoor Wireless Positioning Relieve Visual Tracking Pains: A Camera-Rao Bound Viewpoint


Title	How Effectively Can Indoor Wireless Positioning Relieve Visual Tracking Pains: A Camera-Rao Bound Viewpoint
Authors	Panwen Hu, Zizheng Yan, Rui Huang, Feng Yin
Abstract	Visual tracking is fragile in some difficult scenarios; for instance, appearance ambiguity and variation and occlusion can easily degrade most visual trackers to some extent. In this paper, visual tracking is empowered with wireless positioning to achieve high accuracy while maintaining robustness. Fundamentally different from previous works, this study does not involve any specific wireless positioning algorithms. Instead, we use the confidence region derived from the wireless positioning Cramer-Rao bound (CRB) as the search region of visual trackers. The proposed framework is low-cost and very simple to implement, yet readily leads to enhanced and more robust visual tracking performance in difficult scenarios, as corroborated by our experimental results. Most importantly, it is highly valuable for practitioners to pre-evaluate how effectively the wireless resources available at hand can alleviate the visual tracking pains.
Tasks	Visual Tracking
Published	2019-03-09
URL	http://arxiv.org/abs/1903.03736v1
PDF	http://arxiv.org/pdf/1903.03736v1.pdf
PWC	https://paperswithcode.com/paper/how-effectively-can-indoor-wireless
Repo
Framework
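
The abstract above uses a confidence region derived from the positioning Cramer-Rao bound as the tracker's search region. The sketch below shows one common textbook instance of that idea, under assumptions that are not from the paper: 2D range measurements to known anchors with independent Gaussian noise of standard deviation `sigma`, and an axis-aligned box of `k` standard deviations as the region. Mapping the box from floor coordinates into image pixels is not shown.

```python
import math


def range_crb(target, anchors, sigma):
    """2x2 Cramer-Rao bound for a 2D position from noisy range measurements.

    Assumed model: each anchor measures distance to the target with
    independent zero-mean Gaussian noise of standard deviation sigma.
    The Fisher information is sum(u u^T) / sigma^2, where u is the unit
    vector from the anchor to the target; the CRB is its inverse.
    """
    jxx = jxy = jyy = 0.0
    for ax, ay in anchors:
        dx, dy = target[0] - ax, target[1] - ay
        dist = math.hypot(dx, dy)
        if dist == 0:
            raise ValueError("target coincides with an anchor")
        ux, uy = dx / dist, dy / dist
        jxx += ux * ux / sigma ** 2
        jxy += ux * uy / sigma ** 2
        jyy += uy * uy / sigma ** 2
    det = jxx * jyy - jxy * jxy
    if det <= 1e-12:
        raise ValueError("anchor geometry is degenerate (collinear with target)")
    return [[jyy / det, -jxy / det], [-jxy / det, jxx / det]]


def search_box(center, crb, k=3.0):
    """Axis-aligned box of +/- k standard deviations around a position estimate."""
    half_x = k * math.sqrt(crb[0][0])
    half_y = k * math.sqrt(crb[1][1])
    return (center[0] - half_x, center[1] - half_y,
            center[0] + half_x, center[1] + half_y)


# Hypothetical layout: two anchors at right angles as seen from the target.
crb = range_crb(target=(0.0, 0.0), anchors=[(10.0, 0.0), (0.0, 10.0)], sigma=1.0)
box = search_box(center=(0.0, 0.0), crb=crb, k=2.0)
```

Since the bound depends only on geometry and noise level, a box like this can be computed before any positioning algorithm is chosen, which matches the pre-evaluation use the abstract mentions.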

SoftGAN: Learning generative models efficiently with application to CycleGAN Voice Conversion


Title	SoftGAN: Learning generative models efficiently with application to CycleGAN Voice Conversion
Authors	Rafael Ferro, Nicolas Obin, Axel Roebel
Abstract	Voice conversion with deep neural networks has become extremely popular over the last few years, with improvements over past VC architectures. In particular, GAN architectures such as the cycleGAN and the VAEGAN have offered the possibility to learn voice conversion from non-parallel databases. However, GAN-based methods are highly unstable, often requiring careful tuning of hyper-parameters, and can lead to poor voice identity conversion and a substantially degraded converted speech signal. This paper discusses and tackles the stability issues of the GAN in the context of voice conversion. The proposed SoftGAN method aims at reducing the impact of the generator on the discriminator and vice versa during training, so that both can learn more gradually and efficiently, in particular avoiding training in which the two do not progress in tandem. A subjective experiment conducted on a voice conversion task on the Voice Conversion Challenge 2018 dataset shows that the proposed SoftGAN significantly improves the quality of the voice conversion while preserving the naturalness of the converted speech.
Tasks	Voice Conversion
Published	2019-10-22
URL	https://arxiv.org/abs/1910.12614v1
PDF	https://arxiv.org/pdf/1910.12614v1.pdf
PWC	https://paperswithcode.com/paper/softgan-learning-generative-models
Repo
Framework


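Blind Inpainting of Large-scale Masks of Thin Structures with Adversarial and Reinforcement Learning


Title	Blind Inpainting of Large-scale Masks of Thin Structures with Adversarial and Reinforcement Learning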
Authors	Hao Chen, Mario Valerio Giuffrida, Peter Doerner, Sotirios A. Tsaftaris
Abstract	Several imaging applications (vessels, retina, plant roots, road networks from satellites) require the accurate segmentation of thin structures for subsequent analysis. Discontinuities (gaps) in the extracted foreground may hinder down-stream image-based analysis of biomarkers, organ structure and topology. In this paper, we propose a general post-processing technique to recover such gaps in large-scale segmentation masks. We cast this problem as a blind inpainting task, where the regions of missing lines in the segmentation masks are not known to the algorithm, which we solve with an adversarially trained neural network. One challenge of using large images is the memory capacity of current GPUs. The typical approach of dividing a large image into smaller patches to train the network does not guarantee global coherence of the reconstructed image that preserves structure and topology. We use adversarial training and reinforcement learning (Policy Gradient) to endow the model with both global context and local details. We evaluate our method in several datasets in medical imaging, plant science, and remote sensing. Our experiments demonstrate that our model produces the most realistic and complete inpainted results, outperforming other approaches. In a dedicated study on plant roots we find that our approach is also comparable to human performance. Implementation available at https://github.com/Hhhhhhhhhhao/Thin-Structure-Inpainting.
Tasks
Published	2019-12-05
URL	https://arxiv.org/abs/1912.02470v1
PDF	https://arxiv.org/pdf/1912.02470v1.pdf
PWC	https://paperswithcode.com/paper/blind-inpainting-of-large-scale-masks-of-thin
Repo
Framework

Risk bounds for reservoir computing


Title	Risk bounds for reservoir computing
Authors	Lukas Gonon, Lyudmila Grigoryeva, Juan-Pablo Ortega
Abstract	We analyze the practices of reservoir computing in the framework of statistical learning theory. In particular, we derive finite sample upper bounds for the generalization error committed by specific families of reservoir computing systems when processing discrete-time inputs under various hypotheses on their dependence structure. Non-asymptotic bounds are explicitly written down in terms of the multivariate Rademacher complexities of the reservoir systems and the weak dependence structure of the signals that are being handled. This makes it possible, in particular, to determine the minimal number of observations needed in order to guarantee a prescribed estimation accuracy with high probability for a given reservoir family. At the same time, the asymptotic behavior of the devised bounds guarantees the consistency of the empirical risk minimization procedure for various hypothesis classes of reservoir functionals.
Tasks
Published	2019-10-30
URL	https://arxiv.org/abs/1910.13886v1
PDF	https://arxiv.org/pdf/1910.13886v1.pdf
PWC	https://paperswithcode.com/paper/risk-bounds-for-reservoir-computing
Repo
Framework

Data Augmentation Using Adversarial Training for Construction-Equipment Classification


Title	Data Augmentation Using Adversarial Training for Construction-Equipment Classification
Authors	Francis Baek, Somin Park, Hyoungkwan Kim
Abstract	Deep learning-based construction-site image analysis has recently made great progress with regard to accuracy and speed, but it requires a large amount of data. Acquiring a sufficient amount of labeled construction-image data is a prerequisite for deep learning-based construction-image recognition and requires considerable time and effort. In this paper, we propose a “data augmentation” scheme based on generative adversarial networks (GANs) for construction-equipment classification. The proposed method combines a GAN and additional “adversarial training” to stably perform “data augmentation” for construction equipment. The “data augmentation” was verified via binary classification experiments involving excavator images, and the average accuracy improvement was 4.094%. In the experiment, three image sizes (32×32×3, 64×64×3, and 128×128×3) and 120, 240, and 480 training samples were used to demonstrate the robustness of the proposed method. These results demonstrated that the proposed method can effectively and reliably generate construction-equipment images and train deep learning-based classifiers for construction equipment.
Tasks	Data Augmentation
Published	2019-11-27
URL	https://arxiv.org/abs/1911.11916v1
PDF	https://arxiv.org/pdf/1911.11916v1.pdf
PWC	https://paperswithcode.com/paper/data-augmentation-using-adversarial-training
Repo
Framework

Design of a Simple Orthogonal Multiwavelet Filter by Matrix Spectral Factorization


Title	Design of a Simple Orthogonal Multiwavelet Filter by Matrix Spectral Factorization
Authors	Vasil Kolev, Todor Cooklev, Fritz Keinert
Abstract	We consider the design of an orthogonal symmetric/antisymmetric multiwavelet from its matrix product filter by matrix spectral factorization (MSF). As a test problem, we construct a simple matrix product filter with desirable properties, and factor it using Bauer’s method, which in this case can be done in closed form. The corresponding orthogonal multiwavelet function is derived using algebraic techniques which allow symmetry to be considered. This leads to the known orthogonal multiwavelet SA1, which can also be derived directly. We also give a lifting scheme for SA1, investigate the influence of the number of significant digits in the calculations, and show some numerical experiments.
Tasks
Published	2019-10-16
URL	https://arxiv.org/abs/1910.07133v1
PDF	https://arxiv.org/pdf/1910.07133v1.pdf
PWC	https://paperswithcode.com/paper/design-of-a-simple-orthogonal-multiwavelet
Repo
Framework

Factor Analysis on Citation, Using a Combined Latent and Logistic Regression Model


Title	Factor Analysis on Citation, Using a Combined Latent and Logistic Regression Model
Authors	Namjoon Suh, Xiaoming Huo, Eric Heim, Lee Seversky
Abstract	We propose a combined model, which integrates the latent factor model and the logistic regression model, for the citation network. We observe that neither a latent factor model nor a logistic regression model alone is sufficient to capture the structure of the data. The proposed model has a latent (i.e., factor analysis) model to represent the main technological trends (a.k.a. factors), and adds a sparse component that captures the remaining ad-hoc dependence. Parameter estimation is carried out through the construction of a joint-likelihood function of edges and properly chosen penalty terms. The convexity of the objective function allows us to develop an efficient algorithm, while the penalty terms push towards a low-dimensional latent component and a sparse graphical structure. Simulation results show that the proposed method works well in practical situations. The proposed method has been applied to a real application, which contains a citation network of statisticians (Ji and Jin, 2016). Some interesting findings are reported.
Tasks
Published	2019-12-02
URL	https://arxiv.org/abs/1912.00524v1
PDF	https://arxiv.org/pdf/1912.00524v1.pdf
PWC	https://paperswithcode.com/paper/factor-analysis-on-citation-using-a-combined
Repo
Framework

Superpixel-Based Background Recovery from Multiple Images


Title	Superpixel-Based Background Recovery from Multiple Images
Authors	Lei Gao, Yixing Huang, Andreas Maier
Abstract	In this paper, we propose an intuitive method to recover background from multiple images. The implementation consists of three stages: model initialization, model update, and background output. We consider the pixels whose values change little in all input images as background seeds. Images are then segmented into superpixels with simple linear iterative clustering. When the number of pixels labelled as background in a superpixel is bigger than a predefined threshold, we label the superpixel as background to initialize the background candidate masks. Background candidate images are obtained from input raw images with the masks. Combining all candidate images, a background image is produced. The background candidate masks, candidate images, and the background image are then updated alternately until convergence. Finally, ghosting artifacts are removed with the k-nearest neighbour method. An experiment on an outdoor dataset demonstrates that the proposed algorithm can achieve promising results.
Tasks
Published	2019-11-04
URL	https://arxiv.org/abs/1911.01223v1
PDF	https://arxiv.org/pdf/1911.01223v1.pdf
PWC	https://paperswithcode.com/paper/superpixel-based-background-recovery-from
Repo
Framework
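
The model-initialization stage described in the abstract above (background seeds, then superpixel labelling by a seed-count threshold) can be sketched as follows. This is a simplified reading, not the authors' implementation: grayscale images are plain nested lists, "change little" is interpreted as a per-pixel max-minus-min range no larger than a tolerance `tol`, one seed mask is shared by all images, and the superpixel label map is assumed to come from a separate SLIC step that is not shown.

```python
def background_seeds(images, tol):
    """Mark pixels whose value range across all images is at most tol.

    images: list of equally sized grayscale images as nested lists [row][col].
    Returns a boolean mask with the same height and width.
    """
    height, width = len(images[0]), len(images[0][0])
    mask = []
    for r in range(height):
        row = []
        for c in range(width):
            values = [img[r][c] for img in images]
            row.append(max(values) - min(values) <= tol)
        mask.append(row)
    return mask


def background_superpixels(seeds, labels, min_seed_count):
    """Return the superpixel labels containing more than min_seed_count seeds.

    labels: integer label map of the same size as seeds, assumed to be
    produced by a superpixel segmentation such as SLIC.
    """
    counts = {}
    for seed_row, label_row in zip(seeds, labels):
        for is_seed, label in zip(seed_row, label_row):
            counts[label] = counts.get(label, 0) + (1 if is_seed else 0)
    return {label for label, count in counts.items() if count > min_seed_count}


# Two 2x2 frames: only the bottom-right pixel changes strongly.
frames = [
    [[10, 10], [10, 50]],
    [[11, 10], [10, 90]],
]
seeds = background_seeds(frames, tol=2)
# Hypothetical superpixel map: top row is superpixel 0, bottom row is 1.
superpixel_labels = [[0, 0], [1, 1]]
background = background_superpixels(seeds, superpixel_labels, min_seed_count=1)
```

Here superpixel 0 holds two seeds and passes the threshold, while superpixel 1 holds only one and is left as a foreground candidate.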

Task-Oriented Dialog Systems that Consider Multiple Appropriate Responses under the Same Context


Title	Task-Oriented Dialog Systems that Consider Multiple Appropriate Responses under the Same Context
Authors	Yichi Zhang, Zhijian Ou, Zhou Yu
Abstract	Conversations have an intrinsic one-to-many property, which means that multiple responses can be appropriate for the same dialog context. In task-oriented dialogs, this property leads to different valid dialog policies towards task completion. However, none of the existing task-oriented dialog generation approaches takes this property into account. We propose a Multi-Action Data Augmentation (MADA) framework to utilize the one-to-many property to generate diverse appropriate dialog responses. Specifically, we first use dialog states to summarize the dialog history, and then discover all possible mappings from every dialog state to its different valid system actions. During dialog system training, we enable the current dialog state to map to all valid system actions discovered in the previous process to create additional state-action pairs. By incorporating these additional pairs, the dialog policy learns a balanced action distribution, which further guides the dialog model to generate diverse responses. Experimental results show that the proposed framework consistently improves dialog policy diversity, and results in improved response diversity and appropriateness. Our model obtains state-of-the-art results on MultiWOZ.
Tasks	Data Augmentation
Published	2019-11-24
URL	https://arxiv.org/abs/1911.10484v2
PDF	https://arxiv.org/pdf/1911.10484v2.pdf
PWC	https://paperswithcode.com/paper/task-oriented-dialog-systems-that-consider
Repo
Framework
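
The core bookkeeping step in the MADA abstract above, mapping every dialog state to all system actions observed for it and then creating additional state-action pairs, can be sketched with a dictionary of sets. The string encoding of states and actions and the function names are hypothetical; the paper's actual state summarisation and training procedure are not shown.

```python
from collections import defaultdict


def build_state_action_map(turns):
    """Group observed system actions by the dialog state they followed.

    turns: iterable of (state, action) pairs, both hashable.
    """
    mapping = defaultdict(set)
    for state, action in turns:
        mapping[state].add(action)
    return mapping


def additional_pairs(turns):
    """For each turn, pair its state with every other valid action.

    Returns only the pairs that differ from the action actually observed
    in that turn, in a deterministic (sorted) order per turn.
    """
    mapping = build_state_action_map(turns)
    extra = []
    for state, observed in turns:
        for action in sorted(mapping[state]):
            if action != observed:
                extra.append((state, action))
    return extra


# Hypothetical corpus: the same state is answered in two different ways.
corpus = [
    ("restaurant|area=north", "request_food"),
    ("restaurant|area=north", "recommend"),
    ("hotel|stars=4", "inform"),
]
extra = additional_pairs(corpus)
```

Each turn in the shared state gains the action seen in the other turn, while the hotel state, with a single observed action, contributes nothing new.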

The Language of Dialogue Is Complex


Title	The Language of Dialogue Is Complex
Authors	Alexander Robertson, Luca Maria Aiello, Daniele Quercia
Abstract	Integrative Complexity (IC) is a psychometric that measures the ability of a person to recognize multiple perspectives and connect them, thus identifying paths for conflict resolution. IC has been linked to a wide variety of political, social and personal outcomes but evaluating it is a time-consuming process requiring skilled professionals to manually score texts, a fact which accounts for the limited exploration of IC at scale on social media. We combine natural language processing and machine learning to train an IC classification model that achieves state-of-the-art performance on unseen data and more closely adheres to the established structure of the IC coding process than previous automated approaches. When applied to the content of 400k+ comments from online fora about depression and knowledge exchange, our model was capable of replicating key findings of prior work, thus providing the first example of using IC tools for large-scale social media analytics.
Tasks
Published	2019-06-05
URL	https://arxiv.org/abs/1906.02057v1
PDF	https://arxiv.org/pdf/1906.02057v1.pdf
PWC	https://paperswithcode.com/paper/the-language-of-dialogue-is-complex
Repo
Framework

Open Set Authorship Attribution toward Demystifying Victorian Periodicals


Title	Open Set Authorship Attribution toward Demystifying Victorian Periodicals
Authors	Sarkhan Badirli, Mary Borgo Ton, Abdulmecit Gungor, Murat Dundar
Abstract	Existing research in computational authorship attribution (AA) has primarily focused on attribution tasks with a limited number of authors in a closed-set configuration. This restricted set-up is far from realistic in dealing with highly entangled real-world AA tasks that involve a large number of candidate authors for attribution during test time. In this paper, we study AA in historical texts using a new data set compiled from Victorian literature. We investigate the predictive capacity of the most common English words in distinguishing the writings of the most prominent Victorian novelists. We challenge the closed-set classification assumption and discuss the limitations of standard machine learning techniques in dealing with the open-set AA task. Our experiments suggest that a linear classifier can achieve near-perfect attribution accuracy under the closed-set assumption, yet the need for more robust approaches becomes evident once a large candidate pool has to be considered in the open-set classification setting.
Tasks
Published	2019-12-17
URL	https://arxiv.org/abs/1912.08259v1
PDF	https://arxiv.org/pdf/1912.08259v1.pdf
PWC	https://paperswithcode.com/paper/open-set-authorship-attribution-toward
Repo
Framework

On the Feasibility of Automated Detection of Allusive Text Reuse


Title	On the Feasibility of Automated Detection of Allusive Text Reuse
Authors	Enrique Manjavacas, Brian Long, Mike Kestemont
Abstract	The detection of allusive text reuse is particularly challenging due to the sparse evidence on which allusive references rely, commonly based on no or very few shared words. Arguably, lexical semantics can be resorted to since uncovering semantic relations between words has the potential to increase the support underlying the allusion and alleviate the lexical sparsity. A further obstacle is the lack of evaluation benchmark corpora, largely due to the highly interpretative character of the annotation process. In the present paper, we aim to elucidate the feasibility of automated allusion detection. We approach the matter from an Information Retrieval perspective in which referencing texts act as queries and referenced texts as relevant documents to be retrieved, and estimate the difficulty of benchmark corpus compilation by a novel inter-annotator agreement study on query segmentation. Furthermore, we investigate to what extent the integration of lexical semantic information derived from distributional models and ontologies can aid retrieving cases of allusive reuse. The results show that (i) despite low agreement scores, using manual queries considerably improves retrieval performance with respect to a windowing approach, and that (ii) retrieval performance can be moderately boosted with distributional semantics.
Tasks	Information Retrieval
Published	2019-05-08
URL	https://arxiv.org/abs/1905.02973v1
PDF	https://arxiv.org/pdf/1905.02973v1.pdf
PWC	https://paperswithcode.com/paper/on-the-feasibility-of-automated-detection-of
Repo
Framework

Crop Height and Plot Estimation from Unmanned Aerial Vehicles using 3D LiDAR


Title	Crop Height and Plot Estimation from Unmanned Aerial Vehicles using 3D LiDAR
Authors	Harnaik Dhami, Kevin Yu, Tianshu Xu, Qian Zhu, Kshitiz Dhakal, James Friel, Song Li, Pratap Tokekar
Abstract	We present techniques to measure crop heights using a 3D Light Detection and Ranging (LiDAR) sensor mounted on an Unmanned Aerial Vehicle (UAV). Knowing the height of plants is crucial to monitor their overall health and growth cycles, especially for high-throughput plant phenotyping. We present a methodology for extracting plant heights from 3D LiDAR point clouds, specifically focusing on plot-based phenotyping environments. We also present a toolchain that can be used to create phenotyping farms for use in Gazebo simulations. The tool creates a randomized farm with realistic 3D plant and terrain models. We conducted a series of simulations and hardware experiments in controlled and natural settings. Our algorithm was able to estimate the plant heights in a field with 112 plots with a root mean square error (RMSE) of 6.1 cm. This is the first such dataset for 3D LiDAR from an airborne robot over a wheat field. The developed simulation toolchain, algorithmic implementation, and datasets can be found on the GitHub repository located at https://github.com/hsd1121/PointCloudProcessing.
Tasks
Published	2019-10-30
URL	https://arxiv.org/abs/1910.14031v2
PDF	https://arxiv.org/pdf/1910.14031v2.pdf
PWC	https://paperswithcode.com/paper/crop-height-and-plot-estimation-from-unmanned
Repo
Framework
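
The abstract above reports plot-level plant heights from LiDAR point clouds and an RMSE of 6.1 cm against ground truth. The sketch below shows one simple way such numbers could be produced; it is an assumed stand-in, not the method from the repository. Height is taken as the gap between a high and a low percentile of the z coordinates within one plot, and the percentile levels `ground_q` and `canopy_q` are hypothetical defaults.

```python
import math


def percentile(values, q):
    """Nearest-rank percentile for an integer q in 1..100."""
    ordered = sorted(values)
    rank = max(1, (q * len(ordered) + 99) // 100)  # integer ceiling of q*n/100
    return ordered[rank - 1]


def plot_height(z_values, ground_q=5, canopy_q=95):
    """Estimate plant height in one plot from the z coordinates of its points.

    Assumption: low percentiles of z hit the soil and high percentiles hit
    the canopy top; percentiles are used instead of min/max to resist noise.
    """
    return percentile(z_values, canopy_q) - percentile(z_values, ground_q)


def rmse(estimates, truths):
    """Root mean square error between paired estimates and measurements."""
    if len(estimates) != len(truths) or not estimates:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(e - t) ** 2 for e, t in zip(estimates, truths)]
    return math.sqrt(sum(squared) / len(squared))


# Synthetic plot: half the returns from the ground, half from a 1 m canopy.
z = [0.0] * 10 + [1.0] * 10
height = plot_height(z)
error = rmse([1.0, 2.0], [1.0, 4.0])
```

Using percentiles rather than the extreme points trades a small bias for robustness to stray returns, which is a common design choice when working with airborne point clouds.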