Paper Group ANR 1235
Flexible Conditional Image Generation of Missing Data with Learned Mental Maps
Title | Flexible Conditional Image Generation of Missing Data with Learned Mental Maps |
Authors | Benjamin Hou, Athanasios Vlontzos, Amir Alansary, Daniel Rueckert, Bernhard Kainz |
Abstract | Real-world settings often do not allow acquisition of high-resolution volumetric images for accurate morphological assessment and diagnosis. In clinical practice it is common to acquire only sparse data (e.g. individual slices) for initial diagnostic decision making. Physicians therefore rely on their prior knowledge (or mental maps) of the human anatomy to extrapolate the underlying 3D information. Accurate mental maps require years of anatomy training, which in the first instance relies on normative learning, i.e. excluding pathology. In this paper, we leverage Bayesian Deep Learning and environment mapping to generate full volumetric anatomy representations from zero up to a small, sparse set of slices. We evaluate proof-of-concept implementations based on Generative Query Networks (GQN) and Conditional BRUNO using abdominal CT and brain MRI as well as in a clinical application involving sparse, motion-corrupted MR acquisition for fetal imaging. Our approach can reconstruct 3D volumes from 1 to 4 tomographic slices, with an SSIM of 0.7+ and cross-correlation of 0.8+ compared to the 3D ground truth. |
Tasks | Conditional Image Generation, Decision Making, Image Generation |
Published | 2019-08-29 |
URL | https://arxiv.org/abs/1908.11312v1 |
https://arxiv.org/pdf/1908.11312v1.pdf | |
PWC | https://paperswithcode.com/paper/flexible-conditional-image-generation-of |
Repo | |
Framework | |
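The abstract above reports a cross-correlation score against the 3D ground truth. As a rough illustration of what such a score measures, the sketch below computes a zero-mean normalized cross-correlation between two equally sized volumes flattened to lists of intensities. The exact variant used by the authors is not stated in the abstract, so the function name `normalized_cross_correlation` and the zero-mean normalization are assumptions made for this example.

```python
import math


def normalized_cross_correlation(a, b):
    """Zero-mean normalized cross-correlation of two flattened volumes.

    Returns a value in [-1, 1]; 1 means the intensities are perfectly
    linearly related. This is an illustrative metric, not necessarily
    the one used in the paper.
    """
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    # Covariance-like numerator and the product of the two energies.
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(
        sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b)
    )
    if den == 0:
        raise ValueError("a constant volume has no defined correlation")
    return num / den


# Hypothetical voxel intensities for a reconstruction and its ground truth.
reconstruction = [0.1, 0.4, 0.8, 0.9]
ground_truth = [0.0, 0.5, 0.7, 1.0]
score = normalized_cross_correlation(reconstruction, ground_truth)
```

A score near 1 indicates that the reconstruction tracks the ground-truth intensities closely up to a linear rescaling, which is why it is often reported alongside a structural metric such as SSIM.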
Business Process Variant Analysis based on Mutual Fingerprints of Event Logs
Title | Business Process Variant Analysis based on Mutual Fingerprints of Event Logs |
Authors | Farbod Taymouri, Marcello La Rosa, Josep Carmona |
Abstract | Comparing business process variants using event logs is a common use case in process mining. Existing techniques for process variant analysis detect statistically-significant differences between variants at the level of individual entities (such as process activities) and their relationships (e.g. directly-follows relations between activities). This may lead to a proliferation of differences due to the low level of granularity at which such differences are captured. This paper presents a novel approach to detect statistically-significant differences between variants at the level of entire process traces (i.e. sequences of directly-follows relations). The cornerstone of this approach is a technique to learn a directly-follows graph called mutual fingerprint from the event logs of the two variants. A mutual fingerprint is a lossless encoding of a set of traces and their duration using discrete wavelet transformation. This structure facilitates the understanding of statistical differences along the control-flow and performance dimensions. The approach has been evaluated using real-life event logs against two baselines. The results show that at a trace level, the baselines cannot always reveal the differences discovered by our approach, or can detect spurious differences. |
Tasks | |
Published | 2019-12-23 |
URL | https://arxiv.org/abs/1912.10598v2 |
https://arxiv.org/pdf/1912.10598v2.pdf | |
PWC | https://paperswithcode.com/paper/business-process-variant-analysis-based-on |
Repo | |
Framework | |
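The abstract above builds on directly-follows relations between activities. The sketch below shows only that basic building block: counting directly-follows pairs in two event logs and listing the pairs that occur in one variant but not the other. It does not implement the mutual fingerprint, the wavelet encoding, or any statistical test from the paper, and the activity names and logs are invented for illustration.

```python
from collections import Counter


def directly_follows(traces):
    """Count directly-follows pairs (a, b) over a list of traces.

    A trace is a list of activity names in execution order; the pair
    (a, b) is counted each time b occurs immediately after a.
    """
    counts = Counter()
    for trace in traces:
        counts.update(zip(trace, trace[1:]))
    return counts


# Hypothetical event logs for two process variants.
variant_a = [["register", "check", "approve"], ["register", "check", "reject"]]
variant_b = [["register", "approve"], ["register", "check", "approve"]]

dfg_a = directly_follows(variant_a)
dfg_b = directly_follows(variant_b)

# Relations that appear in variant B but never in variant A.
only_in_b = set(dfg_b) - set(dfg_a)
```

Comparing raw pair counts like this is the entity-level view that the abstract describes as prone to a proliferation of differences, which is the motivation for moving to whole traces.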
How Effectively Can Indoor Wireless Positioning Relieve Visual Tracking Pains: A Camera-Rao Bound Viewpoint
Title | How Effectively Can Indoor Wireless Positioning Relieve Visual Tracking Pains: A Camera-Rao Bound Viewpoint |
Authors | Panwen Hu, Zizheng Yan, Rui Huang, Feng Yin |
Abstract | Visual tracking is fragile in some difficult scenarios; for instance, appearance ambiguity and variation, as well as occlusion, can easily degrade most visual trackers to some extent. In this paper, visual tracking is empowered with wireless positioning to achieve high accuracy while maintaining robustness. Fundamentally different from previous works, this study does not involve any specific wireless positioning algorithms. Instead, we use the confidence region derived from the wireless positioning Cramer-Rao bound (CRB) as the search region of visual trackers. The proposed framework is low-cost and very simple to implement, yet readily leads to enhanced and more robust visual tracking performance in difficult scenarios, as corroborated by our experimental results. Most importantly, it is highly valuable for practitioners to pre-evaluate how effectively the wireless resources available at hand can alleviate the visual tracking pains. |
Tasks | Visual Tracking |
Published | 2019-03-09 |
URL | http://arxiv.org/abs/1903.03736v1 |
http://arxiv.org/pdf/1903.03736v1.pdf | |
PWC | https://paperswithcode.com/paper/how-effectively-can-indoor-wireless |
Repo | |
Framework | |
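The abstract above uses a confidence region derived from the positioning CRB as the tracker's search region. The sketch below works through one textbook case under stated assumptions: 2D positioning from range measurements to known anchors with independent Gaussian noise of standard deviation `sigma`, and a Gaussian estimator that attains the bound. The measurement model, the function names `range_crb` and `search_half_axes`, and the anchor layout are assumptions for this example; the paper's own model may differ.

```python
import math


def range_crb(position, anchors, sigma):
    """2x2 CRB matrix for a 2D position from noisy range measurements.

    The Fisher information is sum(u_i u_i^T) / sigma^2, where u_i is the
    unit vector from anchor i to the position; the CRB is its inverse.
    Assumes the position does not coincide with an anchor and that the
    anchors are not all collinear with it (otherwise det is zero).
    """
    jxx = jxy = jyy = 0.0
    for ax, ay in anchors:
        dx, dy = position[0] - ax, position[1] - ay
        d = math.hypot(dx, dy)
        ux, uy = dx / d, dy / d
        jxx += ux * ux / sigma ** 2
        jxy += ux * uy / sigma ** 2
        jyy += uy * uy / sigma ** 2
    det = jxx * jyy - jxy * jxy
    # Closed-form inverse of the symmetric 2x2 information matrix.
    return [[jyy / det, -jxy / det], [-jxy / det, jxx / det]]


def search_half_axes(crb, confidence=0.95):
    """Half-axis lengths (major, minor) of the confidence ellipse.

    For a 2D Gaussian, the squared Mahalanobis distance is chi-square
    with 2 degrees of freedom, so the scale is kappa = -2 ln(1 - P).
    """
    kappa = -2.0 * math.log(1.0 - confidence)
    a, b, c = crb[0][0], crb[0][1], crb[1][1]
    mean = (a + c) / 2
    spread = math.sqrt(((a - c) / 2) ** 2 + b * b)
    # Eigenvalues of the symmetric 2x2 matrix are mean +/- spread.
    major = math.sqrt(kappa * (mean + spread))
    minor = math.sqrt(kappa * max(0.0, mean - spread))
    return major, minor


# Hypothetical setup: target at the origin, two anchors, 1 m range noise.
crb = range_crb((0.0, 0.0), [(1.0, 0.0), (0.0, 1.0)], sigma=1.0)
major, minor = search_half_axes(crb, confidence=0.95)
```

A tracker could then restrict its search window to the bounding box of this ellipse projected into the image, which is one plausible reading of how a CRB-derived region limits the search.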
SoftGAN: Learning generative models efficiently with application to CycleGAN Voice Conversion
Title | SoftGAN: Learning generative models efficiently with application to CycleGAN Voice Conversion |
Authors | Rafael Ferro, Nicolas Obin, Axel Roebel |
Abstract | Voice conversion with deep neural networks has become extremely popular over the last few years, with improvements over past VC architectures. In particular, GAN architectures such as the cycleGAN and the VAEGAN have offered the possibility to learn voice conversion from non-parallel databases. However, GAN-based methods are highly unstable, often requiring careful tuning of hyper-parameters, and can lead to poor voice identity conversion and substantially degraded converted speech signals. This paper discusses and tackles the stability issues of the GAN in the context of voice conversion. The proposed SoftGAN method aims at reducing the impact of the generator on the discriminator and vice versa during training, so that both can learn more gradually and efficiently, in particular avoiding training that is not in tandem. A subjective experiment conducted on a voice conversion task on the voice conversion challenge 2018 dataset shows that the proposed SoftGAN significantly improves the quality of the voice conversion while preserving the naturalness of the converted speech. |
Tasks | Voice Conversion |
Published | 2019-10-22 |
URL | https://arxiv.org/abs/1910.12614v1 |
https://arxiv.org/pdf/1910.12614v1.pdf | |
PWC | https://paperswithcode.com/paper/softgan-learning-generative-models |
Repo | |
Framework | |
Blind Inpainting of Large-scale Masks of Thin Structures with Adversarial and Reinforcement Learning
Title | Blind Inpainting of Large-scale Masks of Thin Structures with Adversarial and Reinforcement Learning |
Authors | Hao Chen, Mario Valerio Giuffrida, Peter Doerner, Sotirios A. Tsaftaris |
Abstract | Several imaging applications (vessels, retina, plant roots, road networks from satellites) require the accurate segmentation of thin structures for subsequent analysis. Discontinuities (gaps) in the extracted foreground may hinder down-stream image-based analysis of biomarkers, organ structure and topology. In this paper, we propose a general post-processing technique to recover such gaps in large-scale segmentation masks. We cast this problem as a blind inpainting task, where the regions of missing lines in the segmentation masks are not known to the algorithm, which we solve with an adversarially trained neural network. One challenge of using large images is the memory capacity of current GPUs. The typical approach of dividing a large image into smaller patches to train the network does not guarantee global coherence of the reconstructed image that preserves structure and topology. We use adversarial training and reinforcement learning (Policy Gradient) to endow the model with both global context and local details. We evaluate our method on several datasets in medical imaging, plant science, and remote sensing. Our experiments demonstrate that our model produces the most realistic and complete inpainted results, outperforming other approaches. In a dedicated study on plant roots we find that our approach is also comparable to human performance. Implementation available at https://github.com/Hhhhhhhhhhao/Thin-Structure-Inpainting. |
Tasks | |
Published | 2019-12-05 |
URL | https://arxiv.org/abs/1912.02470v1 |
https://arxiv.org/pdf/1912.02470v1.pdf | |
PWC | https://paperswithcode.com/paper/blind-inpainting-of-large-scale-masks-of-thin |
Repo | |
Framework | |
Risk bounds for reservoir computing
Title | Risk bounds for reservoir computing |
Authors | Lukas Gonon, Lyudmila Grigoryeva, Juan-Pablo Ortega |
Abstract | We analyze the practices of reservoir computing in the framework of statistical learning theory. In particular, we derive finite sample upper bounds for the generalization error committed by specific families of reservoir computing systems when processing discrete-time inputs under various hypotheses on their dependence structure. Non-asymptotic bounds are explicitly written down in terms of the multivariate Rademacher complexities of the reservoir systems and the weak dependence structure of the signals that are being handled. This makes it possible, in particular, to determine the minimal number of observations needed in order to guarantee a prescribed estimation accuracy with high probability for a given reservoir family. At the same time, the asymptotic behavior of the devised bounds guarantees the consistency of the empirical risk minimization procedure for various hypothesis classes of reservoir functionals. |
Tasks | |
Published | 2019-10-30 |
URL | https://arxiv.org/abs/1910.13886v1 |
https://arxiv.org/pdf/1910.13886v1.pdf | |
PWC | https://paperswithcode.com/paper/risk-bounds-for-reservoir-computing |
Repo | |
Framework | |
Data Augmentation Using Adversarial Training for Construction-Equipment Classification
Title | Data Augmentation Using Adversarial Training for Construction-Equipment Classification |
Authors | Francis Baek, Somin Park, Hyoungkwan Kim |
Abstract | Deep learning-based construction-site image analysis has recently made great progress with regard to accuracy and speed, but it requires a large amount of data. Acquiring a sufficient amount of labeled construction-image data is a prerequisite for deep learning-based construction-image recognition and requires considerable time and effort. In this paper, we propose a “data augmentation” scheme based on generative adversarial networks (GANs) for construction-equipment classification. The proposed method combines a GAN and additional “adversarial training” to stably perform “data augmentation” for construction equipment. The “data augmentation” was verified via binary classification experiments involving excavator images, and the average accuracy improvement was 4.094%. In the experiment, three image sizes (32×32×3, 64×64×3, and 128×128×3) and 120, 240, and 480 training samples were used to demonstrate the robustness of the proposed method. These results demonstrated that the proposed method can effectively and reliably generate construction-equipment images and train deep learning-based classifiers for construction equipment. |
Tasks | Data Augmentation |
Published | 2019-11-27 |
URL | https://arxiv.org/abs/1911.11916v1 |
https://arxiv.org/pdf/1911.11916v1.pdf | |
PWC | https://paperswithcode.com/paper/data-augmentation-using-adversarial-training |
Repo | |
Framework | |
Design of a Simple Orthogonal Multiwavelet Filter by Matrix Spectral Factorization
Title | Design of a Simple Orthogonal Multiwavelet Filter by Matrix Spectral Factorization |
Authors | Vasil Kolev, Todor Cooklev, Fritz Keinert |
Abstract | We consider the design of an orthogonal symmetric/antisymmetric multiwavelet from its matrix product filter by matrix spectral factorization (MSF). As a test problem, we construct a simple matrix product filter with desirable properties, and factor it using Bauer’s method, which in this case can be done in closed form. The corresponding orthogonal multiwavelet function is derived using algebraic techniques which allow symmetry to be considered. This leads to the known orthogonal multiwavelet SA1, which can also be derived directly. We also give a lifting scheme for SA1, investigate the influence of the number of significant digits in the calculations, and show some numerical experiments. |
Tasks | |
Published | 2019-10-16 |
URL | https://arxiv.org/abs/1910.07133v1 |
https://arxiv.org/pdf/1910.07133v1.pdf | |
PWC | https://paperswithcode.com/paper/design-of-a-simple-orthogonal-multiwavelet |
Repo | |
Framework | |
Factor Analysis on Citation, Using a Combined Latent and Logistic Regression Model
Title | Factor Analysis on Citation, Using a Combined Latent and Logistic Regression Model |
Authors | Namjoon Suh, Xiaoming Huo, Eric Heim, Lee Seversky |
Abstract | We propose a combined model, which integrates the latent factor model and the logistic regression model, for the citation network. It is noticed that neither a latent factor model nor a logistic regression model alone is sufficient to capture the structure of the data. The proposed model has a latent (i.e., factor analysis) component to represent the main technological trends (a.k.a. factors), and adds a sparse component that captures the remaining ad-hoc dependence. Parameter estimation is carried out through the construction of a joint-likelihood function of edges and properly chosen penalty terms. The convexity of the objective function allows us to develop an efficient algorithm, while the penalty terms push towards a low-dimensional latent component and a sparse graphical structure. Simulation results show that the proposed method works well in practical situations. The proposed method has been applied to a real application, which contains a citation network of statisticians (Ji and Jin, 2016). Some interesting findings are reported. |
Tasks | |
Published | 2019-12-02 |
URL | https://arxiv.org/abs/1912.00524v1 |
https://arxiv.org/pdf/1912.00524v1.pdf | |
PWC | https://paperswithcode.com/paper/factor-analysis-on-citation-using-a-combined |
Repo | |
Framework | |
Superpixel-Based Background Recovery from Multiple Images
Title | Superpixel-Based Background Recovery from Multiple Images |
Authors | Lei Gao, Yixing Huang, Andreas Maier |
Abstract | In this paper, we propose an intuitive method to recover background from multiple images. The implementation consists of three stages: model initialization, model update, and background output. We consider the pixels whose values change little in all input images as background seeds. Images are then segmented into superpixels with simple linear iterative clustering. When the number of pixels labelled as background in a superpixel is bigger than a predefined threshold, we label the superpixel as background to initialize the background candidate masks. Background candidate images are obtained from input raw images with the masks. Combining all candidate images, a background image is produced. The background candidate masks, candidate images, and the background image are then updated alternately until convergence. Finally, ghosting artifacts are removed with the k-nearest neighbour method. An experiment on an outdoor dataset demonstrates that the proposed algorithm can achieve promising results. |
Tasks | |
Published | 2019-11-04 |
URL | https://arxiv.org/abs/1911.01223v1 |
https://arxiv.org/pdf/1911.01223v1.pdf | |
PWC | https://paperswithcode.com/paper/superpixel-based-background-recovery-from |
Repo | |
Framework | |
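The initialization stage described in the abstract above (background seeds, then superpixel labelling by a seed-count threshold) can be sketched as follows. This is a minimal illustration only: images are small grayscale grids stored as nested lists, the superpixel label map is assumed to be precomputed (the paper uses simple linear iterative clustering, which is not reimplemented here), and the names `background_seeds`, `background_superpixels`, `tol`, and `min_seed_count` are hypothetical.

```python
def background_seeds(images, tol):
    """Mark pixels whose value changes by at most `tol` across all images."""
    height, width = len(images[0]), len(images[0][0])
    return [
        [
            max(img[r][c] for img in images) - min(img[r][c] for img in images)
            <= tol
            for c in range(width)
        ]
        for r in range(height)
    ]


def background_superpixels(seeds, labels, min_seed_count):
    """Return the superpixel labels with more than `min_seed_count` seeds."""
    counts = {}
    for seed_row, label_row in zip(seeds, labels):
        for is_seed, label in zip(seed_row, label_row):
            counts[label] = counts.get(label, 0) + int(is_seed)
    return {label for label, n in counts.items() if n > min_seed_count}


# Three hypothetical 2x2 grayscale images of the same scene.
images = [
    [[10, 10], [200, 50]],
    [[11, 12], [90, 52]],
    [[10, 11], [10, 51]],
]
# Hypothetical superpixel map: top row is superpixel 0, bottom row is 1.
labels = [[0, 0], [1, 1]]

seeds = background_seeds(images, tol=3)
background = background_superpixels(seeds, labels, min_seed_count=1)
```

The alternating update of masks, candidate images, and the background image, as well as the k-nearest-neighbour ghost removal, are later stages that this sketch does not cover.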
Task-Oriented Dialog Systems that Consider Multiple Appropriate Responses under the Same Context
Title | Task-Oriented Dialog Systems that Consider Multiple Appropriate Responses under the Same Context |
Authors | Yichi Zhang, Zhijian Ou, Zhou Yu |
Abstract | Conversations have an intrinsic one-to-many property, which means that multiple responses can be appropriate for the same dialog context. In task-oriented dialogs, this property leads to different valid dialog policies towards task completion. However, none of the existing task-oriented dialog generation approaches takes this property into account. We propose a Multi-Action Data Augmentation (MADA) framework to utilize the one-to-many property to generate diverse appropriate dialog responses. Specifically, we first use dialog states to summarize the dialog history, and then discover all possible mappings from every dialog state to its different valid system actions. During dialog system training, we enable the current dialog state to map to all valid system actions discovered in the previous process to create additional state-action pairs. By incorporating these additional pairs, the dialog policy learns a balanced action distribution, which further guides the dialog model to generate diverse responses. Experimental results show that the proposed framework consistently improves dialog policy diversity, and results in improved response diversity and appropriateness. Our model obtains state-of-the-art results on MultiWOZ. |
Tasks | Data Augmentation |
Published | 2019-11-24 |
URL | https://arxiv.org/abs/1911.10484v2 |
https://arxiv.org/pdf/1911.10484v2.pdf | |
PWC | https://paperswithcode.com/paper/task-oriented-dialog-systems-that-consider |
Repo | |
Framework | |
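The state-to-action mapping step in the abstract above can be illustrated with a few lines of Python. The sketch groups all system actions observed under the same dialog state and then emits one training pair per valid action for every turn. The string encoding of states and actions and the function names are hypothetical; the actual MADA framework operates on learned dialog models and is not reproduced here.

```python
from collections import defaultdict


def build_state_action_map(turns):
    """Collect every system action observed for each dialog state."""
    mapping = defaultdict(set)
    for state, action in turns:
        mapping[state].add(action)
    return mapping


def augment_pairs(turns):
    """Pair each turn's state with all valid actions for that state."""
    mapping = build_state_action_map(turns)
    return [
        (state, alternative)
        for state, _ in turns
        for alternative in sorted(mapping[state])
    ]


# Hypothetical (dialog state, system action) pairs from a training corpus.
turns = [
    ("hotel:area=north", "request_price"),
    ("hotel:area=north", "recommend"),
    ("taxi:dest=?", "request_dest"),
]

state_actions = build_state_action_map(turns)
augmented = augment_pairs(turns)
```

In this toy corpus the two hotel turns share a state, so each of them gains the other's action as an additional valid target, while the taxi turn is unchanged.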
The Language of Dialogue Is Complex
Title | The Language of Dialogue Is Complex |
Authors | Alexander Robertson, Luca Maria Aiello, Daniele Quercia |
Abstract | Integrative Complexity (IC) is a psychometric that measures the ability of a person to recognize multiple perspectives and connect them, thus identifying paths for conflict resolution. IC has been linked to a wide variety of political, social and personal outcomes but evaluating it is a time-consuming process requiring skilled professionals to manually score texts, a fact which accounts for the limited exploration of IC at scale on social media. We combine natural language processing and machine learning to train an IC classification model that achieves state-of-the-art performance on unseen data and more closely adheres to the established structure of the IC coding process than previous automated approaches. When applied to the content of 400k+ comments from online fora about depression and knowledge exchange, our model was capable of replicating key findings of prior work, thus providing the first example of using IC tools for large-scale social media analytics. |
Tasks | |
Published | 2019-06-05 |
URL | https://arxiv.org/abs/1906.02057v1 |
https://arxiv.org/pdf/1906.02057v1.pdf | |
PWC | https://paperswithcode.com/paper/the-language-of-dialogue-is-complex |
Repo | |
Framework | |
Open Set Authorship Attribution toward Demystifying Victorian Periodicals
Title | Open Set Authorship Attribution toward Demystifying Victorian Periodicals |
Authors | Sarkhan Badirli, Mary Borgo Ton, Abdulmecit Gungor, Murat Dundar |
Abstract | Existing research in computational authorship attribution (AA) has primarily focused on attribution tasks with a limited number of authors in a closed-set configuration. This restricted set-up is far from realistic when dealing with highly entangled real-world AA tasks that involve a large number of candidate authors for attribution at test time. In this paper, we study AA in historical texts using a new data set compiled from Victorian literature. We investigate the predictive capacity of the most common English words in distinguishing the writings of the most prominent Victorian novelists. We challenge the closed-set classification assumption and discuss the limitations of standard machine learning techniques in dealing with the open-set AA task. Our experiments suggest that a linear classifier can achieve near-perfect attribution accuracy under the closed-set assumption, yet the need for more robust approaches becomes evident once a large candidate pool has to be considered in the open-set classification setting. |
Tasks | |
Published | 2019-12-17 |
URL | https://arxiv.org/abs/1912.08259v1 |
https://arxiv.org/pdf/1912.08259v1.pdf | |
PWC | https://paperswithcode.com/paper/open-set-authorship-attribution-toward |
Repo | |
Framework | |
On the Feasibility of Automated Detection of Allusive Text Reuse
Title | On the Feasibility of Automated Detection of Allusive Text Reuse |
Authors | Enrique Manjavacas, Brian Long, Mike Kestemont |
Abstract | The detection of allusive text reuse is particularly challenging due to the sparse evidence on which allusive references rely, commonly based on few or no shared words. Arguably, lexical semantics can be resorted to since uncovering semantic relations between words has the potential to increase the support underlying the allusion and alleviate the lexical sparsity. A further obstacle is the lack of evaluation benchmark corpora, largely due to the highly interpretative character of the annotation process. In the present paper, we aim to elucidate the feasibility of automated allusion detection. We approach the matter from an Information Retrieval perspective in which referencing texts act as queries and referenced texts as relevant documents to be retrieved, and estimate the difficulty of benchmark corpus compilation by a novel inter-annotator agreement study on query segmentation. Furthermore, we investigate to what extent the integration of lexical semantic information derived from distributional models and ontologies can aid in retrieving cases of allusive reuse. The results show that (i) despite low agreement scores, using manual queries considerably improves retrieval performance with respect to a windowing approach, and that (ii) retrieval performance can be moderately boosted with distributional semantics. |
Tasks | Information Retrieval |
Published | 2019-05-08 |
URL | https://arxiv.org/abs/1905.02973v1 |
https://arxiv.org/pdf/1905.02973v1.pdf | |
PWC | https://paperswithcode.com/paper/on-the-feasibility-of-automated-detection-of |
Repo | |
Framework | |
Crop Height and Plot Estimation from Unmanned Aerial Vehicles using 3D LiDAR
Title | Crop Height and Plot Estimation from Unmanned Aerial Vehicles using 3D LiDAR |
Authors | Harnaik Dhami, Kevin Yu, Tianshu Xu, Qian Zhu, Kshitiz Dhakal, James Friel, Song Li, Pratap Tokekar |
Abstract | We present techniques to measure crop heights using a 3D Light Detection and Ranging (LiDAR) sensor mounted on an Unmanned Aerial Vehicle (UAV). Knowing the height of plants is crucial to monitor their overall health and growth cycles, especially for high-throughput plant phenotyping. We present a methodology for extracting plant heights from 3D LiDAR point clouds, specifically focusing on plot-based phenotyping environments. We also present a toolchain that can be used to create phenotyping farms for use in Gazebo simulations. The tool creates a randomized farm with realistic 3D plant and terrain models. We conducted a series of simulations and hardware experiments in controlled and natural settings. Our algorithm was able to estimate the plant heights in a field with 112 plots with a root mean square error (RMSE) of 6.1 cm. This is the first such dataset for 3D LiDAR from an airborne robot over a wheat field. The developed simulation toolchain, algorithmic implementation, and datasets can be found on the GitHub repository located at https://github.com/hsd1121/PointCloudProcessing. |
Tasks | |
Published | 2019-10-30 |
URL | https://arxiv.org/abs/1910.14031v2 |
https://arxiv.org/pdf/1910.14031v2.pdf | |
PWC | https://paperswithcode.com/paper/crop-height-and-plot-estimation-from-unmanned |
Repo | |
Framework | |
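The abstract above reports plot-level height estimates and an RMSE against reference measurements. As a rough sketch of that kind of evaluation, the code below estimates a plot's height as the difference between a high and a low percentile of the LiDAR z-coordinates and scores the estimates with RMSE. The percentile rule, the default values `ground_q=5` and `canopy_q=95`, and all function names are assumptions for illustration; the authors' actual algorithm lives in their repository and may work quite differently.

```python
import math


def percentile(values, q):
    """Nearest-rank percentile of `values` for q in (0, 100]."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]


def plot_height(z_values, ground_q=5, canopy_q=95):
    """Height of one plot: canopy percentile minus ground percentile.

    Using percentiles instead of min/max is an assumption made here to
    reduce sensitivity to stray returns.
    """
    return percentile(z_values, canopy_q) - percentile(z_values, ground_q)


def rmse(estimates, references):
    """Root mean square error between two equal-length sequences."""
    squared = [(e - r) ** 2 for e, r in zip(estimates, references)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical z-coordinates (metres) of points falling inside one plot:
# half of the returns hit the ground, half hit the canopy.
plot_points_z = [0.0] * 10 + [0.9] * 10
estimated = plot_height(plot_points_z)

# Hypothetical estimated vs. manually measured heights for two plots.
error = rmse([estimated, 1.10], [0.95, 1.00])
```

Segmenting the point cloud into plots and separating ground from crop returns are the harder parts of the pipeline and are outside the scope of this sketch.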