Paper Group ANR 652
Indiscapes: Instance Segmentation Networks for Layout Parsing of Historical Indic Manuscripts. Distributional Reinforcement Learning for Energy-Based Sequential Models. GAN-powered Deep Distributional Reinforcement Learning for Resource Management in Network Slicing. Improved Path-length Regret Bounds for Bandits. Recognition of Images of Korean Characters Using Embedded Networks. Shape Detection In 2D Ultrasound Images. Table-to-Text Natural Language Generation with Unseen Schemas. Autoencoding with XCSF. Safe Feature Elimination for Non-Negativity Constrained Convex Optimization. Statistics and Samples in Distributional Reinforcement Learning. DeceptionNet: Network-Driven Domain Randomization. Spoken Conversational Search for General Knowledge. Convergence Analysis of Inexact Randomized Iterative Methods. What Else Can Fool Deep Learning? Addressing Color Constancy Errors on Deep Neural Network Performance. Latent Code and Text-based Generative Adversarial Networks for Soft-text Generation
Indiscapes: Instance Segmentation Networks for Layout Parsing of Historical Indic Manuscripts
Title | Indiscapes: Instance Segmentation Networks for Layout Parsing of Historical Indic Manuscripts |
Authors | Abhishek Prusty, Sowmya Aitha, Abhishek Trivedi, Ravi Kiran Sarvadevabhatla |
Abstract | Historical palm-leaf manuscripts and early paper documents from the Indian subcontinent form an important part of the world’s literary and cultural heritage. Despite their importance, large-scale annotated Indic manuscript image datasets do not exist. To address this deficiency, we introduce Indiscapes, the first ever dataset with multi-regional layout annotations for historical Indic manuscripts. To address the challenge of large diversity in scripts and the presence of dense, irregular layout elements (e.g. text lines, pictures, multiple documents per image), we adapt a Fully Convolutional Deep Neural Network architecture for fully automatic, instance-level spatial layout parsing of manuscript images. We demonstrate the effectiveness of the proposed architecture on images from the Indiscapes dataset. For annotation flexibility, and keeping the non-technical background of domain experts in mind, we also contribute a custom, web-based GUI annotation tool and a dashboard-style analytics portal. Overall, our contributions set the stage for enabling downstream applications such as OCR and word-spotting in historical Indic manuscripts at scale. |
Tasks | Instance Segmentation, Optical Character Recognition, Semantic Segmentation |
Published | 2019-12-15 |
URL | https://arxiv.org/abs/1912.07025v1 |
https://arxiv.org/pdf/1912.07025v1.pdf | |
PWC | https://paperswithcode.com/paper/indiscapes-instance-segmentation-networks-for |
Repo | |
Framework | |
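A hedged illustration of what instance-level layout parsing of a manuscript page can look like in practice: the sketch below runs an off-the-shelf torchvision Mask R-CNN over a page image with a small, hypothetical set of region classes. The class list, checkpoint, and score threshold are assumptions for demonstration, not the authors' configuration.

```python
# Illustrative only: instance-level layout parsing with a generic Mask R-CNN.
# REGION_CLASSES and the 0.5 score threshold are assumptions, not the paper's setup.
import torch
import torchvision
from torchvision.transforms.functional import to_tensor
from PIL import Image

REGION_CLASSES = ["background", "text-line", "picture", "document-boundary"]

# A detector with one output head per region class; in practice it would be
# fine-tuned on the layout annotations described in the abstract.
model = torchvision.models.detection.maskrcnn_resnet50_fpn(num_classes=len(REGION_CLASSES))
model.eval()

image = to_tensor(Image.open("manuscript_page.jpg").convert("RGB"))
with torch.no_grad():
    pred = model([image])[0]          # dict with boxes, labels, scores, masks

for label, score, mask in zip(pred["labels"], pred["scores"], pred["masks"]):
    if score > 0.5:
        print(REGION_CLASSES[label], float(score), tuple(mask.shape))
```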
Distributional Reinforcement Learning for Energy-Based Sequential Models
Title | Distributional Reinforcement Learning for Energy-Based Sequential Models |
Authors | Tetiana Parshakova, Jean-Marc Andreoli, Marc Dymetman |
Abstract | Global Autoregressive Models (GAMs) are a recent proposal [Parshakova et al., CoNLL 2019] for exploiting global properties of sequences for data-efficient learning of seq2seq models. In the first phase of training, an Energy-Based model (EBM) over sequences is derived. This EBM has high representational power, but is unnormalized and cannot be directly exploited for sampling. To address this issue, [Parshakova et al., CoNLL 2019] propose a distillation technique, which can only be applied under limited conditions. By relating this problem to Policy Gradient techniques in RL, but from a distributional rather than an optimization perspective, we propose a general approach applicable to any sequential EBM. Its effectiveness is illustrated on GAM-based experiments. |
Tasks | Distributional Reinforcement Learning |
Published | 2019-12-18 |
URL | https://arxiv.org/abs/1912.08517v1 |
https://arxiv.org/pdf/1912.08517v1.pdf | |
PWC | https://paperswithcode.com/paper/distributional-reinforcement-learning-for-1 |
Repo | |
Framework | |
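To make the "distributional rather than optimization" idea concrete, here is a minimal sketch of one way an autoregressive policy can be pushed toward an unnormalized sequential EBM: sample from the policy, reweight the samples by self-normalized importance weights proportional to P(x)/pi_theta(x), and maximize the weighted log-likelihood. The policy API (`sample`, `log_prob`) and the single-batch update are assumptions; the paper's actual estimator and training schedule may differ.

```python
# Hedged sketch: move an autoregressive policy pi_theta toward the distribution
# p(x) = P(x)/Z defined by an unnormalized EBM score log P(x), using samples
# from pi_theta itself and self-normalized importance weights.
import torch

def distributional_update(policy, ebm_log_score, optimizer, num_samples=16):
    sequences, log_probs = policy.sample(num_samples)      # assumed policy API
    with torch.no_grad():
        log_w = ebm_log_score(sequences) - log_probs        # log P(x) - log pi_theta(x)
        w = torch.softmax(log_w, dim=0)                     # self-normalized weights
    loss = -(w * policy.log_prob(sequences)).sum()          # weighted negative log-likelihood
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```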
GAN-powered Deep Distributional Reinforcement Learning for Resource Management in Network Slicing
Title | GAN-powered Deep Distributional Reinforcement Learning for Resource Management in Network Slicing |
Authors | Yuxiu Hua, Rongpeng Li, Zhifeng Zhao, Xianfu Chen, Honggang Zhang |
Abstract | Network slicing is a key technology in 5G communication systems. Its purpose is to dynamically and efficiently allocate resources for diversified services with distinct requirements over a common underlying physical infrastructure. Therein, demand-aware resource allocation is of significant importance to network slicing. In this paper, we consider a scenario that contains several slices in a radio access network with base stations that share the same physical resources (e.g., bandwidth or slots). We leverage deep reinforcement learning (DRL) to solve this problem by considering the varying service demands as the environment state and the allocated resources as the environment action. In order to reduce the effects of the randomness and noise embedded in the received service level agreement (SLA) satisfaction ratio (SSR) and spectrum efficiency (SE), we first propose a generative adversarial network-powered deep distributional Q network (GAN-DDQN) to learn the action-value distribution by minimizing the discrepancy between the estimated action-value distribution and the target action-value distribution. We put forward a reward-clipping mechanism to stabilize GAN-DDQN training against the effects of widely-spanning utility values. Moreover, we further develop Dueling GAN-DDQN, which uses a specially designed dueling generator, to learn the action-value distribution by estimating the state-value distribution and the action advantage function. Finally, we verify the performance of the proposed GAN-DDQN and Dueling GAN-DDQN algorithms through extensive simulations. |
Tasks | Distributional Reinforcement Learning |
Published | 2019-05-10 |
URL | https://arxiv.org/abs/1905.03929v3 |
https://arxiv.org/pdf/1905.03929v3.pdf | |
PWC | https://paperswithcode.com/paper/gan-based-deep-distributional-reinforcement |
Repo | |
Framework | |
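The reward-clipping mechanism mentioned in the abstract can be pictured as a simple squashing of the slicing utility before it enters the distributional Q-learning target. The weights and clipping bounds below are illustrative assumptions, not the values used in the paper.

```python
# Toy version of reward clipping for network slicing: combine the SLA
# satisfaction ratio (SSR) and spectrum efficiency (SE) into one utility and
# clip it so widely-spanning values do not destabilize training.
import numpy as np

def clipped_reward(ssr, se, w_ssr=0.5, w_se=0.5, lo=-1.0, hi=1.0):
    """ssr in [0, 1]; se in bit/s/Hz (normalized upstream in a real system)."""
    utility = w_ssr * ssr + w_se * se
    return float(np.clip(utility, lo, hi))
```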
Improved Path-length Regret Bounds for Bandits
Title | Improved Path-length Regret Bounds for Bandits |
Authors | Sébastien Bubeck, Yuanzhi Li, Haipeng Luo, Chen-Yu Wei |
Abstract | We study adaptive regret bounds in terms of the variation of the losses (the so-called path-length bounds) for both the multi-armed bandit and, more generally, the linear bandit setting. We first show that the seemingly suboptimal path-length bound of (Wei and Luo, 2018) is in fact not improvable for an adaptive adversary. Despite this negative result, we then develop two new algorithms, one that strictly improves over (Wei and Luo, 2018) with a smaller path-length measure, and the other which improves over (Wei and Luo, 2018) for an oblivious adversary when the path-length is large. Our algorithms are based on the well-studied optimistic mirror descent framework, but importantly with several novel techniques, including new optimistic predictions, a slight bias towards recently selected arms, and the use of a hybrid regularizer similar to that of (Bubeck et al., 2018). Furthermore, we extend our results to the linear bandit setting by showing a reduction to obtaining dynamic regret for a full-information problem, followed by a further reduction to convex body chasing. We propose a simple greedy chasing algorithm for the squared 2-norm, leading to new dynamic regret results and, as a consequence, the first path-length regret bounds for general linear bandits. |
Tasks | |
Published | 2019-01-29 |
URL | https://arxiv.org/abs/1901.10604v2 |
https://arxiv.org/pdf/1901.10604v2.pdf | |
PWC | https://paperswithcode.com/paper/improved-path-length-regret-bounds-for |
Repo | |
Framework | |
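For readers unfamiliar with the algorithmic backbone, the optimistic mirror descent template the abstract refers to has the following generic form; the paper's contribution lies in the specific optimistic predictions m_t, the bias toward recently played arms, and the hybrid regularizer, none of which are reproduced here.

```latex
% Generic optimistic mirror descent step over the simplex (illustration only).
\begin{aligned}
x_t  &= \operatorname*{arg\,min}_{x \in \Delta_K} \; \eta \,\langle x, m_t \rangle + D_{\psi}(x, x'_{t-1}), \\
x'_t &= \operatorname*{arg\,min}_{x \in \Delta_K} \; \eta \,\langle x, \hat{\ell}_t \rangle + D_{\psi}(x, x'_{t-1}),
\end{aligned}
\qquad
\text{targeting regret that scales with the path length } \sum_{t \ge 2} \lVert \ell_t - \ell_{t-1} \rVert .
```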
Recognition of Images of Korean Characters Using Embedded Networks
Title | Recognition of Images of Korean Characters Using Embedded Networks |
Authors | Sergey A. Ilyuhin, Alexander V. Sheshkus, Vladimir L. Arlazarov |
Abstract | Despite the significant success in the field of text recognition, complex and unsolved problems remain. In recent years, the recognition accuracy for the English language has greatly increased, while the problem of recognizing hieroglyphs has received much less attention. Hieroglyph recognition, or image recognition of Korean, Japanese or Chinese characters, differs from the traditional text recognition task. This article discusses the main differences between hieroglyph languages and the Latin alphabet in the context of image recognition. A lightweight method for recognizing images of hieroglyphs is proposed and tested on a public dataset of Korean hieroglyph images. Unlike existing solutions, the proposed method is suitable for mobile devices. Its recognition accuracy is better than the accuracy of the open-source OCR framework. The presented method of training the embedding network is based on similarities in the recognition data. |
Tasks | Optical Character Recognition |
Published | 2019-11-11 |
URL | https://arxiv.org/abs/1911.04241v2 |
https://arxiv.org/pdf/1911.04241v2.pdf | |
PWC | https://paperswithcode.com/paper/recognition-of-images-of-korean-characters |
Repo | |
Framework | |
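As a rough sketch of an embedding-based recognizer of the kind the abstract describes, a small CNN can map a character image to a normalized vector, with classification by nearest class prototype in the embedding space. The architecture and prototype construction below are generic illustrations, not the paper's training procedure.

```python
# Minimal embedding-based character recognition sketch (illustrative only).
import torch
import torch.nn as nn
import torch.nn.functional as F

class EmbeddingNet(nn.Module):
    def __init__(self, dim=64):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
        )
        self.proj = nn.Linear(32, dim)

    def forward(self, x):                        # x: (batch, 1, H, W)
        z = self.features(x).flatten(1)
        return F.normalize(self.proj(z), dim=1)  # unit-norm embeddings

def predict(embedder, image, prototypes):        # prototypes: (num_classes, dim)
    z = embedder(image.unsqueeze(0))             # (1, dim)
    return torch.cdist(z, prototypes).argmin(dim=1).item()
```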
Shape Detection In 2D Ultrasound Images
Title | Shape Detection In 2D Ultrasound Images |
Authors | Ruturaj Gole, Haixia Wu, Subho Ghose |
Abstract | Ultrasound images are one of the most widely used techniques in clinical settings to analyze and detect different organs for study or diagnosis of diseases. The dependence on the subjective opinions of experts such as radiologists calls for an automatic recognition and detection system that can provide an objective analysis. Previous work on this topic is limited and can be classified by the organ of interest. Hybrid neural networks, linear and logistic regression models, 3D reconstructed models, and various machine learning techniques have been used to solve complex problems such as detection of lesions and cancer. Our project aims to use Dual Path Networks (DPNs) to segment and detect shapes in ultrasound images taken from 3D printed models of the liver. Further, the deep DPN architectures could be coupled with a Fully Convolutional Network (FCN) to refine the results. Data denoised with various filters would be used to gauge how the filters compare and which yields the best results. DPNs work with small datasets, which suits our setting since our dataset will be limited in size. Moreover, the ultrasound scans will need to be taken with the scanner at different orientations relative to the organ, so that models trained on the dataset can accurately perform segmentation and shape detection. |
Tasks | |
Published | 2019-11-22 |
URL | https://arxiv.org/abs/1911.09863v1 |
https://arxiv.org/pdf/1911.09863v1.pdf | |
PWC | https://paperswithcode.com/paper/shape-detection-in-2d-ultrasound-images |
Repo | |
Framework | |
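The filter-comparison step the proposal mentions can be as simple as running a few standard speckle-reduction filters over each scan before segmentation; the specific filters and parameters below are assumptions chosen for illustration.

```python
# Compare common denoising filters on a grayscale ultrasound image (illustrative).
import cv2

img = cv2.imread("ultrasound.png", cv2.IMREAD_GRAYSCALE)
denoised = {
    "median":    cv2.medianBlur(img, 5),               # kernel size 5
    "gaussian":  cv2.GaussianBlur(img, (5, 5), 0),     # sigma derived from kernel
    "bilateral": cv2.bilateralFilter(img, 9, 75, 75),  # edge-preserving smoothing
}
```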
Table-to-Text Natural Language Generation with Unseen Schemas
Title | Table-to-Text Natural Language Generation with Unseen Schemas |
Authors | Tianyu Liu, Wei Wei, William Yang Wang |
Abstract | Traditional table-to-text natural language generation (NLG) tasks focus on generating text from schemas that are already seen in the training set. This limitation curbs their generalizability to real-world scenarios, where the schemas of input tables are potentially infinite. In this paper, we propose the new task of table-to-text NLG with unseen schemas, which specifically aims to test the generalization of NLG for input tables with attribute types that never appear during training. To this end, we construct a new benchmark dataset for this task. To deal with the problem of unseen attribute types, we propose a new model that first aligns unseen table schemas to seen ones, and then generates text with updated table representations. Experimental evaluation on the new benchmark demonstrates that our model outperforms baseline methods by a large margin. In addition, comparison with standard data-to-text settings shows the challenges and uniqueness of our proposed task. |
Tasks | Text Generation |
Published | 2019-11-09 |
URL | https://arxiv.org/abs/1911.03601v1 |
https://arxiv.org/pdf/1911.03601v1.pdf | |
PWC | https://paperswithcode.com/paper/table-to-text-natural-language-generation |
Repo | |
Framework | |
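A toy version of the "align unseen schemas to seen ones" step: map an unseen attribute name to its most similar seen attribute by cosine similarity of word vectors. The embedding function and attribute lists are assumptions; the paper's alignment model is learned rather than a fixed nearest-neighbour lookup.

```python
# Illustrative attribute alignment by embedding similarity (not the paper's model).
import numpy as np

def align_attribute(unseen, seen_attributes, embed):
    """embed(name) -> np.ndarray; return the seen attribute closest to `unseen`."""
    u = embed(unseen)
    sims = [
        (u @ embed(s)) / (np.linalg.norm(u) * np.linalg.norm(embed(s)) + 1e-9)
        for s in seen_attributes
    ]
    return seen_attributes[int(np.argmax(sims))]
```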
Autoencoding with XCSF
Title | Autoencoding with XCSF |
Authors | Richard J. Preen, Stewart W. Wilson, Larry Bull |
Abstract | Autoencoders enable data dimensionality reduction and are a key component of many (deep) learning systems. This article explores the use of the XCSF online evolutionary reinforcement learning system to perform autoencoding. Initial results using a neural network representation, and combining artificial evolution with stochastic gradient descent, suggest that it is an effective approach to data reduction. The approach adaptively subdivides the input domain into local approximations that are simpler than a global neural network solution. By allowing the number of neurons in the autoencoders to evolve, this further enables the emergence of an ensemble of structurally heterogeneous solutions to cover the problem space. In this case, networks of differing complexity are typically seen to cover different areas of the problem space. Furthermore, the rate of gradient descent applied to each layer is tuned via self-adaptive mutation, thereby reducing the parameter optimisation task. |
Tasks | Dimensionality Reduction |
Published | 2019-10-23 |
URL | https://arxiv.org/abs/1910.10579v2 |
https://arxiv.org/pdf/1910.10579v2.pdf | |
PWC | https://paperswithcode.com/paper/autoencoding-with-xcsf |
Repo | |
Framework | |
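The self-adaptive tuning of per-layer gradient-descent rates mentioned at the end of the abstract follows the usual evolutionary-strategies pattern: each individual carries its own step sizes, which are perturbed log-normally when offspring are created. The constants below are illustrative, not XCSF's actual parameterization.

```python
# Hedged sketch of self-adaptive mutation of per-layer learning rates.
import numpy as np

def mutate_learning_rates(rates, tau=0.1, lo=1e-5, hi=0.1, rng=None):
    """rates: array of per-layer step sizes carried by one evolving autoencoder."""
    rng = rng if rng is not None else np.random.default_rng()
    new = rates * np.exp(tau * rng.standard_normal(rates.shape))  # log-normal perturbation
    return np.clip(new, lo, hi)
```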
Safe Feature Elimination for Non-Negativity Constrained Convex Optimization
Title | Safe Feature Elimination for Non-Negativity Constrained Convex Optimization |
Authors | James Folberth, Stephen Becker |
Abstract | Inspired by recent work on safe feature elimination for $\ell_1$-norm regularized least-squares, we develop strategies to eliminate features from convex optimization problems with non-negativity constraints. Our strategy is safe in the sense that it will only remove features/coordinates from the problem when they are guaranteed to be zero at a solution. To perform feature elimination we use an accurate, but not optimal, primal-dual feasible pair, making our methods robust and able to be used on ill-conditioned problems. We supplement our feature elimination strategy with a method to construct an accurate dual feasible point from an accurate primal feasible point; this allows us to use a first-order method to find an accurate primal feasible point, then use that point to construct an accurate dual feasible point and perform feature elimination. Under reasonable conditions, our feature elimination strategy will eventually eliminate all zero features from the problem. As an application of our methods we show how safe feature elimination can be used to robustly certify the uniqueness of non-negative least-squares (NNLS) problems. We give numerical examples on a well-conditioned synthetic NNLS problem and on a set of 40,000 extremely ill-conditioned NNLS problems arising in a microscopy application. |
Tasks | |
Published | 2019-07-25 |
URL | https://arxiv.org/abs/1907.10831v2 |
https://arxiv.org/pdf/1907.10831v2.pdf | |
PWC | https://paperswithcode.com/paper/safe-feature-elimination-for-non-negativity |
Repo | |
Framework | |
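For intuition, a safe elimination test for NNLS built from a primal-dual feasible pair can take the following "gap safe" style form; this is a generic illustration of the principle (eliminate a coordinate only when it is provably zero at every solution), not necessarily the exact rule developed in the paper.

```latex
% For min_{x >= 0} (1/2)||Ax - b||_2^2 with primal feasible \hat{x} and dual feasible
% \hat{\theta} (i.e., A^\top \hat{\theta} \le 0), strong concavity of the dual gives
% \lVert \theta^\star - \hat{\theta} \rVert_2 \le \sqrt{2\,\mathrm{gap}(\hat{x},\hat{\theta})}, hence
a_j^\top \hat{\theta} \;+\; \lVert a_j \rVert_2 \, \sqrt{2\,\mathrm{gap}(\hat{x}, \hat{\theta})} \;<\; 0
\quad\Longrightarrow\quad x_j^\star = 0 \ \text{ (feature } j \text{ can be safely eliminated).}
```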
Statistics and Samples in Distributional Reinforcement Learning
Title | Statistics and Samples in Distributional Reinforcement Learning |
Authors | Mark Rowland, Robert Dadashi, Saurabh Kumar, Rémi Munos, Marc G. Bellemare, Will Dabney |
Abstract | We present a unifying framework for designing and analysing distributional reinforcement learning (DRL) algorithms in terms of recursively estimating statistics of the return distribution. Our key insight is that DRL algorithms can be decomposed as the combination of some statistical estimator and a method for imputing a return distribution consistent with that set of statistics. With this new understanding, we are able to provide improved analyses of existing DRL algorithms as well as construct a new algorithm (EDRL) based upon estimation of the expectiles of the return distribution. We compare EDRL with existing methods on a variety of MDPs to illustrate concrete aspects of our analysis, and develop a deep RL variant of the algorithm, ER-DQN, which we evaluate on the Atari-57 suite of games. |
Tasks | Distributional Reinforcement Learning |
Published | 2019-02-21 |
URL | http://arxiv.org/abs/1902.08102v1 |
http://arxiv.org/pdf/1902.08102v1.pdf | |
PWC | https://paperswithcode.com/paper/statistics-and-samples-in-distributional |
Repo | |
Framework | |
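The expectile statistic at the heart of EDRL is defined by an asymmetrically weighted squared loss; the snippet below shows that textbook loss on a batch of sampled returns, leaving out the paper's imputation strategy and Bellman backup machinery.

```python
# Expectile loss: the tau-expectile of the return distribution minimizes this.
import torch

def expectile_loss(q, returns, tau):
    """q: scalar estimate; returns: 1-D tensor of sampled returns; tau in (0, 1)."""
    diff = returns - q
    weight = torch.where(diff >= 0, torch.full_like(diff, tau), torch.full_like(diff, 1.0 - tau))
    return (weight * diff.pow(2)).mean()
```

With tau = 0.5 this reduces to the ordinary squared loss and recovers the mean; a collection of expectiles at several tau levels summarizes the shape of the return distribution.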
DeceptionNet: Network-Driven Domain Randomization
Title | DeceptionNet: Network-Driven Domain Randomization |
Authors | Sergey Zakharov, Wadim Kehl, Slobodan Ilic |
Abstract | We present a novel approach to tackle domain adaptation between synthetic and real data. Instead of employing “blind” domain randomization, i.e., augmenting synthetic renderings with random backgrounds or changing illumination and colorization, we leverage the task network as its own adversarial guide toward useful augmentations that maximize the uncertainty of the output. To this end, we design a min-max optimization scheme where a given task competes against a special deception network to minimize the task error subject to the specific constraints enforced by the deceiver. The deception network samples from a family of differentiable pixel-level perturbations and exploits the task architecture to find the most destructive augmentations. Unlike GAN-based approaches that require unlabeled data from the target domain, our method achieves robust mappings that scale well to multiple target distributions from source data alone. We apply our framework to the tasks of digit recognition on enhanced MNIST variants, classification and object pose estimation on the Cropped LineMOD dataset as well as semantic segmentation on the Cityscapes dataset and compare it to a number of domain adaptation approaches, thereby demonstrating similar results with superior generalization capabilities. |
Tasks | Colorization, Domain Adaptation, Pose Estimation, Semantic Segmentation |
Published | 2019-04-04 |
URL | https://arxiv.org/abs/1904.02750v2 |
https://arxiv.org/pdf/1904.02750v2.pdf | |
PWC | https://paperswithcode.com/paper/deceptionnet-network-driven-domain |
Repo | |
Framework | |
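The min-max scheme reads naturally as alternating gradient steps: the deception network is updated to increase the task loss through its constrained perturbations, and the task network is then updated to decrease it on the perturbed inputs. The module interfaces below are placeholders, not the authors' architecture.

```python
# Schematic alternating min-max step for a DeceptionNet-style setup (illustrative).
import torch

def train_step(task_net, deception_net, images, labels, opt_task, opt_deceiver, criterion):
    # 1) Deceiver: ascend the task loss (maximize output uncertainty) via a negated loss.
    perturbed = deception_net(images)
    loss_deceiver = -criterion(task_net(perturbed), labels)
    opt_deceiver.zero_grad()
    loss_deceiver.backward()
    opt_deceiver.step()

    # 2) Task network: descend the task loss on freshly perturbed inputs.
    perturbed = deception_net(images).detach()
    loss_task = criterion(task_net(perturbed), labels)
    opt_task.zero_grad()
    loss_task.backward()
    opt_task.step()
    return loss_task.item()
```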
Spoken Conversational Search for General Knowledge
Title | Spoken Conversational Search for General Knowledge |
Authors | Lina M. Rojas-Barahona, Pascal Bellec, Benoit Besset, Martinho Dos-Santos, Johannes Heinecke, Munshi Asadullah, Olivier Le-Blouch, Jean Y. Lancien, Géraldine Damnati, Emmanuel Mory, Frédéric Herledan |
Abstract | We present a spoken conversational question answering proof of concept that is able to answer questions about general knowledge from Wikidata. The dialogue component not only orchestrates the various components but also resolves coreferences and ellipses. |
Tasks | Question Answering |
Published | 2019-09-26 |
URL | https://arxiv.org/abs/1909.11980v1 |
https://arxiv.org/pdf/1909.11980v1.pdf | |
PWC | https://paperswithcode.com/paper/spoken-conversational-search-for-general |
Repo | |
Framework | |
Convergence Analysis of Inexact Randomized Iterative Methods
Title | Convergence Analysis of Inexact Randomized Iterative Methods |
Authors | Nicolas Loizou, Peter Richtárik |
Abstract | In this paper we present a convergence rate analysis of inexact variants of several randomized iterative methods. Among the methods studied are: stochastic gradient descent, stochastic Newton, stochastic proximal point and stochastic subspace ascent. A common feature of these methods is that in their update rule a certain sub-problem needs to be solved exactly. We relax this requirement by allowing for the sub-problem to be solved inexactly. In particular, we propose and analyze inexact randomized iterative methods for solving three closely related problems: a convex stochastic quadratic optimization problem, a best approximation problem and its dual, a concave quadratic maximization problem. We provide iteration complexity results under several assumptions on the inexactness error. Inexact variants of many popular and some more exotic methods, including randomized block Kaczmarz, randomized Gaussian Kaczmarz and randomized block coordinate descent, can be cast as special cases. Numerical experiments demonstrate the benefits of allowing inexactness. |
Tasks | |
Published | 2019-03-19 |
URL | http://arxiv.org/abs/1903.07971v1 |
http://arxiv.org/pdf/1903.07971v1.pdf | |
PWC | https://paperswithcode.com/paper/convergence-analysis-of-inexact-randomized |
Repo | |
Framework | |
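A concrete instance of the "sub-problem solved inexactly" setting is randomized Kaczmarz with a perturbed projection step, sketched below; the additive noise model for the inexactness is an assumption chosen purely for illustration.

```python
# Inexact randomized Kaczmarz for Ax = b (toy illustration of inexact updates).
import numpy as np

def inexact_kaczmarz(A, b, iters=1000, eps=1e-3, seed=0):
    rng = np.random.default_rng(seed)
    m, n = A.shape
    x = np.zeros(n)
    for _ in range(iters):
        i = rng.integers(m)
        exact_step = (A[i] @ x - b[i]) / (A[i] @ A[i]) * A[i]  # exact projection onto row i
        error = eps * rng.standard_normal(n)                   # inexact sub-problem solve
        x = x - exact_step + error
    return x
```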
What Else Can Fool Deep Learning? Addressing Color Constancy Errors on Deep Neural Network Performance
Title | What Else Can Fool Deep Learning? Addressing Color Constancy Errors on Deep Neural Network Performance |
Authors | Mahmoud Afifi, Michael S Brown |
Abstract | There is active research targeting local image manipulations that can fool deep neural networks (DNNs) into producing incorrect results. This paper examines a type of global image manipulation that can produce similar adverse effects. Specifically, we explore how strong color casts caused by incorrectly applied computational color constancy - referred to as white balance (WB) in photography - negatively impact the performance of DNNs targeting image segmentation and classification. In addition, we discuss how existing image augmentation methods used to improve the robustness of DNNs are not well suited for modeling WB errors. To address this problem, a novel augmentation method is proposed that can emulate accurate color constancy degradation. We also explore pre-processing training and testing images with a recent WB correction algorithm to reduce the effects of incorrectly white-balanced images. We examine both augmentation and pre-processing strategies on different datasets and demonstrate notable improvements on the CIFAR-10, CIFAR-100, and ADE20K datasets. |
Tasks | Color Constancy, Image Augmentation, Semantic Segmentation |
Published | 2019-12-15 |
URL | https://arxiv.org/abs/1912.06960v1 |
https://arxiv.org/pdf/1912.06960v1.pdf | |
PWC | https://paperswithcode.com/paper/what-else-can-fool-deep-learning-addressing-1 |
Repo | |
Framework | |
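A deliberately simplified stand-in for the kind of degradation the paper models: applying a random diagonal (per-channel) gain to emulate an incorrectly white-balanced image. The proposed augmentation method models camera color constancy far more faithfully; the gain range here is an arbitrary assumption.

```python
# Crude color-cast augmentation (illustration only, not the paper's method).
import numpy as np

def random_color_cast(image, strength=0.3, rng=None):
    """image: float array in [0, 1] with shape (H, W, 3)."""
    rng = rng if rng is not None else np.random.default_rng()
    gains = 1.0 + rng.uniform(-strength, strength, size=3)  # per-channel gain
    return np.clip(image * gains, 0.0, 1.0)
```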
Latent Code and Text-based Generative Adversarial Networks for Soft-text Generation
Title | Latent Code and Text-based Generative Adversarial Networks for Soft-text Generation |
Authors | Md. Akmal Haidar, Mehdi Rezagholizadeh, Alan Do-Omri, Ahmad Rashid |
Abstract | Text generation with generative adversarial networks (GANs) can be divided into the text-based and code-based categories according to the type of signals used for discrimination. In this work, we introduce a novel text-based approach called Soft-GAN to effectively exploit the GAN setup for text generation. We demonstrate how autoencoders (AEs) can be used to provide a continuous representation of sentences, which we refer to as soft-text. This soft representation is used in GAN discrimination to synthesize similar soft-texts. We also propose hybrid latent code and text-based GAN (LATEXT-GAN) approaches with one or more discriminators, in which a combination of the latent code and the soft-text is used for GAN discrimination. We perform a number of subjective and objective experiments on two well-known datasets (SNLI and Image COCO) to validate our techniques. We discuss the results using several evaluation metrics and show that the proposed techniques outperform traditional GAN-based text-generation methods. |
Tasks | Text Generation |
Published | 2019-04-15 |
URL | http://arxiv.org/abs/1904.07293v2 |
http://arxiv.org/pdf/1904.07293v2.pdf | |
PWC | https://paperswithcode.com/paper/latent-code-and-text-based-generative |
Repo | |
Framework | |
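The "soft-text" idea can be sketched as follows: rather than discrete tokens, the discriminator receives the decoder's per-step softmax distributions, optionally concatenated with the latent code as in the hybrid LATEXT-GAN variants. Shapes and the flattening choice are illustrative assumptions.

```python
# Build a continuous discriminator input from decoder logits (illustrative).
import torch

def discriminator_input(decoder_logits, latent_code=None):
    """decoder_logits: (batch, seq_len, vocab); latent_code: (batch, code_dim) or None."""
    soft_text = torch.softmax(decoder_logits, dim=-1)   # continuous "soft-text"
    flat = soft_text.flatten(start_dim=1)
    if latent_code is not None:
        flat = torch.cat([flat, latent_code], dim=1)    # latent code + soft-text
    return flat
```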