Paper Group ANR 341
Learning 3D Part Assembly from a Single Image
Title | Learning 3D Part Assembly from a Single Image |
Authors | Yichen Li, Kaichun Mo, Lin Shao, Minhyuk Sung, Leonidas Guibas |
Abstract | Autonomous assembly is a crucial capability for robots in many applications. For this task, several problems such as obstacle avoidance, motion planning, and actuator control have been extensively studied in robotics. However, when it comes to task specification, the space of possibilities remains underexplored. Towards this end, we introduce a novel problem, single-image-guided 3D part assembly, along with a learning-based solution. We study this problem in the setting of furniture assembly from a given complete set of parts and a single image depicting the entire assembled object. Multiple challenges exist in this setting, including handling ambiguity among parts (e.g., slats in a chair back and leg stretchers) and 3D pose prediction for parts and part subassemblies, whether visible or occluded. We address these issues by proposing a two-module pipeline that leverages strong 2D-3D correspondences and assembly-oriented graph message-passing to infer part relationships. In experiments with a PartNet-based synthetic benchmark, we demonstrate the effectiveness of our framework as compared with three baseline approaches. |
Tasks | Motion Planning, Pose Prediction |
Published | 2020-03-21 |
URL | https://arxiv.org/abs/2003.09754v2 |
https://arxiv.org/pdf/2003.09754v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-3d-part-assembly-from-a-single-image |
Repo | |
Framework | |
Iterative Data Programming for Expanding Text Classification Corpora
Title | Iterative Data Programming for Expanding Text Classification Corpora |
Authors | Neil Mallinar, Abhishek Shah, Tin Kam Ho, Rajendra Ugrani, Ayush Gupta |
Abstract | Real-world text classification tasks often require many labeled training examples that are expensive to obtain. Recent advancements in machine teaching, specifically the data programming paradigm, facilitate the creation of training data sets quickly via a general framework for building weak models, also known as labeling functions, and denoising them through ensemble learning techniques. We present a fast, simple data programming method for augmenting text data sets by generating neighborhood-based weak models with minimal supervision. Furthermore, our method employs an iterative procedure to identify sparsely distributed examples from large volumes of unlabeled data. The iterative data programming techniques improve newer weak models as more labeled data is confirmed with human-in-loop. We show empirical results on sentence classification tasks, including those from a task of improving intent recognition in conversational agents. |
Tasks | Denoising, Sentence Classification, Text Classification |
Published | 2020-02-04 |
URL | https://arxiv.org/abs/2002.01412v1 |
https://arxiv.org/pdf/2002.01412v1.pdf | |
PWC | https://paperswithcode.com/paper/iterative-data-programming-for-expanding-text |
Repo | |
Framework | |
Identifying Notable News Stories
Title | Identifying Notable News Stories |
Authors | Antonia Saravanou, Giorgio Stefanoni, Edgar Meij |
Abstract | The volume of news content has increased significantly in recent years and systems to process and deliver this information in an automated fashion at scale are becoming increasingly prevalent. One critical component that is required in such systems is a method to automatically determine how notable a certain news story is, in order to prioritize these stories during delivery. One way to do so is to compare each story in a stream of news stories to a notable event. In other words, the problem of detecting notable news can be defined as a ranking task: given a trusted source of notable events and a stream of candidate news stories, we aim to answer the question: “Which of the candidate news stories is most similar to the notable one?” We employ different combinations of features and learning to rank (LTR) models and gather relevance labels using crowdsourcing. In our approach, we use structured representations of candidate news stories (triples) and we link them to corresponding entities. Our evaluation shows that the features in our proposed method outperform standard ranking methods, and that the trained model generalizes well to unseen news stories. |
Tasks | Learning-To-Rank |
Published | 2020-03-16 |
URL | https://arxiv.org/abs/2003.07461v1 |
https://arxiv.org/pdf/2003.07461v1.pdf | |
PWC | https://paperswithcode.com/paper/identifying-notable-news-stories |
Repo | |
Framework | |
Deep Learning Enabled Uncorrelated Space Observation Association
Title | Deep Learning Enabled Uncorrelated Space Observation Association |
Authors | Jacob J Decoto, David RC Dayton |
Abstract | Uncorrelated optical space observation association represents a classic needle-in-a-haystack problem. The objective is to find small groups of observations that are likely of the same resident space objects (RSOs) from among the much larger population of all uncorrelated observations. These observations may be widely disparate both temporally and with respect to the observing sensor position. By training on a large representative data set, this paper shows that a deep learning model with no encoded knowledge of physics or orbital mechanics can learn to identify observations of common objects. When presented with balanced input sets of 50% matching observation pairs, the learned model correctly identified whether the observation pairs were of the same RSO 83.1% of the time. The learned model is then used in conjunction with a search algorithm on an unbalanced demonstration set of 1,000 disparate simulated uncorrelated observations and is shown to successfully identify true three-observation sets representing 111 out of 142 objects in the population, with most objects being identified in multiple three-observation triplets. This is accomplished while exploring only 0.06% of the search space of 1.66e8 possible unique triplet combinations. |
Tasks | |
Published | 2020-01-09 |
URL | https://arxiv.org/abs/2001.05855v1 |
https://arxiv.org/pdf/2001.05855v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-enabled-uncorrelated-space |
Repo | |
Framework | |
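The search-space figures quoted in the abstract above can be checked with a few lines of arithmetic. This is only a sanity check, assuming that "unique triplet combinations" means unordered triplets drawn from the 1,000 observations; the variable names are illustrative and do not come from the paper.

```python
import math

num_observations = 1000      # size of the demonstration set, from the abstract
explored_fraction = 0.0006   # 0.06% of the search space, from the abstract

# Number of unordered triplets that can be drawn from the observations
# (assumption: this is what the abstract counts as the search space).
total_triplets = math.comb(num_observations, 3)

# Approximate number of triplets the search would have examined.
explored_triplets = explored_fraction * total_triplets

# Share of objects represented in at least one identified true triplet.
recovered_share = 111 / 142

print(f"{total_triplets:,} triplets in total")
print(f"about {explored_triplets:,.0f} triplets explored")
print(f"{recovered_share:.1%} of objects recovered")
```

The total comes to 166,167,000, which agrees with the 1.66e8 stated in the abstract under this reading.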
Distributed Stochastic Algorithms for High-rate Streaming Principal Component Analysis
Title | Distributed Stochastic Algorithms for High-rate Streaming Principal Component Analysis |
Authors | Haroon Raja, Waheed U. Bajwa |
Abstract | This paper considers the problem of estimating the principal eigenvector of a covariance matrix from independent and identically distributed data samples in streaming settings. The streaming rate of data in many contemporary applications can be high enough that a single processor cannot finish an iteration of existing methods for eigenvector estimation before a new sample arrives. This paper formulates and analyzes a distributed variant of the classical Krasulina’s method (D-Krasulina) that can keep up with the high streaming rate of data by distributing the computational load across multiple processing nodes. The analysis shows that—under appropriate conditions—D-Krasulina converges to the principal eigenvector in an order-wise optimal manner; i.e., after receiving $M$ samples across all nodes, its estimation error can be $O(1/M)$. In order to reduce the network communication overhead, the paper also develops and analyzes a mini-batch extension of D-Krasulina, which is termed DM-Krasulina. The analysis of DM-Krasulina shows that it can also achieve order-optimal estimation error rates under appropriate conditions, even when some samples have to be discarded within the network due to communication latency. Finally, experiments are performed over synthetic and real-world data to validate the convergence behaviors of D-Krasulina and DM-Krasulina in high-rate streaming settings. |
Tasks | |
Published | 2020-01-04 |
URL | https://arxiv.org/abs/2001.01017v1 |
https://arxiv.org/pdf/2001.01017v1.pdf | |
PWC | https://paperswithcode.com/paper/distributed-stochastic-algorithms-for-high |
Repo | |
Framework | |
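The distributed update described in the abstract above can be illustrated with a small single-process simulation. This is a minimal sketch, not the authors' implementation: the per-sample direction is the classical Krasulina update, while the averaging of per-node directions as a stand-in for network aggregation, the step size `1/t`, the random initialization, and the names `krasulina_direction` and `d_krasulina` are all assumptions made for illustration.

```python
import numpy as np


def krasulina_direction(w, x):
    """Classical Krasulina update direction for one sample x at estimate w."""
    xw = x @ w
    # x x^T w minus its component along w; the result is orthogonal to w.
    return xw * x - (xw ** 2 / (w @ w)) * w


def d_krasulina(samples, num_nodes, step=lambda t: 1.0 / t, seed=0):
    """Simulate nodes that each receive one sample per round.

    Every round, each simulated node computes its own direction and the
    directions are averaged before a single shared update (an assumed,
    simplified aggregation scheme).
    """
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(samples.shape[1])
    w /= np.linalg.norm(w)
    num_rounds = samples.shape[0] // num_nodes
    for t in range(1, num_rounds + 1):
        batch = samples[(t - 1) * num_nodes : t * num_nodes]
        avg_direction = np.mean(
            [krasulina_direction(w, x) for x in batch], axis=0
        )
        w = w + step(t) * avg_direction
    # Return a unit-norm estimate of the principal eigenvector.
    return w / np.linalg.norm(w)


if __name__ == "__main__":
    # Synthetic data whose covariance has a dominant first coordinate.
    rng = np.random.default_rng(1)
    scales = np.sqrt(np.array([9.0, 1.0, 1.0, 1.0, 1.0]))
    data = rng.standard_normal((20000, 5)) * scales
    estimate = d_krasulina(data, num_nodes=4)
    print("alignment with first axis:", abs(estimate[0]))
```

Averaging over nodes lets each round consume several samples while performing one update, which is the sense in which the load is spread out in this sketch.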
ScrabbleGAN: Semi-Supervised Varying Length Handwritten Text Generation
Title | ScrabbleGAN: Semi-Supervised Varying Length Handwritten Text Generation |
Authors | Sharon Fogel, Hadar Averbuch-Elor, Sarel Cohen, Shai Mazor, Roee Litman |
Abstract | The performance of optical character recognition (OCR) systems has improved significantly in the deep learning era. This is especially true for handwritten text recognition (HTR), where each author has a unique style, unlike printed text, where the variation is smaller by design. That said, deep learning based HTR is limited, as in every other task, by the number of training examples. Gathering data is a challenging and costly task, and even more so is the labeling task that follows, on which we focus here. One possible approach to reduce the burden of data annotation is semi-supervised learning. Semi-supervised methods use, in addition to labeled data, some unlabeled samples to improve performance, compared to fully supervised ones. Consequently, such methods may adapt to unseen images during test time. We present ScrabbleGAN, a semi-supervised approach to synthesize handwritten text images that are versatile both in style and lexicon. ScrabbleGAN relies on a novel generative model which can generate images of words with an arbitrary length. We show how to operate our approach in a semi-supervised manner, enjoying the aforementioned benefits such as a performance boost over state-of-the-art supervised HTR. Furthermore, our generator can manipulate the resulting text style. This allows us to change, for instance, whether the text is cursive, or how thin the pen stroke is. |
Tasks | Optical Character Recognition, Text Generation |
Published | 2020-03-23 |
URL | https://arxiv.org/abs/2003.10557v1 |
https://arxiv.org/pdf/2003.10557v1.pdf | |
PWC | https://paperswithcode.com/paper/scrabblegan-semi-supervised-varying-length |
Repo | |
Framework | |
Multistage Curvilinear Coordinate Transform Based Document Image Dewarping using a Novel Quality Estimator
Title | Multistage Curvilinear Coordinate Transform Based Document Image Dewarping using a Novel Quality Estimator |
Authors | Tanmoy Dasgupta, Nibaran Das, Mita Nasipuri |
Abstract | The present work demonstrates a fast and improved technique for dewarping nonlinearly warped document images. The images are first dewarped at the page-level by estimating optimum inverse projections using curvilinear homography. The quality of the process is then estimated by evaluating a set of metrics related to the characteristics of the text lines and rectilinear objects for measuring parallelism, orthogonality, etc. These are designed specifically to estimate the quality of the dewarping process without the need for any ground truth. If the quality is estimated to be unsatisfactory, the page-level dewarping process is repeated with finer approximations. This is followed by a line-level dewarping process that makes granular corrections to the warps in individual text-lines. The methodology has been tested on the CBDAR 2007 / IUPR 2011 document image dewarping dataset and is seen to yield the best OCR accuracy in the shortest amount of time to date. The usefulness of the methodology has also been evaluated on the DocUNet 2018 dataset with some minor tweaks, and is seen to produce comparable results. |
Tasks | Optical Character Recognition |
Published | 2020-03-15 |
URL | https://arxiv.org/abs/2003.06872v1 |
https://arxiv.org/pdf/2003.06872v1.pdf | |
PWC | https://paperswithcode.com/paper/multistage-curvilinear-coordinate-transform |
Repo | |
Framework | |
Attacking Optical Character Recognition (OCR) Systems with Adversarial Watermarks
Title | Attacking Optical Character Recognition (OCR) Systems with Adversarial Watermarks |
Authors | Lu Chen, Wei Xu |
Abstract | Optical character recognition (OCR) is widely applied in real applications, serving as a key preprocessing tool. The adoption of deep neural networks (DNNs) in OCR results in vulnerability to adversarial examples, which are crafted to mislead the output of the threat model. Unlike vanilla colorful images, images of printed text usually have clear backgrounds. However, adversarial examples generated by most of the existing adversarial attacks are unnatural and pollute the background severely. To address this issue, we propose a watermark attack method that produces natural distortion disguised as watermarks and evades detection by the human eye. Experimental results show that watermark attacks can yield a set of natural adversarial examples attached with watermarks and attain similar attack performance to the state-of-the-art methods in different attack scenarios. |
Tasks | Optical Character Recognition |
Published | 2020-02-08 |
URL | https://arxiv.org/abs/2002.03095v1 |
https://arxiv.org/pdf/2002.03095v1.pdf | |
PWC | https://paperswithcode.com/paper/attacking-optical-character-recognition-ocr |
Repo | |
Framework | |
Multi-Agent Reinforcement Learning as a Computational Tool for Language Evolution Research: Historical Context and Future Challenges
Title | Multi-Agent Reinforcement Learning as a Computational Tool for Language Evolution Research: Historical Context and Future Challenges |
Authors | Clément Moulin-Frier, Pierre-Yves Oudeyer |
Abstract | Computational models of emergent communication in agent populations are currently gaining interest in the machine learning community due to recent advances in Multi-Agent Reinforcement Learning (MARL). Current contributions are however still relatively disconnected from the earlier theoretical and computational literature aiming at understanding how language might have emerged from a prelinguistic substance. The goal of this paper is to position recent MARL contributions within the historical context of language evolution research, as well as to extract from this theoretical and computational background a few challenges for future research. |
Tasks | Multi-agent Reinforcement Learning |
Published | 2020-02-20 |
URL | https://arxiv.org/abs/2002.08878v1 |
https://arxiv.org/pdf/2002.08878v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-agent-reinforcement-learning-as-a |
Repo | |
Framework | |
A general framework for scientifically inspired explanations in AI
Title | A general framework for scientifically inspired explanations in AI |
Authors | David Tuckey, Alessandra Russo, Krysia Broda |
Abstract | Explainability in AI is gaining attention in the computer science community in response to the increasing success of deep learning and the important need to justify how such systems make predictions in life-critical applications. The focus of explainability in AI has predominantly been on trying to gain insights into how machine learning systems function by exploring relationships between input data and predicted outcomes or by extracting simpler interpretable models. Through literature surveys of philosophy and social science, authors have highlighted the sharp difference between these generated explanations and human-made explanations and claimed that current explanations in AI do not take into account the complexity of human interaction to allow for effective information passing to non-expert users. In this paper we instantiate the concept of structure of scientific explanation as the theoretical underpinning for a general framework in which explanations for AI systems can be implemented. This framework aims to provide the tools to build a “mental-model” of any AI system so that the interaction with the user can provide information on demand and be closer to the nature of human-made explanations. We illustrate how we can utilize this framework through two very different examples: an artificial neural network and a Prolog solver and we provide a possible implementation for both examples. |
Tasks | |
Published | 2020-03-02 |
URL | https://arxiv.org/abs/2003.00749v1 |
https://arxiv.org/pdf/2003.00749v1.pdf | |
PWC | https://paperswithcode.com/paper/a-general-framework-for-scientifically |
Repo | |
Framework | |
ENIGMA Anonymous: Symbol-Independent Inference Guiding Machine (system description)
Title | ENIGMA Anonymous: Symbol-Independent Inference Guiding Machine (system description) |
Authors | Jan Jakubův, Karel Chvalovský, Miroslav Olšák, Bartosz Piotrowski, Martin Suda, Josef Urban |
Abstract | We describe an implementation of gradient boosting and neural guidance of saturation-style automated theorem provers that does not depend on consistent symbol names across problems. For the gradient-boosting guidance, we manually create abstracted features by considering arity-based encodings of formulas. For the neural guidance, we use symbol-independent graph neural networks and their embedding of the terms and clauses. The two methods are efficiently implemented in the E prover and its ENIGMA learning-guided framework and evaluated on the MPTP large-theory benchmark. Both methods are shown to achieve comparable real-time performance to state-of-the-art symbol-based methods. |
Tasks | |
Published | 2020-02-13 |
URL | https://arxiv.org/abs/2002.05406v1 |
https://arxiv.org/pdf/2002.05406v1.pdf | |
PWC | https://paperswithcode.com/paper/enigma-anonymous-symbol-independent-inference |
Repo | |
Framework | |
First Order Methods take Exponential Time to Converge to Global Minimizers of Non-Convex Functions
Title | First Order Methods take Exponential Time to Converge to Global Minimizers of Non-Convex Functions |
Authors | Krishna Reddy Kesari, Jean Honorio |
Abstract | Machine learning algorithms typically perform optimization over a class of non-convex functions. In this work, we provide bounds on the fundamental hardness of identifying the global minimizer of a non-convex function. Specifically, we design a family of parametrized non-convex functions and employ statistical lower bounds for parameter estimation. We show that the parameter estimation problem is equivalent to the problem of function identification in the given family. We then claim that non-convex optimization is at least as hard as function identification. Jointly, we prove that any first order method can take exponential time to converge to a global minimizer. |
Tasks | |
Published | 2020-02-28 |
URL | https://arxiv.org/abs/2002.12911v1 |
https://arxiv.org/pdf/2002.12911v1.pdf | |
PWC | https://paperswithcode.com/paper/first-order-methods-take-exponential-time-to |
Repo | |
Framework | |
Weakly Supervised Learning Meets Ride-Sharing User Experience Enhancement
Title | Weakly Supervised Learning Meets Ride-Sharing User Experience Enhancement |
Authors | Lan-Zhe Guo, Feng Kuang, Zhang-Xun Liu, Yu-Feng Li, Nan Ma, Xiao-Hu Qie |
Abstract | Weakly supervised learning aims at coping with scarce labeled data. Previous weakly supervised studies typically assume that there is only one kind of weak supervision in data. In many applications, however, raw data usually contains more than one kind of weak supervision at the same time. For example, in user experience enhancement from Didi, one of the largest online ride-sharing platforms, the ride comment data contains severe label noise (due to the subjective factors of passengers) and severe label distribution bias (due to the sampling bias). We call such a problem “compound weakly supervised learning”. In this paper, we propose the CWSL method to address this problem based on Didi ride-sharing comment data. Specifically, an instance reweighting strategy is employed to cope with severe label noise in comment data, where the weights for harmful noisy instances are small. Robust criteria like AUC rather than accuracy and the validation performance are optimized for the correction of biased data labels. Alternating optimization and stochastic gradient methods accelerate the optimization on large-scale data. Experiments on Didi ride-sharing comment data validate the effectiveness of the method. We hope this work may shed some light on applying weakly supervised learning to complex real situations. |
Tasks | |
Published | 2020-01-19 |
URL | https://arxiv.org/abs/2001.09027v1 |
https://arxiv.org/pdf/2001.09027v1.pdf | |
PWC | https://paperswithcode.com/paper/weakly-supervised-learning-meets-ride-sharing |
Repo | |
Framework | |
Critical Point-Finding Methods Reveal Gradient-Flat Regions of Deep Network Losses
Title | Critical Point-Finding Methods Reveal Gradient-Flat Regions of Deep Network Losses |
Authors | Charles G. Frye, James Simon, Neha S. Wadia, Andrew Ligeralde, Michael R. DeWeese, Kristofer E. Bouchard |
Abstract | Despite the fact that the loss functions of deep neural networks are highly non-convex, gradient-based optimization algorithms converge to approximately the same performance from many random initial points. One thread of work has focused on explaining this phenomenon by characterizing the local curvature near critical points of the loss function, where the gradients are near zero, and demonstrating that neural network losses enjoy a no-bad-local-minima property and an abundance of saddle points. We report here that the methods used to find these putative critical points suffer from a bad local minima problem of their own: they often converge to or pass through regions where the gradient norm has a stationary point. We call these gradient-flat regions, since they arise when the gradient is approximately in the kernel of the Hessian, such that the loss is locally approximately linear, or flat, in the direction of the gradient. We describe how the presence of these regions necessitates care in both interpreting past results that claimed to find critical points of neural network losses and in designing second-order methods for optimizing neural networks. |
Tasks | |
Published | 2020-03-23 |
URL | https://arxiv.org/abs/2003.10397v1 |
https://arxiv.org/pdf/2003.10397v1.pdf | |
PWC | https://paperswithcode.com/paper/critical-point-finding-methods-reveal |
Repo | |
Framework | |
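The notion of a gradient-flat region in the abstract above can be written out in a few lines. This is a short derivation under assumed notation (none of these symbols appear in the listing): $L(\theta)$ is the loss, $g = \nabla L(\theta)$ its gradient, and $H = \nabla^2 L(\theta)$ its Hessian.

```latex
% Squared gradient norm, a common objective for critical-point finding
f(\theta) = \tfrac{1}{2}\,\lVert \nabla L(\theta) \rVert^2,
\qquad
\nabla f(\theta) = H g .

% Stationary points of f satisfy H g = 0, which holds in two cases:
%   (i)  g = 0                      (a true critical point of L)
%   (ii) g \neq 0,\; g \in \ker H   (a gradient-flat point)

% In case (ii), expanding the loss along the gradient direction gives
L(\theta + \epsilon g)
  = L(\theta) + \epsilon\,\lVert g \rVert^2
    + \tfrac{\epsilon^2}{2}\, g^{\top} H g + O(\epsilon^3)
  = L(\theta) + \epsilon\,\lVert g \rVert^2 + O(\epsilon^3),
% so the quadratic term vanishes and the loss is locally linear along g.
```

A method that minimizes $f$ can therefore stop at a point of type (ii) even though the gradient of the loss is not zero there.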
Object-Centric Image Generation from Layouts
Title | Object-Centric Image Generation from Layouts |
Authors | Tristan Sylvain, Pengchuan Zhang, Yoshua Bengio, R Devon Hjelm, Shikhar Sharma |
Abstract | Despite recent impressive results on single-object and single-domain image generation, the generation of complex scenes with multiple objects remains challenging. In this paper, we start with the idea that a model must be able to understand individual objects and relationships between objects in order to generate complex scenes well. Our layout-to-image-generation method, which we call Object-Centric Generative Adversarial Network (or OC-GAN), relies on a novel Scene-Graph Similarity Module (SGSM). The SGSM learns representations of the spatial relationships between objects in the scene, which lead to our model’s improved layout-fidelity. We also propose changes to the conditioning mechanism of the generator that enhance its object instance-awareness. Apart from improving image quality, our contributions mitigate two failure modes in previous approaches: (1) spurious objects being generated without corresponding bounding boxes in the layout, and (2) overlapping bounding boxes in the layout leading to merged objects in images. Extensive quantitative evaluation and ablation studies demonstrate the impact of our contributions, with our model outperforming previous state-of-the-art approaches on both the COCO-Stuff and Visual Genome datasets. Finally, we address an important limitation of evaluation metrics used in previous works by introducing SceneFID – an object-centric adaptation of the popular Fréchet Inception Distance metric, that is better suited for multi-object images. |
Tasks | Image Generation, Layout-to-Image Generation |
Published | 2020-03-16 |
URL | https://arxiv.org/abs/2003.07449v1 |
https://arxiv.org/pdf/2003.07449v1.pdf | |
PWC | https://paperswithcode.com/paper/object-centric-image-generation-from-layouts |
Repo | |
Framework | |