January 31, 2020

Paper Group ANR 94

DeepMark: One-Shot Clothing Detection


Title	DeepMark: One-Shot Clothing Detection
Authors	Alexey Sidnev, Alexey Trushkov, Maxim Kazakov, Ivan Korolev, Vladislav Sorokin
Abstract	This paper proposes DeepMark, a one-shot approach to fast clothing detection built as a modification of the multi-target network CenterNet. It achieves state-of-the-art accuracy of 0.723 mAP on the bounding box detection task and 0.532 mAP on the landmark detection task of the DeepFashion2 Challenge dataset. The proposed architecture can be used effectively on low-power devices.
Tasks
Published	2019-10-02
URL	https://arxiv.org/abs/1910.01225v1
PDF	https://arxiv.org/pdf/1910.01225v1.pdf
PWC	https://paperswithcode.com/paper/deepmark-one-shot-clothing-detection
Repo
Framework

Balancing Domain Gap for Object Instance Detection


Title	Balancing Domain Gap for Object Instance Detection
Authors	Woo-han Yun, Jaeyeon Lee, Jaehong Kim, Junmo Kim
Abstract	Object instance detection in cluttered indoor environments is a core functionality for service robots. Given a large annotated dataset, we can readily build a detection system by following the recent successful strategy of deep convolutional neural networks. However, it is hard to prepare such a large dataset for instance detection, where only a small number of samples are available. This is one of the main impediments to deploying an object detection system. To overcome this obstacle, many approaches to generating synthetic datasets have been proposed. These approaches confront the domain gap (or reality gap) problem, which stems from the discrepancy between the source domain (synthetic training dataset) and the target domain (real test dataset). In this paper, we propose a simple approach to generating a synthetic dataset with minimum human effort. In particular, we identify that the domain gaps of foreground and background are unbalanced and propose methods to balance these gaps. In the experiments, we verify that our methods help balance the domain gaps and improve the accuracy of object instance detection in cluttered indoor environments.
Tasks	Object Detection
Published	2019-09-26
URL	https://arxiv.org/abs/1909.11972v1
PDF	https://arxiv.org/pdf/1909.11972v1.pdf
PWC	https://paperswithcode.com/paper/balancing-domain-gap-for-object-instance
Repo
Framework

Solving a Class of Non-Convex Min-Max Games Using Iterative First Order Methods


Title	Solving a Class of Non-Convex Min-Max Games Using Iterative First Order Methods
Authors	Maher Nouiehed, Maziar Sanjabi, Tianjian Huang, Jason D. Lee, Meisam Razaviyayn
Abstract	Recent applications in machine learning have sparked significant interest in solving min-max saddle point games. This problem has been extensively studied in the convex-concave regime, for which a global equilibrium solution can be computed efficiently. In this paper, we study the problem in the non-convex regime and show that an $\varepsilon$-first-order stationary point of the game can be computed when the objective of one of the players can be optimized to global optimality efficiently. In particular, we first consider the case where the objective of one of the players satisfies the Polyak-Łojasiewicz (PL) condition. For such a game, we show that a simple multi-step gradient descent-ascent algorithm finds an $\varepsilon$-first-order stationary point of the problem in $\widetilde{\mathcal{O}}(\varepsilon^{-2})$ iterations. Then we show that our framework can also be applied to the case where the objective of the “max-player” is concave. In this case, we propose a multi-step gradient descent-ascent algorithm that finds an $\varepsilon$-first-order stationary point of the game in $\widetilde{\mathcal{O}}(\varepsilon^{-3.5})$ iterations, which is the best known rate in the literature. We applied our algorithm to a fair classification problem on the Fashion-MNIST dataset and observed that the proposed algorithm results in smoother training and better generalization.
Tasks
Published	2019-02-21
URL	https://arxiv.org/abs/1902.08297v3
PDF	https://arxiv.org/pdf/1902.08297v3.pdf
PWC	https://paperswithcode.com/paper/solving-a-class-of-non-convex-min-max-games
Repo
Framework
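
The multi-step gradient descent-ascent idea in the abstract above can be illustrated on a toy problem. The sketch below is not the authors' algorithm or their step-size schedule: the objective f(x, y) = sin(x) + x*y - 0.5*y^2, the function names, and all hyperparameters are assumptions chosen for illustration. The objective is non-convex in x and strongly concave in y, so the inner maximization satisfies the PL condition. Each outer iteration runs several ascent steps on y before a single descent step on x.

```python
import math

# Toy objective (an assumption for illustration, not from the paper):
#   f(x, y) = sin(x) + x*y - 0.5*y**2
# It is non-convex in x and strongly concave in y.

def grad_x(x, y):
    """Partial derivative of f with respect to the min-player variable x."""
    return math.cos(x) + y

def grad_y(x, y):
    """Partial derivative of f with respect to the max-player variable y."""
    return x - y

def multi_step_gda(x0, y0, outer_steps=200, inner_steps=10, lr_x=0.1, lr_y=0.5):
    """Run several ascent steps on y, then one descent step on x, repeatedly."""
    x, y = x0, y0
    for _ in range(outer_steps):
        # Inner loop: approximately solve max_y f(x, y) for the current x.
        for _ in range(inner_steps):
            y += lr_y * grad_y(x, y)
        # Outer step: one gradient descent step on x at the updated y.
        x -= lr_x * grad_x(x, y)
    return x, y

x_star, y_star = multi_step_gda(1.0, 0.0)
print(x_star, y_star)
```

For this toy objective the inner maximizer is y = x, so a first-order stationary point satisfies cos(x) + x = 0, which the iterates approach.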

Communication-Efficient Weighted Sampling and Quantile Summary for GBDT


Title	Communication-Efficient Weighted Sampling and Quantile Summary for GBDT
Authors	Ziyue Huang, Ke Yi
Abstract	Gradient boosting decision tree (GBDT) is a powerful and widely used machine learning model that has achieved state-of-the-art performance in many academic areas and production environments. However, communication overhead is the main bottleneck in the distributed training needed to handle today's massive data. In this paper, we propose two novel communication-efficient methods over distributed datasets to mitigate this problem: a weighted sampling approach by which we can efficiently estimate the information gain over a small subset, and distributed protocols for the weighted quantile problem used in approximate tree learning.
Tasks
Published	2019-09-17
URL	https://arxiv.org/abs/1909.07633v1
PDF	https://arxiv.org/pdf/1909.07633v1.pdf
PWC	https://paperswithcode.com/paper/communication-efficient-weighted-sampling-and
Repo
Framework
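
For readers unfamiliar with the weighted quantile problem mentioned in the abstract above, the following is a minimal single-machine sketch of what is being computed. It does not implement the paper's distributed protocols. The function name and the convention used (smallest value whose cumulative weight reaches a fraction q of the total) are assumptions for illustration.

```python
def weighted_quantile(values, weights, q):
    """Return the smallest value whose cumulative weight is >= q * total weight.

    This is an exact, centralized computation; distributed settings would
    approximate it with far less communication.
    """
    if not values or len(values) != len(weights):
        raise ValueError("values and weights must be non-empty and equal in length")
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must lie in [0, 1]")

    # Sort the (value, weight) pairs by value.
    pairs = sorted(zip(values, weights))
    threshold = q * sum(w for _, w in pairs)

    running = 0.0
    for value, weight in pairs:
        running += weight
        if running >= threshold:
            return value
    return pairs[-1][0]  # Guard against floating-point round-off.

# The heavily weighted value 4 dominates the median.
print(weighted_quantile([1, 2, 3, 4], [1, 1, 1, 5], 0.5))
```

In approximate tree learning, quantiles of this kind are typically used to pick a small set of candidate split points per feature instead of testing every distinct value.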

A Compact Light Field Camera for Real-Time Depth Estimation


Title	A Compact Light Field Camera for Real-Time Depth Estimation
Authors	Yuriy Anisimov, Oliver Wasenmüller, Didier Stricker
Abstract	Depth cameras are used in many applications. Recently, light field approaches have increasingly been used for depth computation. While these approaches demonstrate technical feasibility, they cannot be brought into real-world applications, since they have both a high computation time and a large design. This paper overcomes exactly these two drawbacks. For the first time, we present a depth camera based on the light field principle that provides real-time depth information in a compact design.
Tasks	Depth Estimation
Published	2019-07-25
URL	https://arxiv.org/abs/1907.10880v1
PDF	https://arxiv.org/pdf/1907.10880v1.pdf
PWC	https://paperswithcode.com/paper/a-compact-light-field-camera-for-real-time
Repo
Framework

Learning Disentangled Representations of Satellite Image Time Series


Title	Learning Disentangled Representations of Satellite Image Time Series
Authors	Eduardo Sanchez, Mathieu Serrurier, Mathias Ortner
Abstract	In this paper, we investigate how to learn a suitable representation of satellite image time series in an unsupervised manner by leveraging large amounts of unlabeled data. Additionally, we aim to disentangle the representation of time series into two representations: a shared representation that captures the common information between the images of a time series and an exclusive representation that contains the specific information of each image of the time series. To address these issues, we propose a model that combines a novel component called cross-domain autoencoders with the variational autoencoder (VAE) and generative adversarial network (GAN) methods. In order to learn disentangled representations of time series, our model learns the multimodal image-to-image translation task. We train our model using satellite image time series from the Sentinel-2 mission. Several experiments are carried out to evaluate the obtained representations. We show that these disentangled representations can be very useful to perform multiple tasks such as image classification, image retrieval, image segmentation and change detection.
Tasks	Image Classification, Image Retrieval, Image-to-Image Translation, Semantic Segmentation, Time Series
Published	2019-03-21
URL	http://arxiv.org/abs/1903.08863v1
PDF	http://arxiv.org/pdf/1903.08863v1.pdf
PWC	https://paperswithcode.com/paper/learning-disentangled-representations-of-2
Repo
Framework

NLPExplorer: Exploring the Universe of NLP Papers


Title	NLPExplorer: Exploring the Universe of NLP Papers
Authors	Monarch Parmar, Naman Jain, Pranjali Jain, P Jayakrishna Sahit, Soham Pachpande, Shruti Singh, Mayank Singh
Abstract	Understanding the current research trends, problems, and their innovative solutions remains a bottleneck due to the ever-increasing volume of scientific articles. In this paper, we propose NLPExplorer, a completely automatic portal for indexing, searching, and visualizing Natural Language Processing (NLP) research volume. NLPExplorer presents interesting insights from papers, authors, venues, and topics. In contrast to previous topic-modelling-based approaches, we manually curate five coarse-grained non-exclusive topical categories, namely Linguistic Target (Syntax, Discourse, etc.), Tasks (Tagging, Summarization, etc.), Approaches (unsupervised, supervised, etc.), Languages (English, Chinese, etc.) and Dataset types (news, clinical notes, etc.). Some of the novel features include a list of young popular authors, popular URLs, and datasets, a list of topically diverse papers and recent popular papers. Also, it provides temporal statistics such as yearwise popularity of topics, datasets, and seminal papers. To facilitate future research and system development, we make all the processed datasets accessible through API calls. The current system is available at http://nlpexplorer.org.
Tasks
Published	2019-10-16
URL	https://arxiv.org/abs/1910.07351v1
PDF	https://arxiv.org/pdf/1910.07351v1.pdf
PWC	https://paperswithcode.com/paper/nlpexplorer-exploring-the-universe-of-nlp
Repo
Framework

The variational infomax autoencoder


Title	The variational infomax autoencoder
Authors	Vincenzo Crescimanna, Bruce Graham
Abstract	We propose the Variational InfoMax AutoEncoder (VIMAE), a method to train a generative model by maximizing the variational lower bound of the mutual information between the visible data and the hidden representation while keeping the capacity of the network bounded. In the paper we investigate the role of capacity in a neural network and deduce that a small-capacity network tends to learn a more robust and disentangled representation than a high-capacity one. These observations are confirmed by computational experiments.
Tasks
Published	2019-05-25
URL	https://arxiv.org/abs/1905.10549v1
PDF	https://arxiv.org/pdf/1905.10549v1.pdf
PWC	https://paperswithcode.com/paper/the-variational-infomax-autoencoder
Repo
Framework

Automated Analysis, Reporting, and Archiving for Robotic Nondestructive Assay of Holdup Deposits


Title	Automated Analysis, Reporting, and Archiving for Robotic Nondestructive Assay of Holdup Deposits
Authors	Heather Jones, Siri Maley, Kenji Yonekawa, Mohammadreza Mousaei, J. David Yesso, David Kohanbash, William Whittaker
Abstract	To decommission deactivated gaseous diffusion enrichment facilities, miles of contaminated pipe must be measured. The current method requires thousands of manual measurements, repeated manual data transcription, and months of manual analysis. The Pipe Crawling Activity Measurement System (PCAMS), developed by Carnegie Mellon University and in commissioning for use at the DOE Portsmouth Gaseous Diffusion Enrichment Facility, uses a robot to measure Uranium-235 from inside pipes and automatically log the data. Radiation measurements, as well as imagery, geometric modeling, and precise measurement positioning data, are digitally transferred to the PCAMS server. On the server, data can be automatically processed in minutes and summarized for analyst review. Measurement reports are auto-generated with the push of a button. A database specially configured to hold heterogeneous data such as spectra, images, and robot trajectories serves as the archive. This paper outlines the features and design of the PCAMS Post-Processing Software, currently in commissioning for use at the Portsmouth Gaseous Diffusion Enrichment Facility. The analysis process, the analyst interface to the system, and the content of auto-generated reports are each described. Example pipe-interior geometric surface models, illustrations of how key report features apply in operational runs, and user feedback are discussed.
Tasks
Published	2019-01-29
URL	http://arxiv.org/abs/1901.10795v1
PDF	http://arxiv.org/pdf/1901.10795v1.pdf
PWC	https://paperswithcode.com/paper/automated-analysis-reporting-and-archiving
Repo
Framework

Neural Module Networks for Reasoning over Text


Title	Neural Module Networks for Reasoning over Text
Authors	Nitish Gupta, Kevin Lin, Dan Roth, Sameer Singh, Matt Gardner
Abstract	Answering compositional questions that require multiple steps of reasoning against text is challenging, especially when they involve discrete, symbolic operations. Neural module networks (NMNs) learn to parse such questions as executable programs composed of learnable modules, performing well on synthetic visual QA domains. However, we find that it is challenging to learn these models for non-synthetic questions on open-domain text, where a model needs to deal with the diversity of natural language and perform a broader range of reasoning. We extend NMNs by: (a) introducing modules that reason over a paragraph of text, performing symbolic reasoning (such as arithmetic, sorting, counting) over numbers and dates in a probabilistic and differentiable manner; and (b) proposing an unsupervised auxiliary loss to help extract arguments associated with the events in text. Additionally, we show that a limited amount of heuristically-obtained question program and intermediate module output supervision provides sufficient inductive bias for accurate learning. Our proposed model significantly outperforms state-of-the-art models on a subset of the DROP dataset that poses a variety of reasoning challenges that are covered by our modules.
Tasks
Published	2019-12-10
URL	https://arxiv.org/abs/1912.04971v2
PDF	https://arxiv.org/pdf/1912.04971v2.pdf
PWC	https://paperswithcode.com/paper/neural-module-networks-for-reasoning-over-1
Repo
Framework

Overcoming Practical Issues of Deep Active Learning and its Applications on Named Entity Recognition


Title	Overcoming Practical Issues of Deep Active Learning and its Applications on Named Entity Recognition
Authors	Haw-Shiuan Chang, Shankar Vembu, Sunil Mohan, Rheeya Uppaal, Andrew McCallum
Abstract	Existing deep active learning algorithms achieve impressive sampling efficiency on natural language processing tasks. However, they exhibit several weaknesses in practice, including (a) inability to use uncertainty sampling with black-box models, (b) lack of robustness to noise in labeling, (c) lack of transparency. In response, we propose a transparent batch active sampling framework by estimating the error decay curves of multiple feature-defined subsets of the data. Experiments on four named entity recognition (NER) tasks demonstrate that the proposed methods significantly outperform diversification-based methods for black-box NER taggers and can make the sampling process more robust to labeling noise when combined with uncertainty-based methods. Furthermore, the analysis of experimental results sheds light on the weaknesses of different active sampling strategies, and when traditional uncertainty-based or diversification-based methods can be expected to work well.
Tasks	Active Learning, Named Entity Recognition
Published	2019-11-17
URL	https://arxiv.org/abs/1911.07335v1
PDF	https://arxiv.org/pdf/1911.07335v1.pdf
PWC	https://paperswithcode.com/paper/overcoming-practical-issues-of-deep-active
Repo
Framework
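
The abstract above contrasts its framework with uncertainty-based sampling. As background, the sketch below shows one common form of that baseline, entropy-based selection. It is not the error-decay-curve method the paper proposes, and the function names and toy probabilities are hypothetical.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def select_most_uncertain(predictions, batch_size):
    """Return indices of the batch_size samples with the highest entropy.

    predictions[i] is the model's class distribution for unlabeled sample i.
    """
    ranked = sorted(
        range(len(predictions)),
        key=lambda i: entropy(predictions[i]),
        reverse=True,
    )
    return ranked[:batch_size]

# Toy predictions for three unlabeled samples (illustrative values only).
toy_predictions = [[0.9, 0.1], [0.5, 0.5], [0.7, 0.3]]
print(select_most_uncertain(toy_predictions, 2))
```

This baseline needs access to the model's predicted probabilities, which is the requirement the abstract notes is not met by black-box taggers.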

Faster and Simpler SNN Simulation with Work Queues


Title	Faster and Simpler SNN Simulation with Work Queues
Authors	Dennis Bautembach, Iason Oikonomidis, Nikolaos Kyriazis, Antonis Argyros
Abstract	We present a clock-driven Spiking Neural Network simulator which is up to 3x faster than the state of the art while, at the same time, being more general and requiring less programming effort on both the user’s and maintainer’s side. This is made possible by designing our pipeline around “work queues” which act as interfaces between stages and greatly reduce implementation complexity. We evaluate our work using three well-established SNN models on a series of benchmarks.
Tasks
Published	2019-12-16
URL	https://arxiv.org/abs/1912.07423v2
PDF	https://arxiv.org/pdf/1912.07423v2.pdf
PWC	https://paperswithcode.com/paper/faster-and-simpler-snn-simulation-with-work
Repo
Framework
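
To give a rough sense of what a work queue between pipeline stages can look like in a clock-driven simulator, here is a heavily simplified sketch. It is not the authors' simulator or its API. The leaky integrate-and-fire style update, the one-step delivery delay, and every name and parameter below are assumptions for illustration.

```python
from collections import deque

def simulate(weights, external_input, steps, threshold=1.0, decay=0.9):
    """Clock-driven simulation of a tiny spiking network.

    weights[src] maps each target neuron index to a synaptic weight.
    Returns, for every time step, the list of neurons that spiked.
    """
    n = len(weights)
    potential = [0.0] * n
    pending = [0.0] * n  # Synaptic input to be applied at the next step.
    spike_log = []

    for _ in range(steps):
        spike_queue = deque()  # Work queue linking the two stages.

        # Stage 1: update every neuron and enqueue those that spike.
        for i in range(n):
            potential[i] = decay * potential[i] + external_input[i] + pending[i]
            pending[i] = 0.0
            if potential[i] >= threshold:
                potential[i] = 0.0  # Reset after a spike.
                spike_queue.append(i)
        spike_log.append(list(spike_queue))

        # Stage 2: consume the queue, touching only neurons that spiked.
        while spike_queue:
            src = spike_queue.popleft()
            for dst, w in weights[src].items():
                pending[dst] += w

    return spike_log

# Neuron 0 is driven externally and excites neuron 1.
print(simulate([{1: 1.0}, {}], [1.0, 0.0], steps=3))
```

The design point this illustrates is that the delivery stage only does work proportional to the number of spikes in the queue, not to the total number of neurons.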

Active learning in the geometric block model


Title	Active learning in the geometric block model
Authors	Eli Chien, Antonia Maria Tulino, Jaime Llorca
Abstract	The geometric block model is a recently proposed generative model for random graphs that is able to capture the inherent geometric properties of many community detection problems, providing more accurate characterizations of practical community structures compared with the popular stochastic block model. Galhotra et al. recently proposed a motif-counting algorithm for unsupervised community detection in the geometric block model that is proved to be near-optimal. They also characterized the regimes of the model parameters for which the proposed algorithm can achieve exact recovery. In this work, we initiate the study of active learning in the geometric block model. That is, we are interested in the problem of exactly recovering the community structure of random graphs following the geometric block model under arbitrary model parameters, by possibly querying the labels of a limited number of chosen nodes. We propose two active learning algorithms that combine the idea of motif-counting with two different label query policies. Our main contribution is to show that sampling the labels of a vanishingly small fraction of nodes (sub-linear in the total number of nodes) is sufficient to achieve exact recovery in the regimes under which the state-of-the-art unsupervised method fails. We validate the superior performance of our algorithms via numerical simulations on both real and synthetic datasets.
Tasks	Active Learning, Community Detection
Published	2019-11-15
URL	https://arxiv.org/abs/1912.06570v1
PDF	https://arxiv.org/pdf/1912.06570v1.pdf
PWC	https://paperswithcode.com/paper/active-learning-in-the-geometric-block-model
Repo
Framework
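
The motif-counting idea referenced in the abstract above can be sketched with the simplest motif, a triangle: for a pair of nodes, count the neighbors they share. The code below is only an illustration of that counting step. The single-threshold decision rule is a simplifying assumption, not the rule analyzed in the paper or by Galhotra et al., and the graph, names, and threshold value are hypothetical.

```python
def common_neighbor_count(adjacency, u, v):
    """Number of nodes adjacent to both u and v (triangles through the pair)."""
    return len(adjacency[u] & adjacency[v])

def same_community_guess(adjacency, u, v, threshold):
    """Guess that u and v share a community if they have enough common neighbors.

    The single threshold is a simplification made for this sketch.
    """
    return common_neighbor_count(adjacency, u, v) >= threshold

# Toy undirected graph stored as a dict of neighbor sets.
toy_graph = {
    0: {1, 2, 3},
    1: {0, 2, 3},
    2: {0, 1},
    3: {0, 1},
    4: set(),
}
print(common_neighbor_count(toy_graph, 0, 1))
print(same_community_guess(toy_graph, 0, 4, threshold=1))
```

In an active learning variant, pairs for which such a count is inconclusive are natural candidates for a label query.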

Procedural Reasoning Networks for Understanding Multimodal Procedures


Title	Procedural Reasoning Networks for Understanding Multimodal Procedures
Authors	Mustafa Sercan Amac, Semih Yagcioglu, Aykut Erdem, Erkut Erdem
Abstract	This paper addresses the problem of comprehending procedural commonsense knowledge. This is a challenging task as it requires identifying key entities, keeping track of their state changes, and understanding temporal and causal relations. Contrary to most of the previous work, in this study, we do not rely on strong inductive bias and explore the question of how multimodality can be exploited to provide a complementary semantic signal. Towards this end, we introduce a new entity-aware neural comprehension model augmented with external relational memory units. Our model learns to dynamically update entity states in relation to each other while reading the text instructions. Our experimental analysis on the visual reasoning tasks in the recently proposed RecipeQA dataset reveals that our approach improves the accuracy of the previously reported models by a large margin. Moreover, we find that our model learns effective dynamic representations of entities even though we do not use any supervision at the level of entity states.
Tasks	Visual Reasoning
Published	2019-09-19
URL	https://arxiv.org/abs/1909.08859v1
PDF	https://arxiv.org/pdf/1909.08859v1.pdf
PWC	https://paperswithcode.com/paper/procedural-reasoning-networks-for
Repo
Framework

Coincidence, Categorization, and Consolidation: Learning to Recognize Sounds with Minimal Supervision


Title	Coincidence, Categorization, and Consolidation: Learning to Recognize Sounds with Minimal Supervision
Authors	Aren Jansen, Daniel P. W. Ellis, Shawn Hershey, R. Channing Moore, Manoj Plakal, Ashok C. Popat, Rif A. Saurous
Abstract	Humans do not acquire perceptual abilities in the way we train machines. While machine learning algorithms typically operate on large collections of randomly-chosen, explicitly-labeled examples, human acquisition relies more heavily on multimodal unsupervised learning (as infants) and active learning (as children). With this motivation, we present a learning framework for sound representation and recognition that combines (i) a self-supervised objective based on a general notion of unimodal and cross-modal coincidence, (ii) a clustering objective that reflects our need to impose categorical structure on our experiences, and (iii) a cluster-based active learning procedure that solicits targeted weak supervision to consolidate categories into relevant semantic classes. By training a combined sound embedding/clustering/classification network according to these criteria, we achieve a new state-of-the-art unsupervised audio representation and demonstrate up to a 20-fold reduction in the number of labels required to reach a desired classification performance.
Tasks	Active Learning
Published	2019-11-14
URL	https://arxiv.org/abs/1911.05894v1
PDF	https://arxiv.org/pdf/1911.05894v1.pdf
PWC	https://paperswithcode.com/paper/coincidence-categorization-and-consolidation
Repo
Framework