Paper Group ANR 94
DeepMark: One-Shot Clothing Detection
Title | DeepMark: One-Shot Clothing Detection |
Authors | Alexey Sidnev, Alexey Trushkov, Maxim Kazakov, Ivan Korolev, Vladislav Sorokin |
Abstract | This paper proposes DeepMark, a one-shot approach to fast clothing detection built as a modification of the multi-target network CenterNet. It achieves state-of-the-art accuracy on the DeepFashion2 Challenge dataset: 0.723 mAP on the bounding box detection task and 0.532 mAP on the landmark detection task. The proposed architecture can be used effectively on low-power devices. |
Tasks | |
Published | 2019-10-02 |
URL | https://arxiv.org/abs/1910.01225v1, https://arxiv.org/pdf/1910.01225v1.pdf |
PWC | https://paperswithcode.com/paper/deepmark-one-shot-clothing-detection |
Repo | |
Framework | |
Balancing Domain Gap for Object Instance Detection
Title | Balancing Domain Gap for Object Instance Detection |
Authors | Woo-han Yun, Jaeyeon Lee, Jaehong Kim, Junmo Kim |
Abstract | Object instance detection in cluttered indoor environments is a core functionality for service robots. Given a large annotated dataset, a detection system can readily be built by following the recent successful strategy of deep convolutional neural networks. However, such a large dataset is hard to prepare for instance detection, where only a small number of samples are available. This is one of the main impediments to deploying an object detection system. To overcome this obstacle, many approaches to generating synthetic datasets have been proposed. These approaches confront the domain gap, or reality gap, problem, which stems from the discrepancy between the source domain (synthetic training dataset) and the target domain (real test dataset). In this paper, we propose a simple approach to generate a synthetic dataset with minimum human effort. In particular, we identify that the domain gaps of foreground and background are unbalanced and propose methods to balance these gaps. In the experiments, we verify that our methods help balance the domain gaps and improve the accuracy of object instance detection in cluttered indoor environments. |
Tasks | Object Detection |
Published | 2019-09-26 |
URL | https://arxiv.org/abs/1909.11972v1, https://arxiv.org/pdf/1909.11972v1.pdf |
PWC | https://paperswithcode.com/paper/balancing-domain-gap-for-object-instance |
Repo | |
Framework | |
Solving a Class of Non-Convex Min-Max Games Using Iterative First Order Methods
Title | Solving a Class of Non-Convex Min-Max Games Using Iterative First Order Methods |
Authors | Maher Nouiehed, Maziar Sanjabi, Tianjian Huang, Jason D. Lee, Meisam Razaviyayn |
Abstract | Recent applications in machine learning have sparked significant interest in solving min-max saddle point games. This problem has been extensively studied in the convex-concave regime, for which a global equilibrium solution can be computed efficiently. In this paper, we study the problem in the non-convex regime and show that an $\varepsilon$-first-order stationary point of the game can be computed when the objective of one of the players can be optimized to global optimality efficiently. In particular, we first consider the case where the objective of one of the players satisfies the Polyak-Łojasiewicz (PL) condition. For such a game, we show that a simple multi-step gradient descent-ascent algorithm finds an $\varepsilon$-first-order stationary point of the problem in $\widetilde{\mathcal{O}}(\varepsilon^{-2})$ iterations. Then we show that our framework can also be applied to the case where the objective of the “max-player” is concave. In this case, we propose a multi-step gradient descent-ascent algorithm that finds an $\varepsilon$-first-order stationary point of the game in $\widetilde{\mathcal{O}}(\varepsilon^{-3.5})$ iterations, which is the best known rate in the literature. We applied our algorithm to a fair classification problem on the Fashion-MNIST dataset and observed that the proposed algorithm results in smoother training and better generalization. |
Tasks | |
Published | 2019-02-21 |
URL | https://arxiv.org/abs/1902.08297v3, https://arxiv.org/pdf/1902.08297v3.pdf |
PWC | https://paperswithcode.com/paper/solving-a-class-of-non-convex-min-max-games |
Repo | |
Framework | |
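The multi-step gradient descent-ascent idea named in the abstract above can be illustrated with a minimal sketch. This is not the authors' code: the toy objective f(x, y) = y·sin(x) − 0.5·y², the step sizes, and the function names `f_grad` and `multi_step_gda` are all assumptions chosen for illustration. The objective is non-convex in x and strongly concave in y, so the inner maximization is easy to solve approximately with a few ascent steps.

```python
import math


def f_grad(x, y):
    """Gradients of the toy objective f(x, y) = y*sin(x) - 0.5*y**2 (an assumption)."""
    gx = y * math.cos(x)      # partial derivative with respect to the min-player x
    gy = math.sin(x) - y      # partial derivative with respect to the max-player y
    return gx, gy


def multi_step_gda(x0, y0, outer_iters=500, inner_steps=10, eta_x=0.1, eta_y=0.5):
    """Run several ascent steps on y for every single descent step on x."""
    x, y = x0, y0
    for _ in range(outer_iters):
        # Inner loop: approximately solve max_y f(x, y) for the current x.
        for _ in range(inner_steps):
            _, gy = f_grad(x, y)
            y += eta_y * gy
        # Outer step: one gradient descent step on x using the updated y.
        gx, _ = f_grad(x, y)
        x -= eta_x * gx
    return x, y


x_final, y_final = multi_step_gda(x0=1.0, y0=0.0)
print(x_final, y_final)
```

For this toy problem the inner maximizer is y = sin(x), so the outer loop effectively descends on g(x) = 0.5·sin²(x), and the iterates started at x = 1 approach the stationary point x = 0, y = 0.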
Communication-Efficient Weighted Sampling and Quantile Summary for GBDT
Title | Communication-Efficient Weighted Sampling and Quantile Summary for GBDT |
Authors | Ziyue Huang, Ke Yi |
Abstract | Gradient boosting decision tree (GBDT) is a powerful and widely used machine learning model that has achieved state-of-the-art performance in many academic areas and production environments. However, communication overhead is the main bottleneck in distributed training, which is needed to handle today's massive datasets. In this paper, we propose two novel communication-efficient methods over distributed datasets to mitigate this problem: a weighted sampling approach, by which the information gain can be estimated efficiently over a small subset, and distributed protocols for the weighted quantile problem used in approximate tree learning. |
Tasks | |
Published | 2019-09-17 |
URL | https://arxiv.org/abs/1909.07633v1, https://arxiv.org/pdf/1909.07633v1.pdf |
PWC | https://paperswithcode.com/paper/communication-efficient-weighted-sampling-and |
Repo | |
Framework | |
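To make the weighted quantile problem mentioned in the abstract above concrete, here is a minimal sketch of a naive baseline, not the paper's protocol: every worker ships all of its (value, weight) pairs to a coordinator, which merges them and reads off the quantile. The function names and the example data are assumptions for illustration. A communication-efficient protocol would aim to avoid exactly this full transfer.

```python
from typing import List, Tuple

Pair = Tuple[float, float]  # (feature value, non-negative weight)


def weighted_quantile(pairs: List[Pair], q: float) -> float:
    """Smallest value v whose cumulative weight (items <= v) reaches q * total weight."""
    if not pairs:
        raise ValueError("pairs must be non-empty")
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must lie in [0, 1]")
    ordered = sorted(pairs)
    total = sum(w for _, w in ordered)
    target = q * total
    cumulative = 0.0
    for value, weight in ordered:
        cumulative += weight
        if cumulative >= target:
            return value
    return ordered[-1][0]  # guard against floating-point shortfall


def naive_distributed_quantile(workers: List[List[Pair]], q: float) -> float:
    """Baseline: send every pair to the coordinator, then compute the quantile."""
    merged = [pair for worker in workers for pair in worker]
    return weighted_quantile(merged, q)


# Hypothetical data held by two workers.
workers = [[(1.0, 1.0), (4.0, 1.0)], [(2.0, 2.0), (3.0, 4.0)]]
print(naive_distributed_quantile(workers, 0.5))  # 3.0
```

The communication cost of this baseline grows linearly with the number of data points, which is the kind of overhead the abstract identifies as the bottleneck.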
A Compact Light Field Camera for Real-Time Depth Estimation
Title | A Compact Light Field Camera for Real-Time Depth Estimation |
Authors | Yuriy Anisimov, Oliver Wasenmüller, Didier Stricker |
Abstract | Depth cameras are used in many applications. Recently, light field approaches have increasingly been used for depth computation. While these approaches demonstrate technical feasibility, they cannot be brought into real-world applications, since they have both a high computation time and a large design. This paper overcomes exactly these two drawbacks. For the first time, we present a depth camera based on the light field principle that provides real-time depth information and has a compact design. |
Tasks | Depth Estimation |
Published | 2019-07-25 |
URL | https://arxiv.org/abs/1907.10880v1, https://arxiv.org/pdf/1907.10880v1.pdf |
PWC | https://paperswithcode.com/paper/a-compact-light-field-camera-for-real-time |
Repo | |
Framework | |
Learning Disentangled Representations of Satellite Image Time Series
Title | Learning Disentangled Representations of Satellite Image Time Series |
Authors | Eduardo Sanchez, Mathieu Serrurier, Mathias Ortner |
Abstract | In this paper, we investigate how to learn a suitable representation of satellite image time series in an unsupervised manner by leveraging large amounts of unlabeled data. Additionally, we aim to disentangle the representation of time series into two representations: a shared representation that captures the common information between the images of a time series and an exclusive representation that contains the specific information of each image of the time series. To address these issues, we propose a model that combines a novel component called cross-domain autoencoders with the variational autoencoder (VAE) and generative adversarial network (GAN) methods. In order to learn disentangled representations of time series, our model learns the multimodal image-to-image translation task. We train our model using satellite image time series from the Sentinel-2 mission. Several experiments are carried out to evaluate the obtained representations. We show that these disentangled representations can be very useful to perform multiple tasks such as image classification, image retrieval, image segmentation and change detection. |
Tasks | Image Classification, Image Retrieval, Image-to-Image Translation, Semantic Segmentation, Time Series |
Published | 2019-03-21 |
URL | http://arxiv.org/abs/1903.08863v1, http://arxiv.org/pdf/1903.08863v1.pdf |
PWC | https://paperswithcode.com/paper/learning-disentangled-representations-of-2 |
Repo | |
Framework | |
NLPExplorer: Exploring the Universe of NLP Papers
Title | NLPExplorer: Exploring the Universe of NLP Papers |
Authors | Monarch Parmar, Naman Jain, Pranjali Jain, P Jayakrishna Sahit, Soham Pachpande, Shruti Singh, Mayank Singh |
Abstract | Understanding the current research trends, problems, and their innovative solutions remains a bottleneck due to the ever-increasing volume of scientific articles. In this paper, we propose NLPExplorer, a completely automatic portal for indexing, searching, and visualizing Natural Language Processing (NLP) research volume. NLPExplorer presents interesting insights from papers, authors, venues, and topics. In contrast to previous topic modelling based approaches, we manually curate five coarse-grained, non-exclusive topical categories, namely Linguistic Target (Syntax, Discourse, etc.), Tasks (Tagging, Summarization, etc.), Approaches (unsupervised, supervised, etc.), Languages (English, Chinese, etc.) and Dataset types (news, clinical notes, etc.). Some of the novel features include a list of young popular authors, popular URLs, and datasets, a list of topically diverse papers and recent popular papers. Also, it provides temporal statistics such as yearwise popularity of topics, datasets, and seminal papers. To facilitate future research and system development, we make all the processed datasets accessible through API calls. The current system is available at http://nlpexplorer.org. |
Tasks | |
Published | 2019-10-16 |
URL | https://arxiv.org/abs/1910.07351v1, https://arxiv.org/pdf/1910.07351v1.pdf |
PWC | https://paperswithcode.com/paper/nlpexplorer-exploring-the-universe-of-nlp |
Repo | |
Framework | |
The variational infomax autoencoder
Title | The variational infomax autoencoder |
Authors | Vincenzo Crescimanna, Bruce Graham |
Abstract | We propose the Variational InfoMax AutoEncoder (VIMAE), a method to train a generative model by maximizing the variational lower bound of the mutual information between the visible data and the hidden representation while keeping the capacity of the network bounded. In the paper, we investigate the role of capacity in a neural network and deduce that a small-capacity network tends to learn a more robust and disentangled representation than a high-capacity one. These observations are confirmed by computational experiments. |
Tasks | |
Published | 2019-05-25 |
URL | https://arxiv.org/abs/1905.10549v1, https://arxiv.org/pdf/1905.10549v1.pdf |
PWC | https://paperswithcode.com/paper/the-variational-infomax-autoencoder |
Repo | |
Framework | |
Automated Analysis, Reporting, and Archiving for Robotic Nondestructive Assay of Holdup Deposits
Title | Automated Analysis, Reporting, and Archiving for Robotic Nondestructive Assay of Holdup Deposits |
Authors | Heather Jones, Siri Maley, Kenji Yonekawa, Mohammadreza Mousaei, J. David Yesso, David Kohanbash, William Whittaker |
Abstract | To decommission deactivated gaseous diffusion enrichment facilities, miles of contaminated pipe must be measured. The current method requires thousands of manual measurements, repeated manual data transcription, and months of manual analysis. The Pipe Crawling Activity Measurement System (PCAMS), developed by Carnegie Mellon University and in commissioning for use at the DOE Portsmouth Gaseous Diffusion Enrichment Facility, uses a robot to measure Uranium-235 from inside pipes and automatically log the data. Radiation measurements, as well as imagery, geometric modeling, and precise measurement positioning data, are digitally transferred to the PCAMS server. On the server, data can be automatically processed in minutes and summarized for analyst review. Measurement reports are auto-generated with the push of a button. A database specially configured to hold heterogeneous data such as spectra, images, and robot trajectories serves as the archive. This paper outlines the features and design of the PCAMS Post-Processing Software, currently in commissioning for use at the Portsmouth Gaseous Diffusion Enrichment Facility. The analysis process, the analyst interface to the system, and the content of auto-generated reports are each described. Example pipe-interior geometric surface models, illustrations of how key report features apply in operational runs, and user feedback are discussed. |
Tasks | |
Published | 2019-01-29 |
URL | http://arxiv.org/abs/1901.10795v1, http://arxiv.org/pdf/1901.10795v1.pdf |
PWC | https://paperswithcode.com/paper/automated-analysis-reporting-and-archiving |
Repo | |
Framework | |
Neural Module Networks for Reasoning over Text
Title | Neural Module Networks for Reasoning over Text |
Authors | Nitish Gupta, Kevin Lin, Dan Roth, Sameer Singh, Matt Gardner |
Abstract | Answering compositional questions that require multiple steps of reasoning against text is challenging, especially when they involve discrete, symbolic operations. Neural module networks (NMNs) learn to parse such questions as executable programs composed of learnable modules, performing well on synthetic visual QA domains. However, we find that it is challenging to learn these models for non-synthetic questions on open-domain text, where a model needs to deal with the diversity of natural language and perform a broader range of reasoning. We extend NMNs by: (a) introducing modules that reason over a paragraph of text, performing symbolic reasoning (such as arithmetic, sorting, counting) over numbers and dates in a probabilistic and differentiable manner; and (b) proposing an unsupervised auxiliary loss to help extract arguments associated with the events in text. Additionally, we show that a limited amount of heuristically-obtained question program and intermediate module output supervision provides sufficient inductive bias for accurate learning. Our proposed model significantly outperforms state-of-the-art models on a subset of the DROP dataset that poses a variety of reasoning challenges that are covered by our modules. |
Tasks | |
Published | 2019-12-10 |
URL | https://arxiv.org/abs/1912.04971v2, https://arxiv.org/pdf/1912.04971v2.pdf |
PWC | https://paperswithcode.com/paper/neural-module-networks-for-reasoning-over-1 |
Repo | |
Framework | |
Overcoming Practical Issues of Deep Active Learning and its Applications on Named Entity Recognition
Title | Overcoming Practical Issues of Deep Active Learning and its Applications on Named Entity Recognition |
Authors | Haw-Shiuan Chang, Shankar Vembu, Sunil Mohan, Rheeya Uppaal, Andrew McCallum |
Abstract | Existing deep active learning algorithms achieve impressive sampling efficiency on natural language processing tasks. However, they exhibit several weaknesses in practice, including (a) inability to use uncertainty sampling with black-box models, (b) lack of robustness to noise in labeling, and (c) lack of transparency. In response, we propose a transparent batch active sampling framework that estimates the error decay curves of multiple feature-defined subsets of the data. Experiments on four named entity recognition (NER) tasks demonstrate that the proposed methods significantly outperform diversification-based methods for black-box NER taggers and can make the sampling process more robust to labeling noise when combined with uncertainty-based methods. Furthermore, the analysis of experimental results sheds light on the weaknesses of different active sampling strategies and on when traditional uncertainty-based or diversification-based methods can be expected to work well. |
Tasks | Active Learning, Named Entity Recognition |
Published | 2019-11-17 |
URL | https://arxiv.org/abs/1911.07335v1, https://arxiv.org/pdf/1911.07335v1.pdf |
PWC | https://paperswithcode.com/paper/overcoming-practical-issues-of-deep-active |
Repo | |
Framework | |
Faster and Simpler SNN Simulation with Work Queues
Title | Faster and Simpler SNN Simulation with Work Queues |
Authors | Dennis Bautembach, Iason Oikonomidis, Nikolaos Kyriazis, Antonis Argyros |
Abstract | We present a clock-driven Spiking Neural Network simulator which is up to 3x faster than the state of the art while, at the same time, being more general and requiring less programming effort on both the user’s and maintainer’s side. This is made possible by designing our pipeline around “work queues” which act as interfaces between stages and greatly reduce implementation complexity. We evaluate our work using three well-established SNN models on a series of benchmarks. |
Tasks | |
Published | 2019-12-16 |
URL | https://arxiv.org/abs/1912.07423v2, https://arxiv.org/pdf/1912.07423v2.pdf |
PWC | https://paperswithcode.com/paper/faster-and-simpler-snn-simulation-with-work |
Repo | |
Framework | |
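The idea of a clock-driven simulator whose stages communicate through work queues, as described in the abstract above, can be sketched in a few lines. This is a toy CPU illustration under stated assumptions, not the authors' simulator: the leaky integrate-and-fire update, the parameters `threshold` and `leak`, the one-step synaptic delay, and the function name `simulate` are all hypothetical choices.

```python
from collections import deque


def simulate(weights, external, steps, threshold=1.0, leak=0.9):
    """Clock-driven toy SNN: stage 1 fills a queue of spiking neurons, stage 2 drains it.

    weights[i] is a list of (target, weight) synapses leaving neuron i.
    external[i] is a constant input current applied to neuron i every step.
    """
    n = len(weights)
    potential = [0.0] * n
    pending = [0.0] * n          # synaptic input to be applied at the next step
    spike_log = []
    for _ in range(steps):
        # Stage 1: update every neuron and enqueue the ones that spike.
        queue = deque()
        for i in range(n):
            potential[i] = leak * potential[i] + external[i] + pending[i]
            pending[i] = 0.0
            if potential[i] >= threshold:
                potential[i] = 0.0   # reset after a spike
                queue.append(i)
        spike_log.append(list(queue))
        # Stage 2: consume the queue and deliver each spike to its targets.
        while queue:
            source = queue.popleft()
            for target, weight in weights[source]:
                pending[target] += weight
    return spike_log


# Neuron 0 is driven externally and excites neuron 1 through one synapse.
weights = [[(1, 1.0)], []]
print(simulate(weights, external=[0.6, 0.0], steps=4))  # [[], [0], [1], [0]]
```

Only the neurons that actually spiked enter the second stage, so the queue acts as the interface between the update stage and the delivery stage.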
Active learning in the geometric block model
Title | Active learning in the geometric block model |
Authors | Eli Chien, Antonia Maria Tulino, Jaime Llorca |
Abstract | The geometric block model is a recently proposed generative model for random graphs that is able to capture the inherent geometric properties of many community detection problems, providing more accurate characterizations of practical community structures compared with the popular stochastic block model. Galhotra et al. recently proposed a motif-counting algorithm for unsupervised community detection in the geometric block model that is proved to be near-optimal. They also characterized the regimes of the model parameters for which the proposed algorithm can achieve exact recovery. In this work, we initiate the study of active learning in the geometric block model. That is, we are interested in the problem of exactly recovering the community structure of random graphs following the geometric block model under arbitrary model parameters, by possibly querying the labels of a limited number of chosen nodes. We propose two active learning algorithms that combine the idea of motif-counting with two different label query policies. Our main contribution is to show that sampling the labels of a vanishingly small fraction of nodes (sub-linear in the total number of nodes) is sufficient to achieve exact recovery in the regimes under which the state-of-the-art unsupervised method fails. We validate the superior performance of our algorithms via numerical simulations on both real and synthetic datasets. |
Tasks | Active Learning, Community Detection |
Published | 2019-11-15 |
URL | https://arxiv.org/abs/1912.06570v1, https://arxiv.org/pdf/1912.06570v1.pdf |
PWC | https://paperswithcode.com/paper/active-learning-in-the-geometric-block-model |
Repo | |
Framework | |
Procedural Reasoning Networks for Understanding Multimodal Procedures
Title | Procedural Reasoning Networks for Understanding Multimodal Procedures |
Authors | Mustafa Sercan Amac, Semih Yagcioglu, Aykut Erdem, Erkut Erdem |
Abstract | This paper addresses the problem of comprehending procedural commonsense knowledge. This is a challenging task as it requires identifying key entities, keeping track of their state changes, and understanding temporal and causal relations. Contrary to most of the previous work, in this study, we do not rely on strong inductive bias and explore the question of how multimodality can be exploited to provide a complementary semantic signal. Towards this end, we introduce a new entity-aware neural comprehension model augmented with external relational memory units. Our model learns to dynamically update entity states in relation to each other while reading the text instructions. Our experimental analysis on the visual reasoning tasks in the recently proposed RecipeQA dataset reveals that our approach improves the accuracy of the previously reported models by a large margin. Moreover, we find that our model learns effective dynamic representations of entities even though we do not use any supervision at the level of entity states. |
Tasks | Visual Reasoning |
Published | 2019-09-19 |
URL | https://arxiv.org/abs/1909.08859v1, https://arxiv.org/pdf/1909.08859v1.pdf |
PWC | https://paperswithcode.com/paper/procedural-reasoning-networks-for |
Repo | |
Framework | |
Coincidence, Categorization, and Consolidation: Learning to Recognize Sounds with Minimal Supervision
Title | Coincidence, Categorization, and Consolidation: Learning to Recognize Sounds with Minimal Supervision |
Authors | Aren Jansen, Daniel P. W. Ellis, Shawn Hershey, R. Channing Moore, Manoj Plakal, Ashok C. Popat, Rif A. Saurous |
Abstract | Humans do not acquire perceptual abilities in the way we train machines. While machine learning algorithms typically operate on large collections of randomly-chosen, explicitly-labeled examples, human acquisition relies more heavily on multimodal unsupervised learning (as infants) and active learning (as children). With this motivation, we present a learning framework for sound representation and recognition that combines (i) a self-supervised objective based on a general notion of unimodal and cross-modal coincidence, (ii) a clustering objective that reflects our need to impose categorical structure on our experiences, and (iii) a cluster-based active learning procedure that solicits targeted weak supervision to consolidate categories into relevant semantic classes. By training a combined sound embedding/clustering/classification network according to these criteria, we achieve a new state-of-the-art unsupervised audio representation and demonstrate up to a 20-fold reduction in the number of labels required to reach a desired classification performance. |
Tasks | Active Learning |
Published | 2019-11-14 |
URL | https://arxiv.org/abs/1911.05894v1, https://arxiv.org/pdf/1911.05894v1.pdf |
PWC | https://paperswithcode.com/paper/coincidence-categorization-and-consolidation |
Repo | |
Framework | |