Paper Group ANR 1427
Object-Aware Instance Labeling for Weakly Supervised Object Detection. Large e-retailer image dataset for visual search and product classification. Accurate Global Trajectory Alignment using Poles and Road Markings. Throughput Prediction of Asynchronous SGD in TensorFlow. Real-World Image Datasets for Federated Learning. RevealNet: Seeing Behind Ob …
Object-Aware Instance Labeling for Weakly Supervised Object Detection
Title | Object-Aware Instance Labeling for Weakly Supervised Object Detection |
Authors | Satoshi Kosugi, Toshihiko Yamasaki, Kiyoharu Aizawa |
Abstract | Weakly supervised object detection (WSOD), where a detector is trained with only image-level annotations, is attracting more and more attention. As a method to obtain a well-performing detector, the detector and the instance labels are updated iteratively. In this study, for more efficient iterative updating, we focus on the instance labeling problem, that is, the problem of deciding which label should be assigned to each region based on the last localization result. Instead of simply labeling the top-scoring region and its highly overlapping regions as positive and others as negative, we propose more effective instance labeling methods as follows. First, to solve the problem that regions covering only some parts of the object tend to be labeled as positive, we find regions covering the whole object by focusing on the context classification loss. Second, considering the situation where the other objects contained in the image can be labeled as negative, we impose a spatial restriction on regions labeled as negative. Using these instance labeling methods, we train the detector on PASCAL VOC 2007 and 2012 and obtain significantly improved results compared with other state-of-the-art approaches. |
Tasks | Object Detection, Weakly Supervised Object Detection |
Published | 2019-08-10 |
URL | https://arxiv.org/abs/1908.03792v1 |
https://arxiv.org/pdf/1908.03792v1.pdf | |
PWC | https://paperswithcode.com/paper/object-aware-instance-labeling-for-weakly |
Repo | |
Framework | |
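The abstract above contrasts its proposal with a simple baseline: label the top-scoring region and the regions that overlap it heavily as positive, and everything else as negative. The sketch below illustrates only that baseline, not the authors' method. The function names, the `(x1, y1, x2, y2)` box format, and the 0.5 IoU threshold are assumptions made for illustration.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # assumed format: (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def baseline_instance_labels(boxes: List[Box], scores: List[float],
                             pos_iou: float = 0.5) -> List[int]:
    """Label the top-scoring box and its high-overlap boxes 1, the rest 0.

    pos_iou is a hypothetical threshold, not a value taken from the paper.
    """
    top = max(range(len(scores)), key=scores.__getitem__)
    return [1 if iou(boxes[top], b) >= pos_iou else 0 for b in boxes]


boxes = [(0, 0, 10, 10), (1, 1, 10, 10), (20, 20, 30, 30)]
scores = [0.9, 0.5, 0.8]
labels = baseline_instance_labels(boxes, scores)
```

In this toy input the third box is a second, well-scored object, yet the baseline labels it negative. That is the situation the abstract's spatial restriction on negative regions is meant to address.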
Large e-retailer image dataset for visual search and product classification
Title | Large e-retailer image dataset for visual search and product classification |
Authors | Arnaud Bellétoile |
Abstract | Recent results of deep convolutional networks in visual recognition challenges open the path to a whole new set of disruptive user experiences such as visual search or recommendation. The list of companies offering this type of service is growing every day, but the adoption rate and the relevancy of results may vary a lot. We believe that the availability of large and diverse datasets is a necessary condition to improve the relevancy of such recommendation systems and facilitate their adoption. For that purpose, we wish to share with the community this dataset of more than 12M images of the 7M products of our online store classified into 5K categories. This original dataset is introduced in this article and several features are described. We also present some aspects of the winning solutions of our image classification challenge that was organized on the Kaggle platform around this set of images. |
Tasks | Image Classification, Recommendation Systems |
Published | 2019-09-18 |
URL | https://arxiv.org/abs/1909.08612v1 |
https://arxiv.org/pdf/1909.08612v1.pdf | |
PWC | https://paperswithcode.com/paper/large-e-retailer-image-dataset-for-visual |
Repo | |
Framework | |
Accurate Global Trajectory Alignment using Poles and Road Markings
Title | Accurate Global Trajectory Alignment using Poles and Road Markings |
Authors | Haohao Hu, Marc Sons, Christoph Stiller |
Abstract | Currently, digital maps are indispensable for automated driving. However, due to the low precision and reliability of GNSS, particularly in urban areas, fusing trajectories of independent recording sessions and different regions is a challenging task. To bypass the flaws from direct incorporation of GNSS measurements for geo-referencing, the usage of aerial imagery seems promising. Furthermore, more accurate geo-referencing improves the global map accuracy and allows the sensor calibration error to be estimated. In this paper, we present a novel geo-referencing approach to align trajectories to aerial imagery using poles and road markings. To match extracted features from sensor observations to aerial imagery landmarks robustly, a RANSAC-based matching approach is applied in a sliding window. For that, we assume that the trajectories are roughly referenced to the imagery, which can be achieved by rough GNSS measurements from a low-cost GNSS receiver. Finally, we align the initial trajectories precisely to the aerial imagery by minimizing a geometric cost function comprising all determined matches. Evaluations performed on data recorded in Karlsruhe, Germany show that our algorithm yields trajectories which are accurately referenced to the used aerial imagery. |
Tasks | Calibration |
Published | 2019-03-25 |
URL | http://arxiv.org/abs/1903.10205v1 |
http://arxiv.org/pdf/1903.10205v1.pdf | |
PWC | https://paperswithcode.com/paper/accurate-global-trajectory-alignment-using |
Repo | |
Framework | |
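The abstract above relies on RANSAC to match sensor features to aerial-imagery landmarks robustly. As a rough illustration of the RANSAC idea only, the sketch below estimates a 2D translation from candidate point matches that include an outlier. Restricting the model to a pure translation, the function name, the tolerance, and the iteration count are all simplifying assumptions. The paper itself works in a sliding window and then minimizes a geometric cost function, which this sketch does not attempt.

```python
import math
import random
from typing import List, Tuple

Point = Tuple[float, float]
Match = Tuple[Point, Point]  # (sensor feature, candidate aerial landmark)


def ransac_translation(matches: List[Match], tol: float = 0.5,
                       iters: int = 100, seed: int = 0):
    """Estimate a 2D offset (dx, dy) that maps sensor points onto landmarks.

    Assumption: a translation-only model, used here purely for illustration.
    """
    if not matches:
        raise ValueError("need at least one candidate match")
    rng = random.Random(seed)
    best: List[Match] = []
    for _ in range(iters):
        # Minimal sample: one match fully determines a translation.
        (sx, sy), (lx, ly) = rng.choice(matches)
        dx, dy = lx - sx, ly - sy
        # Consensus set: matches whose residual under this offset is small.
        inliers = [
            (s, l) for s, l in matches
            if math.hypot(s[0] + dx - l[0], s[1] + dy - l[1]) <= tol
        ]
        if len(inliers) > len(best):
            best = inliers
    # Refine the offset as the mean over the largest consensus set.
    n = len(best)
    dx = sum(l[0] - s[0] for s, l in best) / n
    dy = sum(l[1] - s[1] for s, l in best) / n
    return (dx, dy), best


matches = [
    ((0.0, 0.0), (2.0, 1.0)),
    ((1.0, 0.0), (3.0, 1.0)),
    ((0.0, 1.0), (2.0, 2.0)),
    ((5.0, 5.0), (7.0, 6.0)),
    ((3.0, 3.0), (10.0, -4.0)),  # wrong association (outlier)
]
offset, inliers = ransac_translation(matches)
```

The wrong association is excluded from the consensus set, so the refined offset depends only on the four consistent matches.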
Throughput Prediction of Asynchronous SGD in TensorFlow
Title | Throughput Prediction of Asynchronous SGD in TensorFlow |
Authors | Zhuojin Li, Wumo Yan, Marco Paolieri, Leana Golubchik |
Abstract | Modern machine learning frameworks can train neural networks using multiple nodes in parallel, each computing parameter updates with stochastic gradient descent (SGD) and sharing them asynchronously through a central parameter server. Due to communication overhead and bottlenecks, the total throughput of SGD updates in a cluster scales sublinearly, saturating as the number of nodes increases. In this paper, we present a solution to predicting training throughput from profiling traces collected from a single-node configuration. Our approach is able to model the interaction of multiple nodes and the scheduling of concurrent transmissions between the parameter server and each node. By accounting for the dependencies between received parts and pending computations, we predict overlaps between computation and communication and generate synthetic execution traces for configurations with multiple nodes. We validate our approach on TensorFlow training jobs for popular image classification neural networks, on AWS and on our in-house cluster, using nodes equipped with GPUs or only with CPUs. We also investigate the effects of data transmission policies used in TensorFlow and the accuracy of our approach when combined with optimizations of the transmission schedule. |
Tasks | Image Classification |
Published | 2019-11-12 |
URL | https://arxiv.org/abs/1911.04650v2 |
https://arxiv.org/pdf/1911.04650v2.pdf | |
PWC | https://paperswithcode.com/paper/throughput-prediction-of-asynchronous-sgd-in |
Repo | |
Framework | |
Real-World Image Datasets for Federated Learning
Title | Real-World Image Datasets for Federated Learning |
Authors | Jiahuan Luo, Xueyang Wu, Yun Luo, Anbu Huang, Yunfeng Huang, Yang Liu, Qiang Yang |
Abstract | Federated learning is a new machine learning paradigm which allows data parties to build machine learning models collaboratively while keeping their data secure and private. While research efforts on federated learning have been growing tremendously in the past two years, most existing works still depend on pre-existing public datasets and artificial partitions to simulate data federations due to the lack of high-quality labeled data generated from real-world edge applications. Consequently, advances on benchmark and model evaluations for federated learning have been lagging behind. In this paper, we introduce a real-world image dataset. The dataset contains more than 900 images generated from 26 street cameras and 7 object categories annotated with detailed bounding boxes. The data distribution is non-IID and unbalanced, reflecting the characteristics of real-world federated learning scenarios. Based on this dataset, we implemented two mainstream object detection algorithms (YOLO and Faster R-CNN) and provided an extensive benchmark on model performance, efficiency, and communication in a federated learning setting. Both the dataset and algorithms are made publicly available. |
Tasks | Object Detection |
Published | 2019-10-14 |
URL | https://arxiv.org/abs/1910.11089v2 |
https://arxiv.org/pdf/1910.11089v2.pdf | |
PWC | https://paperswithcode.com/paper/real-world-image-datasets-for-federated |
Repo | |
Framework | |
RevealNet: Seeing Behind Objects in RGB-D Scans
Title | RevealNet: Seeing Behind Objects in RGB-D Scans |
Authors | Ji Hou, Angela Dai, Matthias Nießner |
Abstract | During 3D reconstruction, it is often the case that people cannot scan each individual object from all views, resulting in missing geometry in the captured scan. This missing geometry can be fundamentally limiting for many applications, e.g., a robot needs to know the unseen geometry to perform a precise grasp on an object. Thus, we introduce the task of semantic instance completion: from an incomplete RGB-D scan of a scene, we aim to detect the individual object instances and infer their complete object geometry. This will open up new possibilities for interactions with objects in a scene, for instance for virtual or robotic agents. We tackle this problem by introducing RevealNet, a new data-driven approach that jointly detects object instances and predicts their complete geometry. This enables a semantically meaningful decomposition of a scanned scene into individual, complete 3D objects, including hidden and unobserved object parts. RevealNet is an end-to-end 3D neural network architecture that leverages joint color and geometry feature learning. The fully-convolutional nature of our 3D network enables efficient inference of semantic instance completion for 3D scans at scale of large indoor environments in a single forward pass. We show that predicting complete object geometry improves both 3D detection and instance segmentation performance. We evaluate on both real and synthetic scan benchmark data for the new task, where we outperform state-of-the-art approaches by over 15 in mAP@0.5 on ScanNet, and over 18 in mAP@0.5 on SUNCG. |
Tasks | 3D Reconstruction, 3D Semantic Instance Segmentation, Instance Segmentation, Semantic Segmentation |
Published | 2019-04-26 |
URL | https://arxiv.org/abs/1904.12012v3 |
https://arxiv.org/pdf/1904.12012v3.pdf | |
PWC | https://paperswithcode.com/paper/3d-sic-3d-semantic-instance-completion-for |
Repo | |
Framework | |
ON-TRAC Consortium End-to-End Speech Translation Systems for the IWSLT 2019 Shared Task
Title | ON-TRAC Consortium End-to-End Speech Translation Systems for the IWSLT 2019 Shared Task |
Authors | Ha Nguyen, Natalia Tomashenko, Marcely Zanon Boito, Antoine Caubriere, Fethi Bougares, Mickael Rouvier, Laurent Besacier, Yannick Esteve |
Abstract | This paper describes the ON-TRAC Consortium translation systems developed for the end-to-end model task of IWSLT Evaluation 2019 for the English-to-Portuguese language pair. ON-TRAC Consortium is composed of researchers from three French academic laboratories: LIA (Avignon Université), LIG (Université Grenoble Alpes), and LIUM (Le Mans Université). A single end-to-end model built as a neural encoder-decoder architecture with attention mechanism was used for two primary submissions corresponding to the two EN-PT evaluation sets: (1) TED (MuST-C) and (2) How2. In this paper, we notably investigate the impact of pooling heterogeneous corpora for training, the impact of target tokenization (characters or BPEs), and the impact of speech input segmentation, and we also compare our best end-to-end model (BLEU of 26.91 on MuST-C and 43.82 on How2 validation sets) to a pipeline (ASR+MT) approach. |
Tasks | Tokenization |
Published | 2019-10-30 |
URL | https://arxiv.org/abs/1910.13689v1 |
https://arxiv.org/pdf/1910.13689v1.pdf | |
PWC | https://paperswithcode.com/paper/on-trac-consortium-end-to-end-speech |
Repo | |
Framework | |
Comprehensive decision-strategy space exploration for efficient territorial planning strategies
Title | Comprehensive decision-strategy space exploration for efficient territorial planning strategies |
Authors | Olivier Billaud, Maxence Soubeyrand, Sandra Luque, Maxime Lenormand |
Abstract | Multi-Criteria Decision Analysis (MCDA) is a well-known decision support tool that can be used in a wide variety of contexts. It is particularly useful for territorial planning in situations where several actors with different, and sometimes contradictory, points of view have to make a decision regarding land use development. While the impact of the weights used to represent the relative importance of criteria has been widely studied in the recent literature, the impact of order weights determination has rarely been investigated. This paper presents a spatial sensitivity analysis to assess the impact of order weights determination in Multi-Criteria Analysis by Ordered Weighted Averaging. We propose a methodology based on an efficient exploration of the decision-strategy space defined by the level of risk and trade-off in the decision process. We illustrate our approach with a land use planning process in the South of France. The objective is to find suitable areas for urban development while preserving green areas and their associated ecosystem services. The ecosystem service approach indeed has the potential to widen the scope of traditional landscape-ecological planning by including ecosystem-based benefits, including social and economic benefits, green infrastructures and biophysical parameters in urban and territorial planning. We show that in this particular case the decision-strategy space can be divided into four clusters. Each of them is associated with a map summarizing the average spatial suitability distribution used to identify potential areas for urban development. We finally demonstrate the pertinence of a spatial variance within-cluster analysis to disentangle the relationship between risk and trade-off values. |
Tasks | Efficient Exploration |
Published | 2019-11-26 |
URL | https://arxiv.org/abs/1911.11460v1 |
https://arxiv.org/pdf/1911.11460v1.pdf | |
PWC | https://paperswithcode.com/paper/comprehensive-decision-strategy-space |
Repo | |
Framework | |
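The abstract above centers on Ordered Weighted Averaging (OWA), where order weights are applied to criterion values after sorting them, and on a decision-strategy space spanned by risk and trade-off. The sketch below shows the standard OWA operator together with Yager's orness measure and a commonly used dispersion-based trade-off measure. Whether the paper uses exactly these definitions of risk and trade-off is an assumption, and the function names are illustrative.

```python
import math
from typing import Sequence


def _check_weights(weights: Sequence[float]) -> None:
    if len(weights) < 2:
        raise ValueError("need at least two order weights")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("order weights must sum to 1")


def owa(values: Sequence[float], order_weights: Sequence[float]) -> float:
    """OWA: weight j applies to the j-th largest value, not to a criterion."""
    _check_weights(order_weights)
    if len(values) != len(order_weights):
        raise ValueError("values and order weights must have equal length")
    ranked = sorted(values, reverse=True)
    return sum(w * v for w, v in zip(order_weights, ranked))


def orness(order_weights: Sequence[float]) -> float:
    """Yager's orness: 1 for the max operator, 0 for min, 0.5 for the mean."""
    _check_weights(order_weights)
    n = len(order_weights)
    return sum((n - j) * w for j, w in enumerate(order_weights, start=1)) / (n - 1)


def tradeoff(order_weights: Sequence[float]) -> float:
    """Dispersion-based trade-off (assumed definition): 1 for uniform weights."""
    _check_weights(order_weights)
    n = len(order_weights)
    spread = sum((w - 1.0 / n) ** 2 for w in order_weights)
    return 1.0 - math.sqrt(n * spread / (n - 1))


suitability = [0.2, 0.9, 0.5]             # hypothetical criterion scores for one cell
optimistic = owa(suitability, [1, 0, 0])  # behaves like max
pessimistic = owa(suitability, [0, 0, 1]) # behaves like min
balanced = owa(suitability, [1 / 3, 1 / 3, 1 / 3])
```

Sweeping the order weights moves a cell's suitability between its worst and best criterion, which is why the choice of order weights can change the resulting suitability map.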
Assessing the Frontier: Active Learning, Model Accuracy, and Multi-objective Materials Discovery and Optimization
Title | Assessing the Frontier: Active Learning, Model Accuracy, and Multi-objective Materials Discovery and Optimization |
Authors | Zachary del Rosario, Matthias Rupp, Yoolhee Kim, Erin Antono, Julia Ling |
Abstract | Discovering novel materials can be greatly accelerated by iterative machine learning-informed proposal of candidates (active learning). However, standard global-scope error metrics for model quality are not predictive of discovery performance, and can be misleading. We introduce the notion of Pareto shell-scope error to help judge the suitability of a model for proposing material candidates. Further, through synthetic cases and a thermoelectric dataset, we probe the relation between acquisition function fidelity and active learning performance. Results suggest novel diagnostic tools, as well as new insights for acquisition function design. |
Tasks | Active Learning |
Published | 2019-11-06 |
URL | https://arxiv.org/abs/1911.03224v3 |
https://arxiv.org/pdf/1911.03224v3.pdf | |
PWC | https://paperswithcode.com/paper/assessing-the-frontier-active-learning-model |
Repo | |
Framework | |
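The abstract above evaluates models with respect to the Pareto frontier of a multi-objective problem. As background only, the sketch below extracts the non-dominated (Pareto-optimal) candidates from a small set, assuming every objective is to be maximized. It does not compute the paper's Pareto shell-scope error, and the names and data are illustrative.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is at least as good as b everywhere and better somewhere."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def pareto_front(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Return the points that no other point dominates (maximization)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical candidates scored on two objectives (higher is better).
candidates = [(1, 5), (2, 4), (3, 3), (2, 2), (1, 1)]
front = pareto_front(candidates)
```

This quadratic-time filter is adequate for small candidate sets. Larger sets would typically call for a sort-based method.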
On the Robustness of Deep Learning-predicted Contention Models for Network Calculus
Title | On the Robustness of Deep Learning-predicted Contention Models for Network Calculus |
Authors | Fabien Geyer, Steffen Bondorf |
Abstract | The network calculus (NC) analysis takes a simple model consisting of a network of schedulers and data flows crossing them. A number of analysis “building blocks” can then be applied to capture the model without imposing pessimistic assumptions like self-contention on tandems of servers. Yet, adding pessimism cannot always be avoided. To compute the best bound on a single flow’s end-to-end delay thus boils down to finding the least pessimistic contention models for all tandems of schedulers in the network - and an exhaustive search can easily become a very resource intensive task. The literature proposes a promising solution to this dilemma: a heuristic making use of machine learning (ML) predictions inside the NC analysis. While results of this work are promising in terms of delay bound quality and computational effort, there is little to no insight on why a prediction is made or if the trained machine can achieve similarly striking results in networks vastly differing from its training data. In this paper we address these pending questions. We evaluate the influence of the training data and its features on accuracy, impact and scalability. Additionally, we contribute an extension of the method by predicting the best $n$ contention model alternatives in order to achieve increased robustness for its application outside the training data. Our numerical evaluation shows that good accuracy can still be achieved on large networks although we restrict the training to networks that are two orders of magnitude smaller. |
Tasks | |
Published | 2019-11-24 |
URL | https://arxiv.org/abs/1911.10522v1 |
https://arxiv.org/pdf/1911.10522v1.pdf | |
PWC | https://paperswithcode.com/paper/on-the-robustness-of-deep-learning-predicted |
Repo | |
Framework | |
Domain Generalization via Multidomain Discriminant Analysis
Title | Domain Generalization via Multidomain Discriminant Analysis |
Authors | Shoubo Hu, Kun Zhang, Zhitang Chen, Laiwan Chan |
Abstract | Domain generalization (DG) aims to incorporate knowledge from multiple source domains into a single model that could generalize well on unseen target domains. This problem is ubiquitous in practice since the distributions of the target data may rarely be identical to those of the source data. In this paper, we propose Multidomain Discriminant Analysis (MDA) to address DG of classification tasks in general situations. MDA learns a domain-invariant feature transformation that aims to achieve appealing properties, including a minimal divergence among domains within each class, a maximal separability among classes, and overall maximal compactness of all classes. Furthermore, we provide the bounds on excess risk and generalization error by learning theory analysis. Comprehensive experiments on synthetic and real benchmark datasets demonstrate the effectiveness of MDA. |
Tasks | Domain Generalization |
Published | 2019-07-25 |
URL | https://arxiv.org/abs/1907.11216v1 |
https://arxiv.org/pdf/1907.11216v1.pdf | |
PWC | https://paperswithcode.com/paper/domain-generalization-via-multidomain |
Repo | |
Framework | |
Show Me Your Account: Detecting MMORPG Game Bot Leveraging Financial Analysis with LSTM
Title | Show Me Your Account: Detecting MMORPG Game Bot Leveraging Financial Analysis with LSTM |
Authors | Kyung Ho Park, Eunjo Lee, Huy Kang Kim |
Abstract | With the rapid growth of the MMORPG market, game bot detection has become an essential task for maintaining a stable in-game ecosystem. To distinguish bots from normal users, detection methods have been proposed on both the game client and the server side. Among the various classification methods, server-side data mining captures the unique characteristics of bots efficiently. For the features used in data mining, the behavioral and social actions of characters are analyzed with numerous algorithms. However, bot developers can evade previous detection methods by continuously changing the bot's activities. Eventually, the overall maintenance cost increases because the selected features need to be updated as the bot's behavior changes. To overcome this limitation, we propose an improved bot detection method based on financial analysis. As a bot's activity necessarily changes its financial status, analyzing financial fluctuations captures bots effectively as a key feature. We trained and tested the model with actual data from Aion, a leading MMORPG in Asia. Leveraging the fact that an LSTM efficiently recognizes time-series movements in data, we achieved meaningful detection performance. Building on this model, we expect a sustainable bot detection system in the near future. |
Tasks | Time Series |
Published | 2019-08-10 |
URL | https://arxiv.org/abs/1908.03748v1 |
https://arxiv.org/pdf/1908.03748v1.pdf | |
PWC | https://paperswithcode.com/paper/show-me-your-account-detecting-mmorpg-game |
Repo | |
Framework | |
Heterogeneous Stochastic Interactions for Multiple Agents in a Multi-armed Bandit Problem
Title | Heterogeneous Stochastic Interactions for Multiple Agents in a Multi-armed Bandit Problem |
Authors | Udari Madhushani, Naomi Ehrich Leonard |
Abstract | We define and analyze a multi-agent multi-armed bandit problem in which decision-making agents can observe the choices and rewards of their neighbors. Neighbors are defined by a network graph with heterogeneous and stochastic interconnections. These interactions are determined by the sociability of each agent, which corresponds to the probability that the agent observes its neighbors. We design an algorithm for each agent to maximize its own expected cumulative reward and prove performance bounds that depend on the sociability of the agents and the network structure. We use the bounds to predict the rank ordering of agents according to their performance and verify the accuracy analytically and computationally. |
Tasks | Decision Making |
Published | 2019-05-21 |
URL | https://arxiv.org/abs/1905.08731v1 |
https://arxiv.org/pdf/1905.08731v1.pdf | |
PWC | https://paperswithcode.com/paper/heterogeneous-stochastic-interactions-for |
Repo | |
Framework | |
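The abstract above describes agents in a multi-armed bandit that observe their neighbors' choices and rewards with a probability given by each agent's sociability. The simulation below is a minimal sketch of that setting, not the authors' algorithm. It assumes a fully connected network, Bernoulli arms, and a standard UCB1 index computed over all samples an agent has observed. The function name and parameters are hypothetical.

```python
import math
import random
from typing import List, Tuple


def simulate(means: List[float], sociability: List[float],
             horizon: int, seed: int = 0) -> Tuple[List[float], List[List[int]]]:
    """Run UCB1 agents that share observations stochastically.

    Assumptions: complete graph, Bernoulli rewards, UCB1 exploration bonus.
    Returns each agent's own cumulative reward and its per-arm sample counts.
    """
    rng = random.Random(seed)
    n_agents, n_arms = len(sociability), len(means)
    counts = [[0] * n_arms for _ in range(n_agents)]
    sums = [[0.0] * n_arms for _ in range(n_agents)]
    own_reward = [0.0] * n_agents

    for t in range(1, horizon + 1):
        picks = []
        for k in range(n_agents):
            untried = [a for a in range(n_arms) if counts[k][a] == 0]
            if untried:
                arm = untried[0]  # sample every arm at least once
            else:
                arm = max(
                    range(n_arms),
                    key=lambda a: sums[k][a] / counts[k][a]
                    + math.sqrt(2.0 * math.log(t) / counts[k][a]),
                )
            reward = 1.0 if rng.random() < means[arm] else 0.0
            picks.append((arm, reward))
            own_reward[k] += reward

        # Each agent always sees its own pull and sees each neighbor's pull
        # with probability equal to its own sociability.
        for k in range(n_agents):
            for j, (arm, reward) in enumerate(picks):
                if j == k or rng.random() < sociability[k]:
                    counts[k][arm] += 1
                    sums[k][arm] += reward
    return own_reward, counts


rewards, counts = simulate(means=[0.2, 0.8], sociability=[0.0, 1.0],
                           horizon=200, seed=1)
```

With these settings the first agent learns only from its own 200 pulls, while the second also observes every pull of the first and therefore accumulates twice as many samples.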
A Stochastic Tensor Method for Non-convex Optimization
Title | A Stochastic Tensor Method for Non-convex Optimization |
Authors | Aurelien Lucchi, Jonas Kohler |
Abstract | We present a stochastic optimization method that uses a fourth-order regularized model to find local minima of smooth and potentially non-convex objective functions. This algorithm uses sub-sampled derivatives instead of exact quantities and its implementation relies on tensor-vector products only. The proposed approach is shown to find an $(\epsilon_1,\epsilon_2)$-second-order critical point in at most $\mathcal{O}\left(\max\left(\epsilon_1^{-4/3}, \epsilon_2^{-2}\right)\right)$ iterations, thereby matching the rate of deterministic approaches. Furthermore, we discuss a practical implementation of this approach for objective functions with a finite-sum structure, as well as characterize the total computational complexity, for both sampling with and without replacement. Finally, we identify promising directions of future research to further improve the complexity of the discussed algorithm. |
Tasks | Stochastic Optimization |
Published | 2019-11-23 |
URL | https://arxiv.org/abs/1911.10367v1 |
https://arxiv.org/pdf/1911.10367v1.pdf | |
PWC | https://paperswithcode.com/paper/a-stochastic-tensor-method-for-non-convex |
Repo | |
Framework | |
Low-Shot Learning from Imaginary 3D Model
Title | Low-Shot Learning from Imaginary 3D Model |
Authors | Frederik Pahde, Mihai Puscas, Jannik Wolff, Tassilo Klein, Nicu Sebe, Moin Nabi |
Abstract | Since the advent of deep learning, neural networks have demonstrated remarkable results in many visual recognition tasks, constantly pushing the limits. However, the state-of-the-art approaches are largely unsuitable in scarce data regimes. To address this shortcoming, this paper proposes employing a 3D model, which is derived from training images. Such a model can then be used to hallucinate novel viewpoints and poses for the scarce samples of the few-shot learning scenario. A self-paced learning approach allows for the selection of a diverse set of high-quality images, which facilitates the training of a classifier. The performance of the proposed approach is showcased on the fine-grained CUB-200-2011 dataset in a few-shot setting and significantly improves our baseline accuracy. |
Tasks | Few-Shot Learning |
Published | 2019-01-04 |
URL | http://arxiv.org/abs/1901.01868v1 |
http://arxiv.org/pdf/1901.01868v1.pdf | |
PWC | https://paperswithcode.com/paper/low-shot-learning-from-imaginary-3d-model |
Repo | |
Framework | |