Paper Group ANR 419
Proper Learning of Linear Dynamical Systems as a Non-Commutative Polynomial Optimisation Problem. EGO-CH: Dataset and Fundamental Tasks for Visitors Behavioral Understanding using Egocentric Vision. SAPIEN: A SimulAted Part-based Interactive ENvironment. A Price-Per-Attention Auction Scheme Using Mouse Cursor Information. Citation Data of Czech Apex …
Proper Learning of Linear Dynamical Systems as a Non-Commutative Polynomial Optimisation Problem
Title | Proper Learning of Linear Dynamical Systems as a Non-Commutative Polynomial Optimisation Problem |
Authors | Quan Zhou, Jakub Marecek |
Abstract | There has been much recent progress in forecasting the next observation of a linear dynamical system (LDS), which is known as the improper learning, as well as in the estimation of its system matrices, which is known as the proper learning of LDS. We present an approach to proper learning of LDS, which in spite of the non-convexity of the problem, guarantees global convergence of numerical solutions to a least-squares estimator. We present promising computational results. |
Tasks | |
Published | 2020-02-04 |
URL | https://arxiv.org/abs/2002.01444v2 |
https://arxiv.org/pdf/2002.01444v2.pdf | |
PWC | https://paperswithcode.com/paper/proper-learning-of-linear-dynamical-systems |
Repo | |
Framework | |
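To make the terms in the abstract above concrete, the block below writes a linear dynamical system in a common state-space form together with one possible least-squares objective. The symbols ($G$, $F$, $\phi_t$, $\omega_t$, $\nu_t$) and the exact form of the objective are assumptions chosen for illustration; the abstract does not state them, and the paper's own formulation may differ.

```latex
% Generic LDS in state-space form (notation assumed for illustration)
\begin{align}
  \phi_t &= G\,\phi_{t-1} + \omega_t, \\
  Y_t    &= F^{\top}\phi_t + \nu_t,
\end{align}
% One possible least-squares objective over the unknown matrices and hidden states
\begin{equation}
  \min_{F,\,G,\,\phi_0,\dots,\phi_T}\;
  \sum_{t=1}^{T} \bigl\lVert Y_t - F^{\top}\phi_t \bigr\rVert^2
  + \sum_{t=1}^{T} \bigl\lVert \phi_t - G\,\phi_{t-1} \bigr\rVert^2 .
\end{equation}
```

In this sketch the products $G\,\phi_{t-1}$ and $F^{\top}\phi_t$ multiply unknowns together, which is one way to see why proper learning (estimating $F$ and $G$) is non-convex, while improper learning only needs to forecast $Y_t$.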
EGO-CH: Dataset and Fundamental Tasks for Visitors Behavioral Understanding using Egocentric Vision
Title | EGO-CH: Dataset and Fundamental Tasks for Visitors Behavioral Understanding using Egocentric Vision |
Authors | Francesco Ragusa, Antonino Furnari, Sebastiano Battiato, Giovanni Signorello, Giovanni Maria Farinella |
Abstract | Equipping visitors of a cultural site with a wearable device makes it easy to collect information about their preferences, which can be exploited to improve the fruition of cultural goods with augmented reality. Moreover, egocentric video can be processed using computer vision and machine learning to enable an automated analysis of visitors’ behavior. The inferred information can be used both online to assist the visitor and offline to support the manager of the site. Despite the positive impact such technologies can have in cultural heritage, the topic is currently understudied due to the limited number of public datasets suitable for studying the considered problems. To address this issue, in this paper we propose EGOcentric-Cultural Heritage (EGO-CH), the first dataset of egocentric videos for visitors’ behavior understanding in cultural sites. The dataset has been collected in two cultural sites and includes more than 27 hours of video acquired by 70 subjects, with labels for 26 environments and over 200 different Points of Interest. A large subset of the dataset, consisting of 60 videos, is associated with surveys filled out by real visitors. To encourage research on the topic, we propose 4 challenging tasks (room-based localization, point of interest/object recognition, object retrieval and survey prediction) useful for understanding visitors’ behavior and report baseline results on the dataset. |
Tasks | Object Recognition |
Published | 2020-02-03 |
URL | https://arxiv.org/abs/2002.00899v1 |
https://arxiv.org/pdf/2002.00899v1.pdf | |
PWC | https://paperswithcode.com/paper/ego-ch-dataset-and-fundamental-tasks-for |
Repo | |
Framework | |
SAPIEN: A SimulAted Part-based Interactive ENvironment
Title | SAPIEN: A SimulAted Part-based Interactive ENvironment |
Authors | Fanbo Xiang, Yuzhe Qin, Kaichun Mo, Yikuan Xia, Hao Zhu, Fangchen Liu, Minghua Liu, Hanxiao Jiang, Yifu Yuan, He Wang, Li Yi, Angel X. Chang, Leonidas J. Guibas, Hao Su |
Abstract | Building home assistant robots has long been a pursuit for vision and robotics researchers. To achieve this task, a simulated environment with physically realistic simulation, sufficient articulated objects, and transferability to the real robot is indispensable. Existing environments achieve these requirements for robotics simulation with different levels of simplification and focus. We take one step further in constructing an environment that supports household tasks for training robot learning algorithms. Our work, SAPIEN, is a realistic and physics-rich simulated environment that hosts a large-scale set of articulated objects. SAPIEN enables various robotic vision and interaction tasks that require detailed part-level understanding. We evaluate state-of-the-art vision algorithms for part detection and motion attribute recognition, and demonstrate robotic interaction tasks using heuristic approaches and reinforcement learning algorithms. We hope that SAPIEN can open many research directions yet to be explored, including learning cognition through interaction, part motion discovery, and construction of robotics-ready simulated game environments. |
Tasks | |
Published | 2020-03-19 |
URL | https://arxiv.org/abs/2003.08515v1 |
https://arxiv.org/pdf/2003.08515v1.pdf | |
PWC | https://paperswithcode.com/paper/sapien-a-simulated-part-based-interactive |
Repo | |
Framework | |
A Price-Per-Attention Auction Scheme Using Mouse Cursor Information
Title | A Price-Per-Attention Auction Scheme Using Mouse Cursor Information |
Authors | Ioannis Arapakis, Antonio Penta, Hideo Joho, Luis A. Leiva |
Abstract | Payments in online ad auctions are typically derived from click-through rates, so that advertisers do not pay for ineffective ads. But advertisers often care about more than just clicks, for example when they aim to raise brand awareness or visibility. There is thus an opportunity to devise a more effective ad pricing paradigm, in which ads are paid for only if they are actually noticed. This article contributes a novel auction format based on a pay-per-attention (PPA) scheme. We show that the PPA auction inherits the same desirable properties (strategy-proofness and efficiency) as its pay-per-impression and pay-per-click counterparts, and that it also compares favourably in terms of revenues. To make the PPA format feasible, we also contribute a scalable diagnostic technology to predict user attention to ads in sponsored search using raw mouse cursor coordinates only, regardless of the page content and structure. We use the user attention predictions in numerical simulations to evaluate the PPA auction scheme. Our results show that, in relevant economic settings, the PPA revenues would be strictly higher than those of the existing auction payment schemes. |
Tasks | |
Published | 2020-01-21 |
URL | https://arxiv.org/abs/2001.07803v1 |
https://arxiv.org/pdf/2001.07803v1.pdf | |
PWC | https://paperswithcode.com/paper/a-price-per-attention-auction-scheme-using |
Repo | |
Framework | |
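As a rough illustration of what paying per attention could look like, the sketch below runs a single-slot auction that ranks bids by bid times predicted attention probability and charges the winner the smallest per-attention price that would still win, by analogy with second-price pay-per-click auctions. This is an assumption made for illustration and not the mechanism defined in the paper; the `Bid` class, its fields, and `run_single_slot_auction` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Bid:
    advertiser: str
    bid_per_attention: float  # amount offered each time the ad is noticed
    attention_prob: float     # predicted probability that the ad is noticed


def run_single_slot_auction(bids):
    """Return (winner, price charged per noticed ad) for one ad slot.

    Hypothetical second-price rule: rank by expected value
    (bid * attention probability) and charge the winner the lowest
    per-attention price that keeps its score above the runner-up.
    """
    if len(bids) < 2:
        raise ValueError("need at least two bids")
    for b in bids:
        if not 0.0 < b.attention_prob <= 1.0:
            raise ValueError("attention_prob must be in (0, 1]")

    def score(b):
        return b.bid_per_attention * b.attention_prob

    ranked = sorted(bids, key=score, reverse=True)
    winner, runner_up = ranked[0], ranked[1]
    price = score(runner_up) / winner.attention_prob
    return winner.advertiser, price
```

Under this rule the price never exceeds the winner's own bid, because the winner's score is at least the runner-up's score.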
Citation Data of Czech Apex Courts
Title | Citation Data of Czech Apex Courts |
Authors | Jakub Harašta, Tereza Novotná, Jaromír Šavelka |
Abstract | In this paper, we introduce the citation data of the Czech apex courts (Supreme Court, Supreme Administrative Court and Constitutional Court). This dataset was automatically extracted from the corpus of texts of Czech court decisions, CzCDC 1.0. We obtained the citation data by building a natural language processing pipeline for extraction of the court decision identifiers. The pipeline included (i) a document segmentation model and (ii) a reference recognition model. Furthermore, the dataset was manually processed to achieve high-quality citation data as a base for subsequent qualitative and quantitative analyses. The dataset will be made available to the general public. |
Tasks | |
Published | 2020-02-06 |
URL | https://arxiv.org/abs/2002.02224v1 |
https://arxiv.org/pdf/2002.02224v1.pdf | |
PWC | https://paperswithcode.com/paper/citation-data-of-czech-apex-courts |
Repo | |
Framework | |
Prediction of MRI Hardware Failures based on Image Features using Ensemble Learning
Title | Prediction of MRI Hardware Failures based on Image Features using Ensemble Learning |
Authors | Nadine Kuhnert, Lea Pflüger, Andreas Maier |
Abstract | In order to ensure trouble-free operation, prediction of hardware failures is essential. This applies especially to medical systems. Our goal is to determine hardware which needs to be exchanged before failing. In this work, we focus on predicting failures of 20-channel Head/Neck coils using image-related measurements. Thus, we aim to solve a classification problem with two classes, normal and broken coil. To solve this problem, we use data of two different levels. One level refers to one-dimensional features per individual coil channel on which we found a fully connected neural network to perform best. The other data level uses matrices which represent the overall coil condition and feeds a different neural network. We stack the predictions of those two networks and train a Random Forest classifier as the ensemble learner. Thus, combining insights of both trained models improves the prediction results and allows us to determine the coil’s condition with an F-score of 94.14% and an accuracy of 99.09%. |
Tasks | |
Published | 2020-01-05 |
URL | https://arxiv.org/abs/2001.01213v1 |
https://arxiv.org/pdf/2001.01213v1.pdf | |
PWC | https://paperswithcode.com/paper/prediction-of-mri-hardware-failures-based-on |
Repo | |
Framework | |
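The abstract above reports an F-score noticeably lower than the accuracy, which is typical when one class (here, broken coils) is rare. The sketch below computes both metrics from confusion-matrix counts; the function name and the example counts in the usage note are hypothetical and are not taken from the paper.

```python
def accuracy_and_f1(tp, fp, fn, tn):
    """Compute accuracy and F1 for the positive class from confusion counts."""
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    accuracy = (tp + tn) / total
    # Precision and recall fall back to 0.0 when their denominator is zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return accuracy, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, f1
```

For instance, with made-up counts of 8 true positives, 1 false positive, 1 false negative and 990 true negatives, accuracy is dominated by the many correctly classified normal cases, while F1 depends only on how the rare positive class is handled.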
Rich-Item Recommendations for Rich-Users via GCNN: Exploiting Dynamic and Static Side Information
Title | Rich-Item Recommendations for Rich-Users via GCNN: Exploiting Dynamic and Static Side Information |
Authors | Amar Budhiraja, Gaurush Hiranandani, Navya Yarrabelly, Ayush Choure, Oluwasanmi Koyejo, Prateek Jain |
Abstract | We study the standard problem of recommending relevant items to users; a user is someone who seeks recommendation, and an item is something which should be recommended. In today’s modern world, both users and items are ‘rich’ multi-faceted entities but existing literature, for ease of modeling, views these facets in silos. In this paper, we provide a general formulation of the recommendation problem that captures the complexities of modern systems and encompasses most of the existing recommendation system formulations. In our formulation, each user and item is modeled via a set of static entities and a dynamic component. The relationships between entities are captured by multiple weighted bipartite graphs. To effectively exploit these complex interactions for recommendations, we propose MEDRES – a multiple graph-CNN based novel deep-learning architecture. In addition, we propose a new metric, pAp@k, that is critical for a variety of classification+ranking scenarios. We also provide an optimization algorithm that directly optimizes the proposed metric and trains MEDRES in an end-to-end framework. We demonstrate the effectiveness of our method on two benchmarks as well as on a message recommendation system deployed in Microsoft Teams where it improves upon the existing production-grade model by 3%. |
Tasks | |
Published | 2020-01-28 |
URL | https://arxiv.org/abs/2001.10495v1 |
https://arxiv.org/pdf/2001.10495v1.pdf | |
PWC | https://paperswithcode.com/paper/rich-item-recommendations-for-rich-users-via |
Repo | |
Framework | |
Improved propagation models for LTE path loss prediction in urban & suburban Ghana
Title | Improved propagation models for LTE path loss prediction in urban & suburban Ghana |
Authors | James D. Gadze, Kwame A. Agyekum, Stephen J. Nuagah, E. A. Affum |
Abstract | To maximize the benefits of LTE cellular networks, careful and proper planning is needed. This requires the use of accurate propagation models to quantify the path loss required for base station deployment. Deployed LTE networks in Ghana can barely meet the desired 100 Mbps throughput, leading to customer dissatisfaction. Network operators rely on transmission planning tools designed for generalized environments, which come with embedded propagation models suited to other environments. A challenge for Ghanaian transmission network planners is therefore choosing an accurate and precise propagation model that best suits the Ghanaian environment. Given this, extensive LTE path loss measurements at 800 MHz and 2600 MHz were taken in selected urban and suburban environments in Ghana and compared with 6 commonly used propagation models. Improved versions of the Ericsson, SUI, and ECC-33 models developed in this study predict the path loss in Ghanaian environments more precisely than commonly used propagation models. |
Tasks | |
Published | 2020-01-15 |
URL | https://arxiv.org/abs/2001.05227v1 |
https://arxiv.org/pdf/2001.05227v1.pdf | |
PWC | https://paperswithcode.com/paper/improved-propagation-models-for-lte-path-loss |
Repo | |
Framework | |
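For orientation on what a propagation model computes, the sketch below evaluates the standard free-space path loss formula, FSPL(dB) = 20 log10(d in km) + 20 log10(f in MHz) + 32.44, at the two measurement frequencies mentioned above. Free-space loss is only an idealized baseline and is not one of the improved models from the paper; the function name is hypothetical.

```python
import math


def free_space_path_loss_db(distance_km, frequency_mhz):
    """Free-space path loss in dB for a distance in km and a frequency in MHz."""
    if distance_km <= 0 or frequency_mhz <= 0:
        raise ValueError("distance and frequency must be positive")
    return 20 * math.log10(distance_km) + 20 * math.log10(frequency_mhz) + 32.44


if __name__ == "__main__":
    # Compare the two LTE bands at a few distances.
    for d_km in (0.5, 1.0, 2.0):
        low = free_space_path_loss_db(d_km, 800.0)
        high = free_space_path_loss_db(d_km, 2600.0)
        print(f"{d_km} km: 800 MHz {low:.2f} dB, 2600 MHz {high:.2f} dB")
```

At any fixed distance the 2600 MHz band loses about 10 dB more than the 800 MHz band in free space, which is one reason models are evaluated per frequency. Empirical models such as those named in the abstract add environment-dependent terms on top of this kind of distance and frequency dependence.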
Salient Facial Features from Humans and Deep Neural Networks
Title | Salient Facial Features from Humans and Deep Neural Networks |
Authors | Shanmeng Sun, Wei Zhen Teoh, Michael Guerzhoy |
Abstract | In this work, we explore the features that are used by humans and by convolutional neural networks (ConvNets) to classify faces. We use Guided Backpropagation (GB) to visualize the facial features that influence the output of a ConvNet the most when identifying specific individuals; we explore how to best use GB for that purpose. We use a human intelligence task to find out which facial features humans find to be the most important for identifying specific individuals. We explore the differences between the saliency information gathered from humans and from ConvNets. Humans develop biases in employing available information on facial features to discriminate across faces. Studies show these biases are influenced both by neurological development and by each individual’s social experience. In recent years the computer vision community has achieved human-level performance in many face processing tasks with deep neural network-based models. These face processing systems are also subject to systematic biases due to model architectural choices and training data distribution. |
Tasks | |
Published | 2020-03-08 |
URL | https://arxiv.org/abs/2003.08765v1 |
https://arxiv.org/pdf/2003.08765v1.pdf | |
PWC | https://paperswithcode.com/paper/salient-facial-features-from-humans-and-deep |
Repo | |
Framework | |
Visual link retrieval and knowledge discovery in painting datasets
Title | Visual link retrieval and knowledge discovery in painting datasets |
Authors | Giovanna Castellano, Eufemia Lella, Gennaro Vessio |
Abstract | Visual arts have invaluable importance for the cultural, historic and economic growth of our societies. One of the building blocks of most analyses in visual arts is finding similarities among paintings of different artists and painting schools. To help art historians better understand visual arts, the present paper presents a framework for visual link retrieval and knowledge discovery in digital painting datasets. The proposed framework is based on a deep convolutional neural network to perform feature extraction and on a fully unsupervised nearest neighbor approach to retrieve visual links among digitized paintings. The fully unsupervised strategy makes the proposed method attractive, especially in cases where metadata are scarce, unavailable or difficult to collect. In addition, the proposed framework includes a graph analysis that makes it possible to study influences among artists, thus providing historical knowledge discovery. |
Tasks | |
Published | 2020-03-18 |
URL | https://arxiv.org/abs/2003.08476v1 |
https://arxiv.org/pdf/2003.08476v1.pdf | |
PWC | https://paperswithcode.com/paper/visual-link-retrieval-and-knowledge-discovery |
Repo | |
Framework | |
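The retrieval step described above, finding the nearest neighbors of a painting in a feature space, can be sketched as follows. The feature vectors are assumed to come from some pretrained convolutional network, which is not shown here. Cosine similarity is an assumed choice of metric, and the function names are hypothetical rather than taken from the paper.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def nearest_links(features, query_id, k=2):
    """Return the ids of the k paintings most similar to query_id.

    `features` maps a painting id to its feature vector; no labels or
    metadata are used, so the retrieval is fully unsupervised.
    """
    query = features[query_id]
    scored = [
        (cosine_similarity(query, vec), other_id)
        for other_id, vec in features.items()
        if other_id != query_id
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [other_id for _, other_id in scored[:k]]
```

Linking each painting to its nearest neighbors in this way yields a graph of visual links, the kind of structure on which a graph analysis of influences among artists could then be run.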
Combining detection and tracking for human pose estimation in videos
Title | Combining detection and tracking for human pose estimation in videos |
Authors | Manchen Wang, Joseph Tighe, Davide Modolo |
Abstract | We propose a novel top-down approach that tackles the problem of multi-person human pose estimation and tracking in videos. In contrast to existing top-down approaches, our method is not limited by the performance of its person detector and can predict the poses of person instances that the detector failed to localize. It achieves this capability by propagating known person locations forward and backward in time and searching for poses in those regions. Our approach consists of three components: (i) a Clip Tracking Network that performs body joint detection and tracking simultaneously on small video clips; (ii) a Video Tracking Pipeline that merges the fixed-length tracklets produced by the Clip Tracking Network into arbitrary-length tracks; and (iii) a Spatial-Temporal Merging procedure that refines the joint locations based on spatial and temporal smoothing terms. Thanks to the precision of our Clip Tracking Network and our merging procedure, our approach produces very accurate joint predictions and can fix common mistakes in hard scenarios such as heavily entangled people. Our approach achieves state-of-the-art results on both joint detection and tracking, on both the PoseTrack 2017 and 2018 datasets, and against all top-down and bottom-up approaches. |
Tasks | Pose Estimation |
Published | 2020-03-30 |
URL | https://arxiv.org/abs/2003.13743v1 |
https://arxiv.org/pdf/2003.13743v1.pdf | |
PWC | https://paperswithcode.com/paper/combining-detection-and-tracking-for-human |
Repo | |
Framework | |
Certifiable Relative Pose Estimation
Title | Certifiable Relative Pose Estimation |
Authors | Mercedes Garcia-Salguero, Jesus Briales, Javier Gonzalez-Jimenez |
Abstract | In this paper we present the first fast optimality certifier for the non-minimal version of the Relative Pose problem for calibrated cameras from epipolar constraints. The proposed certifier is based on Lagrangian duality and relies on a novel closed-form expression for dual points. We also leverage an efficient solver that performs local optimization on the manifold of the original problem’s non-convex domain. The optimality of the solution is then checked via our novel fast certifier. Extensive experiments demonstrate that, despite its simplicity, this certifiable solver performs excellently on synthetic data, repeatedly attaining the optimal solution (certified a posteriori), and shows satisfactory performance on real data. |
Tasks | Pose Estimation |
Published | 2020-03-30 |
URL | https://arxiv.org/abs/2003.13732v1 |
https://arxiv.org/pdf/2003.13732v1.pdf | |
PWC | https://paperswithcode.com/paper/certifiable-relative-pose-estimation |
Repo | |
Framework | |
Task-Aware Variational Adversarial Active Learning
Title | Task-Aware Variational Adversarial Active Learning |
Authors | Kwanyoung Kim, Dongwon Park, Kwang In Kim, Se Young Chun |
Abstract | Deep learning has achieved remarkable performance in various tasks thanks to massive labeled datasets. However, labeling a large amount of data is often challenging or infeasible because of high labeling cost, such as labeling by experts or long labeling time per large-scale data sample (e.g., video, very large image). Active learning is one way to query the most informative samples to be annotated from a massive unlabeled pool. Two promising directions for active learning that have been recently explored are the data distribution-based approach, which selects data points that are far from the current labeled pool, and the model uncertainty-based approach, which relies on the perspective of the task model. Unfortunately, the former does not exploit structures from tasks and the latter does not seem to make good use of the overall data distribution. Here, we propose methods that simultaneously take advantage of both the data distribution and model uncertainty approaches. Our proposed methods build on variational adversarial active learning (VAAL), which considers the data distribution of both the labeled and unlabeled pools, by incorporating a learning loss prediction module and the RankCGAN concept into VAAL, modeling loss prediction as a ranker. We demonstrate that our proposed methods outperform recent state-of-the-art active learning methods on various balanced and imbalanced benchmark datasets. |
Tasks | Active Learning |
Published | 2020-02-11 |
URL | https://arxiv.org/abs/2002.04709v1 |
https://arxiv.org/pdf/2002.04709v1.pdf | |
PWC | https://paperswithcode.com/paper/task-aware-variational-adversarial-active |
Repo | |
Framework | |
CoCoPIE: Making Mobile AI Sweet As PIE – Compression-Compilation Co-Design Goes a Long Way
Title | CoCoPIE: Making Mobile AI Sweet As PIE – Compression-Compilation Co-Design Goes a Long Way |
Authors | Shaoshan Liu, Bin Ren, Xipeng Shen, Yanzhi Wang |
Abstract | Assuming hardware is the major constraint for enabling real-time mobile intelligence, the industry has mainly dedicated its efforts to developing specialized hardware accelerators for machine learning and inference. This article challenges that assumption. By drawing on a recent real-time AI optimization framework, CoCoPIE, it maintains that with effective compression-compiler co-design, it is possible to enable real-time artificial intelligence on mainstream end devices without special hardware. CoCoPIE is a software framework that holds numerous records on mobile AI: the first framework that supports all main kinds of DNNs, from CNNs to RNNs, transformers, language models, and so on; the fastest DNN pruning and acceleration framework, up to 180X faster compared with current DNN pruning on other frameworks such as TensorFlow-Lite; making many representative AI applications able to run in real-time on off-the-shelf mobile devices, which had previously been regarded as possible only with special hardware support; making off-the-shelf mobile devices outperform a number of representative ASIC and FPGA solutions in terms of energy efficiency and/or performance. |
Tasks | |
Published | 2020-03-14 |
URL | https://arxiv.org/abs/2003.06700v2 |
https://arxiv.org/pdf/2003.06700v2.pdf | |
PWC | https://paperswithcode.com/paper/cocopie-making-mobile-ai-sweet-as-pie |
Repo | |
Framework | |
Generative Pseudo-label Refinement for Unsupervised Domain Adaptation
Title | Generative Pseudo-label Refinement for Unsupervised Domain Adaptation |
Authors | Pietro Morerio, Riccardo Volpi, Ruggero Ragonesi, Vittorio Murino |
Abstract | We investigate and characterize the inherent resilience of conditional Generative Adversarial Networks (cGANs) against noise in their conditioning labels, and exploit this fact in the context of Unsupervised Domain Adaptation (UDA). In UDA, a classifier trained on the labelled source set can be used to infer pseudo-labels on the unlabelled target set. However, this will result in a significant number of misclassified examples (due to the well-known domain shift issue), which can be interpreted as noise injection in the ground-truth labels for the target set. We show that cGANs are, to some extent, robust against such “shift noise”. Indeed, cGANs trained with noisy pseudo-labels are able to filter such noise and generate cleaner target samples. We exploit this finding in an iterative procedure where a generative model and a classifier are jointly trained: in turn, the generator makes it possible to sample cleaner data from the target distribution, and the classifier makes it possible to associate better labels with target samples, progressively refining target pseudo-labels. Results on common benchmarks show that our method performs better than or comparably with the unsupervised domain adaptation state of the art. |
Tasks | Domain Adaptation, Unsupervised Domain Adaptation |
Published | 2020-01-09 |
URL | https://arxiv.org/abs/2001.02950v1 |
https://arxiv.org/pdf/2001.02950v1.pdf | |
PWC | https://paperswithcode.com/paper/generative-pseudo-label-refinement-for |
Repo | |
Framework | |