Paper Group ANR 835
Robust Triple-Matrix-Recovery-Based Auto-Weighted Label Propagation for Classification. Secure Evaluation of Quantized Neural Networks. Spiking Neural Network based Region Proposal Networks for Neuromorphic Vision Sensors. Same-Cluster Querying for Overlapping Clusters. Compositional Temporal Visual Grounding of Natural Language Event Descriptions. …
Robust Triple-Matrix-Recovery-Based Auto-Weighted Label Propagation for Classification
Title | Robust Triple-Matrix-Recovery-Based Auto-Weighted Label Propagation for Classification |
Authors | Huan Zhang, Zhao Zhang, Mingbo Zhao, Qiaolin Ye, Min Zhang, Meng Wang |
Abstract | The graph-based semi-supervised label propagation algorithm has delivered impressive classification results. However, the estimated soft labels typically contain mixed signs and noise, which cause inaccurate predictions due to the lack of suitable constraints. Moreover, available methods typically calculate the weights and estimate the labels in the original input space, which typically contains noise and corruption. Thus, the encoded similarities and manifold smoothness may be inaccurate for label estimation. In this paper, we present effective schemes for resolving these issues and propose a novel and robust semi-supervised classification algorithm, namely, the triple-matrix-recovery-based robust auto-weighted label propagation framework (ALP-TMR). Our ALP-TMR introduces a triple matrix recovery mechanism to remove noise or mixed signs from the estimated soft labels and improve the robustness to noise and outliers in the steps of assigning weights and predicting the labels simultaneously. Our method can jointly recover the underlying clean data, clean labels and clean weighting spaces by decomposing the original data, predicted soft labels or weights into a clean part plus an error part by fitting noise. In addition, ALP-TMR integrates the auto-weighting process by minimizing reconstruction errors over the recovered clean data and clean soft labels, which can encode the weights more accurately to improve both data representation and classification. By classifying samples in the recovered clean label and weight spaces, one can potentially improve the label prediction results. The results of extensive experiments demonstrated the satisfactory performance of our ALP-TMR. |
Tasks | |
Published | 2019-11-20 |
URL | https://arxiv.org/abs/1911.08678v1 |
https://arxiv.org/pdf/1911.08678v1.pdf | |
PWC | https://paperswithcode.com/paper/robust-triple-matrix-recovery-based-auto |
Repo | |
Framework | |
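For orientation, the following is a minimal sketch of classical graph-based label propagation, the baseline family the abstract above builds on. It is not ALP-TMR itself: the update rule `F <- alpha * P F + (1 - alpha) * Y` with a row-normalized weight matrix is one common formulation, and the function names, `alpha`, and the iteration count are assumptions chosen for illustration.

```python
def propagate_labels(weights, initial, alpha=0.9, iterations=200):
    """Iterate F <- alpha * P F + (1 - alpha) * Y, with P the row-normalized weights.

    weights: symmetric n x n list of non-negative edge weights (hypothetical input).
    initial: n x c list; one-hot rows for labeled nodes, all zeros for unlabeled ones.
    """
    n = len(weights)
    num_classes = len(initial[0])
    # Row-normalize the weight matrix so that each row sums to 1.
    transition = []
    for row in weights:
        total = sum(row)
        transition.append([w / total if total > 0 else 0.0 for w in row])
    scores = [list(row) for row in initial]
    for _ in range(iterations):
        updated = []
        for i in range(n):
            new_row = []
            for c in range(num_classes):
                neighbor = sum(transition[i][j] * scores[j][c] for j in range(n))
                new_row.append(alpha * neighbor + (1 - alpha) * initial[i][c])
            updated.append(new_row)
        scores = updated
    return scores


def predict(scores):
    """Assign each node the class with the largest soft-label score."""
    return [max(range(len(row)), key=row.__getitem__) for row in scores]
```

In this sketch the soft labels stay non-negative by construction. The mixed-sign and noise issues the abstract describes concern other formulations and noisy inputs, which this toy version does not model.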
Secure Evaluation of Quantized Neural Networks
Title | Secure Evaluation of Quantized Neural Networks |
Authors | Anders Dalskov, Daniel Escudero, Marcel Keller |
Abstract | Image classification using Deep Neural Networks that preserve the privacy of both the input image and the model being used has received considerable attention in the last couple of years. Recent work in this area has shown that it is possible to perform image classification with realistically sized networks using, e.g., Garbled Circuits as in XONN (USENIX ‘19) or MPC (CrypTFlow, Eprint ‘19). These and other prior works require models to be either trained in a specific way or postprocessed in order to be evaluated securely. We contribute to this line of research by showing that this postprocessing can be handled by standard Machine Learning frameworks. More precisely, we show that quantization as present in Tensorflow suffices to obtain models that can be evaluated directly and as-is in standard off-the-shelf MPC. We implement secure inference of these quantized models in MP-SPDZ, and the generality of our technique means we can demonstrate benchmarks for a wide variety of threat models, something that has not been done before. In particular, we provide a comprehensive comparison between running secure inference of large ImageNet models with active and passive security, as well as honest and dishonest majority. The most efficient inference can be performed using a passive honest majority protocol which takes between 0.9 and 25.8 seconds, depending on the size of the model; for active security and an honest majority, inference is possible between 9.5 and 147.8 seconds. |
Tasks | Image Classification, Quantization |
Published | 2019-10-28 |
URL | https://arxiv.org/abs/1910.12435v1 |
https://arxiv.org/pdf/1910.12435v1.pdf | |
PWC | https://paperswithcode.com/paper/secure-evaluation-of-quantized-neural |
Repo | |
Framework | |
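As a rough illustration of why quantized models suit integer-only evaluation, here is a minimal sketch of affine 8-bit quantization, assuming the widely used scheme `real ≈ scale * (q - zero_point)`. The helper names and the unsigned range 0..255 are assumptions made for this example; nothing here reproduces the paper's MP-SPDZ implementation.

```python
def choose_params(rmin, rmax, qmin=0, qmax=255):
    """Pick scale and zero point so that [rmin, rmax] maps onto [qmin, qmax]."""
    # The representable range must contain 0.0 so that zero is encoded exactly.
    rmin, rmax = min(rmin, 0.0), max(rmax, 0.0)
    scale = (rmax - rmin) / (qmax - qmin)
    if scale == 0.0:
        scale = 1.0  # Degenerate range: any positive scale works.
    zero_point = int(round(qmin - rmin / scale))
    return scale, max(qmin, min(qmax, zero_point))


def quantize(x, scale, zero_point, qmin=0, qmax=255):
    """Map a real value to an integer code, saturating at the range limits."""
    q = int(round(x / scale)) + zero_point
    return max(qmin, min(qmax, q))


def dequantize(q, scale, zero_point):
    """Recover the approximate real value from an integer code."""
    return scale * (q - zero_point)
```

For values inside the calibrated range, the round-trip error of this sketch is at most half of `scale`; values outside the range saturate to the nearest limit.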
Spiking Neural Network based Region Proposal Networks for Neuromorphic Vision Sensors
Title | Spiking Neural Network based Region Proposal Networks for Neuromorphic Vision Sensors |
Authors | Jyotibdha Acharya, Vandana Padala, Arindam Basu |
Abstract | This paper presents a three-layer spiking neural network based region proposal network operating on data generated by neuromorphic vision sensors. The proposed architecture consists of refractory, convolution and clustering layers designed with bio-realistic leaky integrate and fire (LIF) neurons and synapses. The proposed algorithm is tested on traffic scene recordings from a DAVIS sensor setup. The performance of the region proposal network has been compared with the event-based mean shift algorithm and is found to be far superior (~50% better) in recall for similar precision (~85%). Computational and memory complexity of the proposed method are also shown to be similar to that of event-based mean shift. |
Tasks | |
Published | 2019-02-26 |
URL | http://arxiv.org/abs/1902.09864v1 |
http://arxiv.org/pdf/1902.09864v1.pdf | |
PWC | https://paperswithcode.com/paper/spiking-neural-network-based-region-proposal |
Repo | |
Framework | |
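The leaky integrate and fire (LIF) neuron mentioned in the abstract can be sketched in discrete time as below. This is a generic textbook-style model under assumed parameters (`leak`, `threshold`, `refractory_steps` are illustrative names and values), not the authors' layer design.

```python
def simulate_lif(input_current, leak=0.9, threshold=1.0, refractory_steps=2):
    """Return a 0/1 spike train for a single discrete-time LIF neuron."""
    potential = 0.0
    refractory = 0
    spikes = []
    for current in input_current:
        if refractory > 0:
            # During the refractory period the neuron ignores its input.
            refractory -= 1
            spikes.append(0)
            continue
        # Leak a fraction of the stored potential, then integrate the input.
        potential = leak * potential + current
        if potential >= threshold:
            spikes.append(1)
            potential = 0.0  # Reset after firing.
            refractory = refractory_steps
        else:
            spikes.append(0)
    return spikes
```

With a constant sub-threshold input, the potential accumulates over several steps before a spike is emitted, after which the neuron stays silent for `refractory_steps` steps.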
Same-Cluster Querying for Overlapping Clusters
Title | Same-Cluster Querying for Overlapping Clusters |
Authors | Wasim Huleihel, Arya Mazumdar, Muriel Médard, Soumyabrata Pal |
Abstract | Overlapping clusters are common in models of many practical data-segmentation applications. Suppose we are given $n$ elements to be clustered into $k$ possibly overlapping clusters, and an oracle that can interactively answer queries of the form “do elements $u$ and $v$ belong to the same cluster?” The goal is to recover the clusters with the minimum number of such queries. This problem has been of recent interest for the case of disjoint clusters. In this paper, we look at the more practical scenario of overlapping clusters, and provide upper bounds (with algorithms) on the sufficient number of queries. We provide algorithmic results under both arbitrary (worst-case) and statistical modeling assumptions. Our algorithms are parameter free, efficient, and work in the presence of random noise. We also derive information-theoretic lower bounds on the number of queries needed, proving that our algorithms are order optimal. Finally, we test our algorithms over both synthetic and real-world data, showing their practicality and effectiveness. |
Tasks | |
Published | 2019-10-28 |
URL | https://arxiv.org/abs/1910.12490v1 |
https://arxiv.org/pdf/1910.12490v1.pdf | |
PWC | https://paperswithcode.com/paper/same-cluster-querying-for-overlapping |
Repo | |
Framework | |
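To make the query model concrete, the sketch below handles only the simpler disjoint-cluster case that the abstract cites as prior work: each new element is compared against one representative per discovered cluster, which needs at most `n * k` queries with a noiseless oracle. The function name and the oracle signature are assumptions; the overlapping-cluster algorithms of the paper are not reproduced here.

```python
def cluster_with_oracle(elements, same_cluster):
    """Recover disjoint clusters using a noiseless same-cluster oracle.

    same_cluster(u, v) is a hypothetical callable returning True when
    u and v belong to the same cluster. Returns (clusters, query_count).
    """
    clusters = []
    queries = 0
    for item in elements:
        for cluster in clusters:
            queries += 1
            # One query against the cluster's first member is enough
            # when clusters are disjoint and answers are noise free.
            if same_cluster(cluster[0], item):
                cluster.append(item)
                break
        else:
            # No existing cluster matched, so the item starts a new one.
            clusters.append([item])
    return clusters, queries
```

With overlapping clusters a single representative no longer determines membership, which is part of what makes the setting studied in the abstract harder.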
Compositional Temporal Visual Grounding of Natural Language Event Descriptions
Title | Compositional Temporal Visual Grounding of Natural Language Event Descriptions |
Authors | Jonathan C. Stroud, Ryan McCaffrey, Rada Mihalcea, Jia Deng, Olga Russakovsky |
Abstract | Temporal grounding entails establishing a correspondence between natural language event descriptions and their visual depictions. Compositional modeling becomes central: we first ground atomic descriptions “girl eating an apple,” “batter hitting the ball” to short video segments, and then establish the temporal relationships between the segments. This compositional structure enables models to recognize a wider variety of events not seen during training through recognizing their atomic sub-events. Explicit temporal modeling accounts for a wide variety of temporal relationships that can be expressed in language: e.g., in the description “girl stands up from the table after eating an apple” the visual ordering of the events is reversed, with first “eating an apple” followed by “standing up from the table.” We leverage these observations to develop a unified deep architecture, CTG-Net, to perform temporal grounding of natural language event descriptions to videos. We demonstrate that our system outperforms prior state-of-the-art methods on the DiDeMo, Tempo-TL, and Tempo-HL temporal grounding datasets. |
Tasks | |
Published | 2019-12-04 |
URL | https://arxiv.org/abs/1912.02256v1 |
https://arxiv.org/pdf/1912.02256v1.pdf | |
PWC | https://paperswithcode.com/paper/compositional-temporal-visual-grounding-of |
Repo | |
Framework | |
Waterfall Bandits: Learning to Sell Ads Online
Title | Waterfall Bandits: Learning to Sell Ads Online |
Authors | Branislav Kveton, Saied Mahdian, S. Muthukrishnan, Zheng Wen, Yikun Xian |
Abstract | A popular approach to selling online advertising is by a waterfall, where a publisher makes sequential price offers to ad networks for an inventory, and chooses the winner in that order. The publisher picks the order and prices to maximize her revenue. A traditional solution is to learn the demand model and then subsequently solve the optimization problem for the given demand model. This will incur a linear regret. We design an online learning algorithm for solving this problem, which interleaves learning and optimization, and prove that this algorithm has sublinear regret. We evaluate the algorithm on both synthetic and real-world data, and show that it quickly learns high quality pricing strategies. This is the first principled study of learning a waterfall design online by sequential experimentation. |
Tasks | |
Published | 2019-04-20 |
URL | http://arxiv.org/abs/1904.09404v1 |
http://arxiv.org/pdf/1904.09404v1.pdf | |
PWC | https://paperswithcode.com/paper/190409404 |
Repo | |
Framework | |
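The waterfall described above can be illustrated with a small expected-revenue calculation. The sketch assumes that each ad network accepts its offer independently with a known probability and that the first network to accept wins; both the independence assumption and the names used are illustrative, and the paper's learning algorithm (where these probabilities are unknown) is not shown.

```python
from itertools import permutations


def expected_revenue(offers):
    """Expected revenue of a waterfall given (price, accept_probability) pairs in order."""
    revenue = 0.0
    reach = 1.0  # Probability that all earlier offers were declined.
    for price, accept_prob in offers:
        revenue += reach * accept_prob * price
        reach *= 1.0 - accept_prob
    return revenue


def best_order(offers):
    """Brute-force the offer ordering with the highest expected revenue."""
    return max(permutations(offers), key=expected_revenue)
```

For example, offering a price of 10 before a price of 4 (each accepted with probability 0.5) yields 6.0 in expectation, while the reverse order yields 4.5, so the ordering matters even when the demand model is known.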
KernelNet: A Data-Dependent Kernel Parameterization for Deep Generative Modeling
Title | KernelNet: A Data-Dependent Kernel Parameterization for Deep Generative Modeling |
Authors | Yufan Zhou, Changyou Chen, Jinhui Xu |
Abstract | Learning with kernels is an often resorted-to tool in modern machine learning. Standard approaches for this type of learning use a predefined kernel that requires careful selection of hyperparameters. To mitigate this burden, we propose in this paper a framework to construct and learn a data-dependent kernel based on random features and implicit spectral distributions (Fourier transform of the kernel) parameterized by deep neural networks. We call the constructed network KernelNet, and apply it for deep generative modeling in various scenarios, including variants of the MMD-GAN and an implicit Variational Autoencoder (VAE), the two popular learning paradigms in deep generative models. Extensive experiments show the advantages of the proposed KernelNet, consistently achieving better performance compared to related methods. |
Tasks | |
Published | 2019-12-02 |
URL | https://arxiv.org/abs/1912.00979v1 |
https://arxiv.org/pdf/1912.00979v1.pdf | |
PWC | https://paperswithcode.com/paper/kernelnet-a-data-dependent-kernel |
Repo | |
Framework | |
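The random-features idea the abstract refers to can be sketched for a fixed Gaussian (RBF) kernel: sampling frequencies from the kernel's spectral distribution gives a feature map whose inner products approximate the kernel. KernelNet, as described above, learns that spectral distribution with a neural network; the sketch below keeps it fixed, and all names, the seed, and `sigma` are assumptions for illustration.

```python
import math
import random


def make_random_features(dim, num_features, sigma=1.0, seed=0):
    """Build a random Fourier feature map for the RBF kernel with bandwidth sigma."""
    rng = random.Random(seed)
    # Frequencies follow the Gaussian spectral density of the RBF kernel.
    freqs = [[rng.gauss(0.0, 1.0 / sigma) for _ in range(dim)]
             for _ in range(num_features)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    norm = math.sqrt(2.0 / num_features)

    def feature_map(x):
        return [norm * math.cos(sum(w_k * x_k for w_k, x_k in zip(w, x)) + b)
                for w, b in zip(freqs, phases)]

    return feature_map


def rbf_kernel(x, y, sigma=1.0):
    """Exact RBF kernel exp(-||x - y||^2 / (2 sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def dot(u, v):
    """Plain inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))
```

The approximation error shrinks roughly with the square root of the number of features, so a few thousand features usually bring `dot(phi(x), phi(y))` close to `rbf_kernel(x, y)`.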
Machine learning in acoustics: theory and applications
Title | Machine learning in acoustics: theory and applications |
Authors | Michael J. Bianco, Peter Gerstoft, James Traer, Emma Ozanich, Marie A. Roch, Sharon Gannot, Charles-Alban Deledalle |
Abstract | Acoustic data provide scientific and engineering insights in fields ranging from biology and communications to ocean and Earth science. We survey the recent advances and transformative potential of machine learning (ML), including deep learning, in the field of acoustics. ML is a broad family of techniques, which are often based in statistics, for automatically detecting and utilizing patterns in data. Relative to conventional acoustics and signal processing, ML is data-driven. Given sufficient training data, ML can discover complex relationships between features and desired labels or actions, or between features themselves. With large volumes of training data, ML can discover models describing complex acoustic phenomena such as human speech and reverberation. ML in acoustics is rapidly developing with compelling results and significant future promise. We first introduce ML, then highlight ML developments in four acoustics research areas: source localization in speech processing, source localization in ocean acoustics, bioacoustics, and environmental sounds in everyday scenes. |
Tasks | |
Published | 2019-05-11 |
URL | https://arxiv.org/abs/1905.04418v4 |
https://arxiv.org/pdf/1905.04418v4.pdf | |
PWC | https://paperswithcode.com/paper/machine-learning-in-acoustics-a-review |
Repo | |
Framework | |
Multi-Angle Point Cloud-VAE: Unsupervised Feature Learning for 3D Point Clouds from Multiple Angles by Joint Self-Reconstruction and Half-to-Half Prediction
Title | Multi-Angle Point Cloud-VAE: Unsupervised Feature Learning for 3D Point Clouds from Multiple Angles by Joint Self-Reconstruction and Half-to-Half Prediction |
Authors | Zhizhong Han, Xiyang Wang, Yu-Shen Liu, Matthias Zwicker |
Abstract | Unsupervised feature learning for point clouds has been vital for large-scale point cloud understanding. Recent deep learning based methods depend on learning global geometry from self-reconstruction. However, these methods are still suffering from ineffective learning of local geometry, which significantly limits the discriminability of learned features. To resolve this issue, we propose MAP-VAE to enable the learning of global and local geometry by jointly leveraging global and local self-supervision. To enable effective local self-supervision, we introduce multi-angle analysis for point clouds. In a multi-angle scenario, we first split a point cloud into a front half and a back half from each angle, and then, train MAP-VAE to learn to predict a back half sequence from the corresponding front half sequence. MAP-VAE performs this half-to-half prediction using RNN to simultaneously learn each local geometry and the spatial relationship among them. In addition, MAP-VAE also learns global geometry via self-reconstruction, where we employ a variational constraint to facilitate novel shape generation. The outperforming results in four shape analysis tasks show that MAP-VAE can learn more discriminative global or local features than the state-of-the-art methods. |
Tasks | |
Published | 2019-07-30 |
URL | https://arxiv.org/abs/1907.12704v1 |
https://arxiv.org/pdf/1907.12704v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-angle-point-cloud-vae-unsupervised |
Repo | |
Framework | |
Hybrid Text Feature Modeling for Disease Group Prediction using Unstructured Physician Notes
Title | Hybrid Text Feature Modeling for Disease Group Prediction using Unstructured Physician Notes |
Authors | Gokul S Krishnan, Sowmya Kamath S |
Abstract | Existing Clinical Decision Support Systems (CDSSs) largely depend on the availability of structured patient data and Electronic Health Records (EHRs) to aid caregivers. However, in case of hospitals in developing countries, structured patient data formats are not widely adopted, where medical professionals still rely on clinical notes in the form of unstructured text. Such unstructured clinical notes recorded by medical personnel can also be a potential source of rich patient-specific information which can be leveraged to build CDSSs, even for hospitals in developing countries. If such unstructured clinical text can be used, the manual and time-consuming process of EHR generation will no longer be required, with huge person-hours and cost savings. In this paper, we propose a generic ICD9 disease group prediction CDSS built on unstructured physician notes modeled using hybrid word embeddings. These word embeddings are used to train a deep neural network for effectively predicting ICD9 disease groups. Experimental evaluation showed that the proposed approach outperformed the state-of-the-art disease group prediction model built on structured EHRs by 15% in terms of AUROC and 40% in terms of AUPRC, thus proving our hypothesis and eliminating dependency on availability of structured patient data. |
Tasks | Word Embeddings |
Published | 2019-11-26 |
URL | https://arxiv.org/abs/1911.11657v1 |
https://arxiv.org/pdf/1911.11657v1.pdf | |
PWC | https://paperswithcode.com/paper/hybrid-text-feature-modeling-for-disease |
Repo | |
Framework | |
City2City: Translating Place Representations across Cities
Title | City2City: Translating Place Representations across Cities |
Authors | Takahiro Yabe, Kota Tsubouchi, Toru Shimizu, Yoshihide Sekimoto, Satish V. Ukkusuri |
Abstract | Large mobility datasets collected from various sources have allowed us to observe, analyze, predict and solve a wide range of important urban challenges. In particular, studies have generated place representations (or embeddings) from mobility patterns in a similar manner to word embeddings to better understand the functionality of different places within a city. However, studies have been limited to generating such representations of cities in an individual manner and have lacked an inter-city perspective, which has made it difficult to transfer the insights gained from the place representations across different cities. In this study, we attempt to bridge this research gap by treating cities and languages analogously. We apply methods developed for unsupervised machine language translation tasks to translate place representations across different cities. Real world mobility data collected from mobile phone users in 2 cities in Japan are used to test our place representation translation methods. Translated place representations are validated using landuse data, and results show that our methods were able to accurately translate place representations from one city to another. |
Tasks | Word Embeddings |
Published | 2019-11-26 |
URL | https://arxiv.org/abs/1911.12143v1 |
https://arxiv.org/pdf/1911.12143v1.pdf | |
PWC | https://paperswithcode.com/paper/city2city-translating-place-representations |
Repo | |
Framework | |
A Fully-Integrated Sensing and Control System for High-Accuracy Mobile Robotic Building Construction
Title | A Fully-Integrated Sensing and Control System for High-Accuracy Mobile Robotic Building Construction |
Authors | Abel Gawel, Hermann Blum, Johannes Pankert, Koen Krämer, Luca Bartolomei, Selen Ercan, Farbod Farshidian, Margarita Chli, Fabio Gramazio, Roland Siegwart, Marco Hutter, Timothy Sandy |
Abstract | We present a fully-integrated sensing and control system which enables mobile manipulator robots to execute building tasks with millimeter-scale accuracy on building construction sites. The approach leverages multi-modal sensing capabilities for state estimation, tight integration with digital building models, and integrated trajectory planning and whole-body motion control. A novel method for high-accuracy localization updates relative to the known building structure is proposed. The approach is implemented on a real platform and tested under realistic construction conditions. We show that the system can achieve sub-cm end-effector positioning accuracy during fully autonomous operation using solely on-board sensing. |
Tasks | |
Published | 2019-12-04 |
URL | https://arxiv.org/abs/1912.01870v1 |
https://arxiv.org/pdf/1912.01870v1.pdf | |
PWC | https://paperswithcode.com/paper/a-fully-integrated-sensing-and-control-system |
Repo | |
Framework | |
Schedule Earth Observation satellites with Deep Reinforcement Learning
Title | Schedule Earth Observation satellites with Deep Reinforcement Learning |
Authors | Adrien Hadj-Salah, Rémi Verdier, Clément Caron, Mathieu Picard, Mikaël Capelle |
Abstract | Optical Earth observation satellites acquire images worldwide, covering up to several million square kilometers every day. The complexity of scheduling acquisitions for such systems increases exponentially when considering the interoperability of several satellite constellations together with the uncertainties from weather forecasts. In order to deliver valid images to customers as fast as possible, it is crucial to acquire cloud-free images. Depending on weather forecasts, up to 50% of images acquired by operational satellites can be trashed due to excessive cloud cover, showing there is room for improvement. We propose an acquisition scheduling approach based on Deep Reinforcement Learning and experiment on a simplified environment. We find that it challenges classical methods relying on human-expert heuristics. |
Tasks | |
Published | 2019-11-12 |
URL | https://arxiv.org/abs/1911.05696v1 |
https://arxiv.org/pdf/1911.05696v1.pdf | |
PWC | https://paperswithcode.com/paper/schedule-earth-observation-satellites-with |
Repo | |
Framework | |
Visual Summarization of Scholarly Videos using Word Embeddings and Keyphrase Extraction
Title | Visual Summarization of Scholarly Videos using Word Embeddings and Keyphrase Extraction |
Authors | Hang Zhou, Christian Otto, Ralph Ewerth |
Abstract | Effective learning with audiovisual content depends on many factors. Besides the quality of the learning resource’s content, it is essential to discover the most relevant and suitable video in order to support the learning process most effectively. Video summarization techniques facilitate this goal by providing a quick overview of the content. This is especially useful for longer recordings such as conference presentations or lectures. In this paper, we present an approach that generates a visual summary of video content based on semantic word embeddings and keyphrase extraction. For this purpose, we exploit video annotations that are automatically generated by speech recognition and video OCR (optical character recognition). |
Tasks | Optical Character Recognition, Speech Recognition, Video Summarization, Word Embeddings |
Published | 2019-11-25 |
URL | https://arxiv.org/abs/1912.10809v1 |
https://arxiv.org/pdf/1912.10809v1.pdf | |
PWC | https://paperswithcode.com/paper/visual-summarization-of-scholarly-videos |
Repo | |
Framework | |
A Cost Efficient Approach to Correct OCR Errors in Large Document Collections
Title | A Cost Efficient Approach to Correct OCR Errors in Large Document Collections |
Authors | Deepayan Das, Jerin Philip, Minesh Mathew, C. V. Jawahar |
Abstract | The word error rate of an OCR is often higher than its character error rate. This is especially true when OCRs are designed by recognizing characters. High word accuracies are critical to tasks like the creation of content in digital libraries and text-to-speech applications. In order to detect and correct the misrecognised words, it is common for an OCR module to employ a post-processor to further improve the word accuracy. However, conventional approaches to post-processing, like looking up a dictionary or using a statistical language model (SLM), are still limited. In many such scenarios, it is often required to remove the outstanding errors manually. We observe that the traditional post-processing schemes look at error words sequentially since OCRs process documents one at a time. We propose a cost-efficient model to address the error words in batches rather than correcting them individually. We exploit the fact that a collection of documents, unlike a single document, has a structure leading to repetition of words. Such words, if efficiently grouped together and corrected as a whole, can lead to a significant reduction in the cost. Correction can be fully automatic or with a human in the loop. Towards this, we employ a novel clustering scheme to obtain fairly homogeneous clusters. We compare the performance of our model with various baseline approaches including the case where all the errors are removed by a human. We demonstrate the efficacy of our solution empirically by reporting more than 70% reduction in the human effort with near perfect error correction. We validate our method on books from multiple languages. |
Tasks | Language Modelling, Optical Character Recognition |
Published | 2019-05-28 |
URL | https://arxiv.org/abs/1905.11739v1 |
https://arxiv.org/pdf/1905.11739v1.pdf | |
PWC | https://paperswithcode.com/paper/a-cost-efficient-approach-to-correct-ocr |
Repo | |
Framework | |
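The batch-correction idea in the abstract above can be illustrated with a deliberately simple stand-in: group identical out-of-dictionary tokens across a collection so that one correction decision fixes every occurrence. The paper uses a clustering scheme whose details are not given here; exact-string grouping, the function names, and the toy dictionary are all assumptions for this sketch.

```python
from collections import defaultdict


def group_error_words(tokens, dictionary):
    """Map each out-of-dictionary token (lowercased) to the positions where it occurs."""
    groups = defaultdict(list)
    for position, token in enumerate(tokens):
        key = token.lower()
        if key not in dictionary:
            groups[key].append(position)
    return dict(groups)


def apply_corrections(tokens, groups, corrections):
    """Apply one correction per group to every occurrence in that group."""
    fixed = list(tokens)
    for wrong, positions in groups.items():
        if wrong in corrections:
            for position in positions:
                fixed[position] = corrections[wrong]
    return fixed
```

In this toy setting the number of decisions equals the number of groups rather than the number of erroneous tokens, which is the source of the cost reduction the abstract describes.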