Paper Group ANR 1311
Multi-Sensor 3D Object Box Refinement for Autonomous Driving
Title | Multi-Sensor 3D Object Box Refinement for Autonomous Driving |
Authors | Peiliang Li, Siqi Liu, Shaojie Shen |
Abstract | We propose a 3D object detection system with multi-sensor refinement in the context of autonomous driving. In our framework, the monocular camera serves as the fundamental sensor for 2D object proposal and initial 3D bounding box prediction, while the stereo cameras and LiDAR are treated as adaptive plug-in sensors to refine the 3D box localization performance. For each observed element in the raw measurement domain (e.g., pixels for stereo, 3D points for LiDAR), we model the local geometry as an instance vector representation, which indicates the 3D coordinates of each element with respect to the object frame. Using this unified geometric representation, the 3D object location can be uniformly refined by stereo photometric alignment or point cloud alignment. We demonstrate superior 3D detection and localization performance compared to state-of-the-art monocular and stereo methods, and competitive performance compared with the baseline LiDAR method on the KITTI object benchmark. |
Tasks | 3D Object Detection, Autonomous Driving, Object Detection |
Published | 2019-09-11 |
URL | https://arxiv.org/abs/1909.04942v2 |
https://arxiv.org/pdf/1909.04942v2.pdf | |
PWC | https://paperswithcode.com/paper/multi-sensor-3d-object-box-refinement-for |
Repo | |
Framework | |
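The refinement step described in the abstract aligns per-element object-frame coordinates (the instance vectors) with the raw measurements. As a minimal, hedged illustration of the point-cloud branch of that idea only: if each LiDAR point also has a predicted coordinate in the object frame, the box pose can be recovered with a closed-form rigid alignment (Kabsch). The function name and toy data below are illustrative assumptions, not the paper's actual pipeline, which couples this with stereo photometric alignment.

```python
import numpy as np

def refine_pose_from_instance_vectors(obj_pts, lidar_pts):
    """Closed-form least-squares (Kabsch) rigid alignment of predicted
    object-frame coordinates to the observed LiDAR points.

    obj_pts:   (N, 3) per-point coordinates predicted in the object frame
    lidar_pts: (N, 3) the same points observed in the sensor frame
    """
    mu_o, mu_l = obj_pts.mean(axis=0), lidar_pts.mean(axis=0)
    H = (obj_pts - mu_o).T @ (lidar_pts - mu_l)          # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])  # avoid reflections
    R = Vt.T @ D @ U.T                                   # object-to-sensor rotation
    t = mu_l - R @ mu_o                                  # translation = refined box center
    return R, t

# toy usage: a synthetic box observed with noise
rng = np.random.default_rng(0)
obj = rng.uniform(-1, 1, size=(200, 3))
R_true = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 1]], dtype=float)
t_true = np.array([10.0, 5.0, -1.0])
lidar = obj @ R_true.T + t_true + 0.01 * rng.normal(size=obj.shape)
R_est, t_est = refine_pose_from_instance_vectors(obj, lidar)
```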
SLOAM: Semantic Lidar Odometry and Mapping for Forest Inventory
Title | SLOAM: Semantic Lidar Odometry and Mapping for Forest Inventory |
Authors | Steven W. Chen, Guilherme V. Nardari, Elijah S. Lee, Chao Qu, Xu Liu, Roseli A. F. Romero, Vijay Kumar |
Abstract | This paper describes an end-to-end pipeline for tree diameter estimation based on semantic segmentation and lidar odometry and mapping. Accurate mapping of this type of environment is challenging since the ground and the trees are surrounded by leaves, thorns and vines, and the sensor typically experiences extreme motion. We propose a semantic feature based pose optimization that simultaneously refines the tree models while estimating the robot pose. The pipeline utilizes a custom virtual reality tool for labeling 3D scans that is used to train a semantic segmentation network. The masked point cloud is used to compute a trellis graph that identifies individual instances and extracts relevant features that are used by the SLAM module. We show that traditional lidar and image based methods fail in the forest environment on both Unmanned Aerial Vehicle (UAV) and hand-carry systems, while our method is more robust, scalable, and automatically generates tree diameter estimations. |
Tasks | Semantic Segmentation |
Published | 2019-12-29 |
URL | https://arxiv.org/abs/1912.12726v1 |
https://arxiv.org/pdf/1912.12726v1.pdf | |
PWC | https://paperswithcode.com/paper/sloam-semantic-lidar-odometry-and-mapping-for |
Repo | |
Framework | |
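The pipeline's end goal is a diameter estimate per tree. As a hedged sketch of that last step only: once trunk points have been segmented, a horizontal slice can be fitted with a least-squares circle to read off the diameter. This algebraic (Kasa) fit is an illustrative assumption; the paper itself models trunks with cylinder features inside the SLAM optimization.

```python
import numpy as np

def trunk_diameter_from_slice(points_xy):
    """Algebraic (Kasa) least-squares circle fit to a horizontal slice of
    segmented trunk points; returns the center and the diameter."""
    x, y = points_xy[:, 0], points_xy[:, 1]
    A = np.column_stack([x, y, np.ones_like(x)])
    b = x**2 + y**2
    # solve x^2 + y^2 = c0*x + c1*y + c2 in the least-squares sense
    (c0, c1, c2), *_ = np.linalg.lstsq(A, b, rcond=None)
    cx, cy = c0 / 2.0, c1 / 2.0
    radius = np.sqrt(c2 + cx**2 + cy**2)
    return (cx, cy), 2.0 * radius

# toy usage: noisy ring of points around a 0.4 m diameter trunk
rng = np.random.default_rng(1)
theta = rng.uniform(0, 2 * np.pi, 300)
ring = np.column_stack([1.0 + 0.2 * np.cos(theta), 2.0 + 0.2 * np.sin(theta)])
ring += 0.01 * rng.normal(size=ring.shape)
center, diameter = trunk_diameter_from_slice(ring)   # diameter close to 0.4
```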
Semantic Enrichment of Streaming Healthcare Data
Title | Semantic Enrichment of Streaming Healthcare Data |
Authors | Daniel Cotter, V. K. Cody Bumgardner |
Abstract | In the past decade, the healthcare industry has made significant advances in the digitization of patient information. However, a lack of interoperability among healthcare systems still imposes a high cost on patients, hospitals, and insurers. Currently, most systems pass messages using idiosyncratic messaging standards that require specialized knowledge to interpret. This increases the cost of systems integration and often puts more advanced uses of data out of reach. In this project, we demonstrate how two open standards, FHIR and RDF, can be combined to integrate data from disparate sources in real time and to make that data queryable and amenable to automated inference. To validate the effectiveness of the semantic engine, we perform simulations of real-time data feeds and demonstrate how they can be combined and used by client-side applications with no knowledge of the underlying sources. |
Tasks | |
Published | 2019-12-01 |
URL | https://arxiv.org/abs/1912.00423v1 |
https://arxiv.org/pdf/1912.00423v1.pdf | |
PWC | https://paperswithcode.com/paper/semantic-enrichment-of-streaming-healthcare |
Repo | |
Framework | |
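To make the FHIR-plus-RDF combination concrete, here is a minimal sketch of turning one streamed FHIR Observation into RDF triples with rdflib and querying it with SPARQL. The simplified field names, IRIs, and the LOINC code are illustrative assumptions and do not reproduce the paper's actual vocabulary or engine.

```python
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import RDF, XSD

# hypothetical FHIR Observation already parsed from a streaming JSON message
observation = {
    "id": "obs-001",
    "code": "8867-4",            # LOINC code for heart rate (illustrative)
    "value": 72,
    "subject": "Patient/123",
}

FHIR = Namespace("http://hl7.org/fhir/")   # simplified namespace
g = Graph()
obs = URIRef(f"urn:uuid:{observation['id']}")
g.add((obs, RDF.type, FHIR.Observation))
g.add((obs, FHIR.code, Literal(observation["code"])))
g.add((obs, FHIR.value, Literal(observation["value"], datatype=XSD.integer)))
g.add((obs, FHIR.subject, URIRef(f"urn:fhir:{observation['subject']}")))

# a client-side query that needs no knowledge of the original message format
results = g.query("""
    PREFIX fhir: <http://hl7.org/fhir/>
    SELECT ?obs ?value WHERE {
        ?obs a fhir:Observation ;
             fhir:code "8867-4" ;
             fhir:value ?value .
    }
""")
for row in results:
    print(row.obs, row.value)
```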
Visualizing and Understanding the Effectiveness of BERT
Title | Visualizing and Understanding the Effectiveness of BERT |
Authors | Yaru Hao, Li Dong, Furu Wei, Ke Xu |
Abstract | Language model pre-training, such as BERT, has achieved remarkable results in many NLP tasks. However, it is unclear why the pre-training-then-fine-tuning paradigm can improve performance and generalization capability across different tasks. In this paper, we propose to visualize loss landscapes and optimization trajectories of fine-tuning BERT on specific datasets. First, we find that pre-training reaches a good initial point across downstream tasks, which leads to wider optima and easier optimization compared with training from scratch. We also demonstrate that the fine-tuning procedure is robust to overfitting, even though BERT is highly over-parameterized for downstream tasks. Second, the visualization results indicate that fine-tuning BERT tends to generalize better because of the flat and wide optima, and the consistency between the training loss surface and the generalization error surface. Third, the lower layers of BERT are more invariant during fine-tuning, which suggests that the layers that are close to input learn more transferable representations of language. |
Tasks | Language Modelling |
Published | 2019-08-15 |
URL | https://arxiv.org/abs/1908.05620v1 |
https://arxiv.org/pdf/1908.05620v1.pdf | |
PWC | https://paperswithcode.com/paper/visualizing-and-understanding-the |
Repo | |
Framework | |
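The visualization technique behind the paper's first finding is the standard one-dimensional loss slice: evaluate the loss along a straight line between two parameter settings (e.g., the pre-trained initialization and the fine-tuned solution). The sketch below shows that generic technique on a toy model; loading an actual BERT checkpoint and the paper's 2D variants are omitted, and the toy model and data are assumptions.

```python
import torch
import torch.nn as nn

def loss_along_path(model, theta_start, theta_end, data, target, loss_fn, steps=21):
    """Evaluate the loss at theta(alpha) = (1 - alpha)*theta_start + alpha*theta_end,
    the 1D slice commonly used to visualize loss landscapes and trajectories."""
    losses = []
    for alpha in torch.linspace(0.0, 1.0, steps):
        interpolated = {k: (1 - alpha) * theta_start[k] + alpha * theta_end[k]
                        for k in theta_start}
        model.load_state_dict(interpolated)
        with torch.no_grad():
            losses.append(loss_fn(model(data), target).item())
    return losses

# toy stand-in for "pre-trained initialization" vs. "fine-tuned solution"
model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))
theta_start = {k: v.clone() for k, v in model.state_dict().items()}
# ... fine-tuning would go here; we just perturb the weights for illustration ...
theta_end = {k: v + 0.1 * torch.randn_like(v) for k, v in theta_start.items()}

data, target = torch.randn(64, 10), torch.randint(0, 2, (64,))
curve = loss_along_path(model, theta_start, theta_end, data, target, nn.CrossEntropyLoss())
```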
Quadratic Surface Support Vector Machine with L1 Norm Regularization
Title | Quadratic Surface Support Vector Machine with L1 Norm Regularization |
Authors | Seyedahmad Mousavi, Zheming Gao, Lanshan Han, Alvin Lim |
Abstract | We propose $\ell_1$-norm regularized quadratic surface support vector machine models for binary classification in supervised learning. We establish their desired theoretical properties, including the existence and uniqueness of the optimal solution, reduction to the standard SVMs over (almost) linearly separable data sets, and detection of the true sparsity pattern over (almost) quadratically separable data sets if the penalty parameter of the $\ell_1$ norm is large enough. We also demonstrate their promising practical efficiency by conducting various numerical experiments on both synthetic and publicly available benchmark data sets. |
Tasks | |
Published | 2019-08-22 |
URL | https://arxiv.org/abs/1908.08616v1 |
https://arxiv.org/pdf/1908.08616v1.pdf | |
PWC | https://paperswithcode.com/paper/quadratic-surface-support-vector-machine-with |
Repo | |
Framework | |
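For reference, one plausible soft-margin form of the model described above separates the classes with a quadratic surface $f(x)=\tfrac12 x^\top W x + b^\top x + c$ and penalizes its coefficients with the $\ell_1$ norm. The exact penalty placement and weighting below are assumptions consistent with the abstract, not the paper's precise formulation:

$$
\min_{W=W^{\top},\,b,\,c,\,\xi}\;\;\lambda\big(\|\operatorname{vec}(W)\|_{1}+\|b\|_{1}\big)+\sum_{i=1}^{n}\xi_{i}
\quad\text{s.t.}\quad y_{i}\Big(\tfrac12 x_{i}^{\top}W x_{i}+b^{\top}x_{i}+c\Big)\ge 1-\xi_{i},\quad \xi_{i}\ge 0 .
$$

With a large enough $\lambda$, the $\ell_1$ penalty drives entries of $W$ to zero, which is what underlies the sparsity-pattern detection and the reduction to a standard linear SVM mentioned in the abstract.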
Constructing a Data Visualization Recommender System
Title | Constructing a Data Visualization Recommender System |
Authors | Petra Kubernátová, Magda Friedjungová, Max van Duijn |
Abstract | Choosing a suitable visualization for data is a difficult task. Current data visualization recommender systems exist to aid in choosing a visualization, yet suffer from issues such as low accessibility and indecisiveness. In this study, we first define a step-by-step guide on how to build a data visualization recommender system. We then use this guide to create a model for a data visualization recommender system for non-experts that aims to resolve the issues of current solutions. The result is a question-based model that uses a decision tree and a data visualization classification hierarchy in order to recommend a visualization. Furthermore, it incorporates both task-driven and data characteristics-driven perspectives, whereas existing solutions seem to either convolute these or focus on one of the two exclusively. Based on testing against existing solutions, it is shown that the new model reaches similar results while being simpler, clearer, more versatile, extendable and transparent. The presented guide can be used as a manual for anyone building a data visualization recommender system. The resulting model can be applied in the development of new data visualization software or as part of a learning tool. |
Tasks | Recommendation Systems |
Published | 2019-11-10 |
URL | https://arxiv.org/abs/1911.03871v1 |
https://arxiv.org/pdf/1911.03871v1.pdf | |
PWC | https://paperswithcode.com/paper/constructing-a-data-visualization-recommender |
Repo | |
Framework | |
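A question-based decision-tree recommender of the kind described above can be prototyped with a very small data structure. The questions, answers, and chart names below are purely illustrative placeholders, not the hierarchy or decision tree from the paper.

```python
# hypothetical, heavily simplified question-based decision tree
DECISION_TREE = {
    "question": "What do you want to show?",
    "answers": {
        "comparison over time": {"question": "How many series?",
                                 "answers": {"one": "line chart",
                                             "several": "multi-line chart"}},
        "distribution":         {"question": "How many variables?",
                                 "answers": {"one": "histogram",
                                             "two": "scatter plot"}},
        "part-to-whole":        "stacked bar chart",
    },
}

def recommend(node, ask):
    """Walk the tree by asking questions until a leaf (a chart type) is reached."""
    while isinstance(node, dict):
        answer = ask(node["question"], list(node["answers"]))
        node = node["answers"][answer]
    return node

# usage with a canned answer function that always picks the first option
chart = recommend(DECISION_TREE, lambda question, options: options[0])
```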
Understanding spatial correlation in eye-fixation maps for visual attention in videos
Title | Understanding spatial correlation in eye-fixation maps for visual attention in videos |
Authors | Tariq Alshawi, Zhiling Long, Ghassan AlRegib |
Abstract | In this paper, we present an analysis of recorded eye-fixation data from human subjects viewing video sequences. The purpose is to better understand visual attention for videos. Utilizing the eye-fixation data provided in the CRCNS (Collaborative Research in Computational Neuroscience) dataset, this paper focuses on the relation between the saliency of a pixel and that of its direct neighbors, without making any assumption about the structure of the eye-fixation maps. By employing some basic concepts from information theory, the analysis shows substantial correlation between the saliency of a pixel and the saliency of its neighborhood. The analysis also provides insights into the structure and dynamics of the eye-fixation maps, which can be very useful in understanding video saliency and its applications. |
Tasks | |
Published | 2019-01-30 |
URL | http://arxiv.org/abs/1901.10957v1 |
http://arxiv.org/pdf/1901.10957v1.pdf | |
PWC | https://paperswithcode.com/paper/understanding-spatial-correlation-in-eye |
Repo | |
Framework | |
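The analysis rests on information-theoretic measures of how strongly a pixel's saliency depends on its neighbors. As a hedged sketch of that idea (not the paper's exact estimator or neighborhood definition), the snippet below computes a histogram-based mutual information between each pixel's value and its right-hand neighbor in a toy fixation map.

```python
import numpy as np

def mutual_information(a, b, bins=16):
    """Histogram-based mutual information (in bits) between two paired samples."""
    joint, _, _ = np.histogram2d(a, b, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return float((pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])).sum())

# toy "eye-fixation map": compare each pixel with its right-hand neighbor
rng = np.random.default_rng(0)
fixation_map = rng.random((64, 64))
center = fixation_map[:, :-1].ravel()
neighbor = fixation_map[:, 1:].ravel()
mi = mutual_information(center, neighbor)   # near zero for this uncorrelated toy map
```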
Some New Results for Poisson Binomial Models
Title | Some New Results for Poisson Binomial Models |
Authors | Evan Rosenman |
Abstract | We consider a problem of ecological inference, in which individual-level covariates are known, but labeled data is available only at the aggregate level. The intended application is modeling voter preferences in elections. In Rosenman and Viswanathan (2018), we proposed modeling individual voter probabilities via a logistic regression, and posing the problem as a maximum likelihood estimation for the parameter vector $\beta$. The likelihood is a Poisson binomial, the distribution of the sum of independent but not identically distributed Bernoulli variables, though we approximate it with a heteroscedastic Gaussian for computational efficiency. Here, we extend the prior work by proving results about the existence of the MLE and the curvature of this likelihood, which is not log-concave in general. We further demonstrate the utility of our method on a real data example. Using data on voters in Morris County, NJ, we demonstrate that our approach outperforms other ecological inference methods in predicting a related, but known outcome: whether an individual votes. |
Tasks | |
Published | 2019-07-21 |
URL | https://arxiv.org/abs/1907.09053v1 |
https://arxiv.org/pdf/1907.09053v1.pdf | |
PWC | https://paperswithcode.com/paper/some-new-results-for-poisson-binomial-models |
Repo | |
Framework | |
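The approximation described in the abstract replaces the Poisson-binomial likelihood of each aggregate count with a Gaussian whose mean and variance come from the individual logistic probabilities. A minimal sketch of fitting $\beta$ that way is shown below; the group structure, regularization-free objective, and toy data are assumptions, not the paper's exact setup.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit  # logistic sigmoid

def neg_log_lik(beta, X, groups, counts):
    """Heteroscedastic Gaussian approximation to the Poisson-binomial likelihood:
    each aggregate count is modeled as N(sum_i p_i, sum_i p_i (1 - p_i))
    with p_i = sigmoid(x_i' beta)."""
    p = expit(X @ beta)
    nll = 0.0
    for g, y in zip(groups, counts):
        mu = p[g].sum()
        var = (p[g] * (1 - p[g])).sum() + 1e-9
        nll += 0.5 * np.log(2 * np.pi * var) + 0.5 * (y - mu) ** 2 / var
    return nll

# toy example: 3 aggregation units, individual covariates, aggregate counts only
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))
groups = [np.arange(0, 100), np.arange(100, 200), np.arange(200, 300)]
beta_true = np.array([0.5, -1.0, 0.3, 0.0])
counts = [rng.binomial(1, expit(X[g] @ beta_true)).sum() for g in groups]
fit = minimize(neg_log_lik, x0=np.zeros(4), args=(X, groups, counts), method="BFGS")
```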
An Evaluation Framework for Interactive Recommender System
Title | An Evaluation Framework for Interactive Recommender System |
Authors | Oznur Alkan, Elizabeth M. Daly, Adi Botea |
Abstract | Traditional recommender systems present a relatively static list of recommendations to a user, where feedback is typically limited to an accept/reject or rating model. However, these simple modes of feedback may provide only limited insight into why a user likes or dislikes an item and what aspects of the item the user has considered. Interactive recommender systems present an opportunity to engage the user in the process by allowing them to interact with the recommendations, provide feedback, and impact the results in real time. Evaluating the impact of user interaction typically requires an extensive user study, which is time consuming and gives researchers limited opportunities to tune their solutions without conducting multiple rounds of user feedback. Additionally, user experience and design aspects can have a significant impact on user feedback, which may mean that the quality of some of the underlying algorithmic decisions in the overall solution is not actually being assessed. As a result, we present an evaluation framework which aims to simulate users interacting with the recommender. We formulate metrics, output by the framework once the simulation is completed, to evaluate the quality of the interactive recommenders. While simulation alone is not sufficient to evaluate a complete solution, the results can help researchers tune their solutions before moving to the user study stage. |
Tasks | Recommendation Systems |
Published | 2019-04-16 |
URL | http://arxiv.org/abs/1904.07765v1 |
http://arxiv.org/pdf/1904.07765v1.pdf | |
PWC | https://paperswithcode.com/paper/an-evaluation-framework-for-interactive |
Repo | |
Framework | |
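To ground the idea of simulating interactive feedback, here is a deliberately tiny sketch: a simulated user with a hidden preference vector accepts or rejects each recommendation, and the recommender updates its estimate from that feedback. Everything here (the update rule, the accept criterion, the metric) is an illustrative assumption, not the framework or metrics proposed in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n_items, dim, rounds = 200, 8, 20
items = rng.normal(size=(n_items, dim))
true_pref = rng.normal(size=dim)        # hidden preference of the simulated user

estimate = np.zeros(dim)                # recommender's current belief about the user
shown, hits = set(), 0
for _ in range(rounds):
    # recommend the unseen item the current estimate scores highest
    scores = items @ estimate + 0.01 * rng.normal(size=n_items)
    scores[list(shown)] = -np.inf
    item = int(np.argmax(scores))
    shown.add(item)
    # simulated accept/reject feedback derived from the hidden preference
    accept = items[item] @ true_pref > 0
    hits += accept
    # simple interactive update: move the estimate toward or away from the item
    estimate += (0.1 if accept else -0.1) * items[item]

precision_over_rounds = hits / rounds   # one possible quality metric
```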
Multi-Task Learning for Automotive Foggy Scene Understanding via Domain Adaptation to an Illumination-Invariant Representation
Title | Multi-Task Learning for Automotive Foggy Scene Understanding via Domain Adaptation to an Illumination-Invariant Representation |
Authors | Naif Alshammari, Samet Akçay, Toby P. Breckon |
Abstract | Joint scene understanding and segmentation for automotive applications is a challenging problem in two key respects: (1) classifying every pixel in the entire scene and (2) performing this task under unstable weather and illumination changes (e.g. foggy weather), which results in poor outdoor scene visibility. This poor visibility leads to non-optimal performance of deep convolutional neural network-based scene understanding and segmentation. In this paper, we propose an efficient end-to-end contemporary automotive semantic scene understanding approach under foggy weather conditions, employing domain adaptation and an illumination-invariant image pre-transformation. As a multi-task pipeline, our proposed model provides: (1) transfer of images from extreme to clear-weather conditions using a domain transfer approach and (2) semantic segmentation of the scene using a competitive encoder-decoder convolutional neural network (CNN) with dense connectivity, skip connections and fusion-based techniques. We evaluate our approach on challenging foggy datasets, including a synthetic dataset (Foggy Cityscapes) as well as real-world datasets (Foggy Zurich and Foggy Driving). By incorporating RGB, depth, and illumination-invariant information, our approach outperforms the state of the art in automotive scene understanding under foggy weather conditions. |
Tasks | Domain Adaptation, Multi-Task Learning, Scene Understanding |
Published | 2019-09-17 |
URL | https://arxiv.org/abs/1909.07697v1 |
https://arxiv.org/pdf/1909.07697v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-task-learning-for-automotive-foggy |
Repo | |
Framework | |
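For context on the illumination-invariant pre-transformation, a commonly used transform of this kind (Maddern et al., 2014) maps an RGB image to a single channel that suppresses illumination changes. Whether this is the exact transform or alpha value used in the paper is an assumption; the sketch only illustrates the general technique.

```python
import numpy as np

def illumination_invariant(image_rgb, alpha=0.48):
    """Illumination-invariant grayscale image:
    I = 0.5 + log(G) - alpha*log(B) - (1 - alpha)*log(R).
    alpha depends on the camera's spectral response; 0.48 is a common default,
    not a value taken from this paper."""
    rgb = image_rgb.astype(np.float64) / 255.0 + 1e-6   # avoid log(0)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    return 0.5 + np.log(g) - alpha * np.log(b) - (1.0 - alpha) * np.log(r)

# usage on a synthetic image
img = (np.random.default_rng(0).random((240, 320, 3)) * 255).astype(np.uint8)
ii_image = illumination_invariant(img)
```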
Machine Learning in IoT Security: Current Solutions and Future Challenges
Title | Machine Learning in IoT Security: Current Solutions and Future Challenges |
Authors | Fatima Hussain, Rasheed Hussain, Syed Ali Hassan, Ekram Hossain |
Abstract | The future Internet of Things (IoT) will have a deep economical, commercial and social impact on our lives. The participating nodes in IoT networks are usually resource-constrained, which makes them tempting targets for cyber attacks. In this regard, extensive efforts have been made to address the security and privacy issues in IoT networks, primarily through traditional cryptographic approaches. However, the unique characteristics of IoT nodes render the existing solutions insufficient to encompass the entire security spectrum of IoT networks. This is, at least in part, because of the resource constraints, heterogeneity, massive real-time data generated by the IoT devices, and the highly dynamic behavior of the networks. Therefore, Machine Learning (ML) and Deep Learning (DL) techniques, which are able to provide embedded intelligence in IoT devices and networks, are leveraged to cope with different security problems. In this paper, we systematically review the security requirements, attack vectors, and the current security solutions for IoT networks. We then shed light on the gaps in these security solutions that call for ML and DL approaches. We also discuss in detail the existing ML and DL solutions for addressing different security problems in IoT networks. Finally, based on the detailed investigation of the existing solutions in the literature, we discuss future research directions for ML- and DL-based IoT security. |
Tasks | |
Published | 2019-03-14 |
URL | http://arxiv.org/abs/1904.05735v1 |
http://arxiv.org/pdf/1904.05735v1.pdf | |
PWC | https://paperswithcode.com/paper/machine-learning-in-iot-security-current |
Repo | |
Framework | |
Fused Gromov-Wasserstein Alignment for Hawkes Processes
Title | Fused Gromov-Wasserstein Alignment for Hawkes Processes |
Authors | Dixin Luo, Hongteng Xu, Lawrence Carin |
Abstract | We propose a novel fused Gromov-Wasserstein alignment method to jointly learn the Hawkes processes in different event spaces, and align their event types. Given two Hawkes processes, we use fused Gromov-Wasserstein discrepancy to measure their dissimilarity, which considers both the Wasserstein discrepancy based on their base intensities and the Gromov-Wasserstein discrepancy based on their infectivity matrices. Accordingly, the learned optimal transport reflects the correspondence between the event types of these two Hawkes processes. The Hawkes processes and their optimal transport are learned jointly via maximum likelihood estimation, with a fused Gromov-Wasserstein regularizer. Experimental results show that the proposed method works well on synthetic and real-world data. |
Tasks | |
Published | 2019-10-04 |
URL | https://arxiv.org/abs/1910.02096v1 |
https://arxiv.org/pdf/1910.02096v1.pdf | |
PWC | https://paperswithcode.com/paper/fused-gromov-wasserstein-alignment-for-hawkes |
Repo | |
Framework | |
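The abstract combines a Wasserstein term on the base intensities with a Gromov-Wasserstein term on the infectivity matrices. A standard fused Gromov-Wasserstein discrepancy of that form is shown below; the exact ground cost and weighting used in the paper may differ:

$$
d_{\mathrm{FGW}}(\mu,\nu)\;=\;\min_{T\in\Pi(\mu,\nu)}\;(1-\alpha)\sum_{i,j} c(\mu_i,\nu_j)\,T_{ij}\;+\;\alpha\sum_{i,j,k,l}\big|A_{ik}-B_{jl}\big|^{2}\,T_{ij}\,T_{kl},
$$

where $\mu_i$ and $\nu_j$ are the base intensities of the two Hawkes processes, $A$ and $B$ their infectivity matrices, $T$ the optimal transport plan whose entries give the soft correspondence between event types, and $\alpha\in[0,1]$ trades off the two terms.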
Tensor Factorization with Label Information for Fake News Detection
Title | Tensor Factorization with Label Information for Fake News Detection |
Authors | Frosso Papanastasiou, Georgios Katsimpras, Georgios Paliouras |
Abstract | The buzz over the so-called “fake news” has created concerns about a degenerated media environment and led to the need for technological solutions. As the detection of fake news is increasingly considered a technological problem, it has attracted considerable research. Most of these studies primarily focus on utilizing information extracted from textual news content. In contrast, we focus on detecting fake news solely based on structural information of social networks. We suggest that the underlying network connections of users that share fake news are discriminative enough to support the detection of fake news. Thereupon, we model each post as a network of friendship interactions and represent a collection of posts as a multidimensional tensor. Taking into account the available labeled data, we propose a tensor factorization method which associates the class labels of data samples with their latent representations. Specifically, we combine a classification error term with the standard factorization in a unified optimization process. Results on real-world datasets demonstrate that our proposed method is competitive against state-of-the-art methods by implementing an arguably simpler approach. |
Tasks | Fake News Detection |
Published | 2019-08-11 |
URL | https://arxiv.org/abs/1908.03957v1 |
https://arxiv.org/pdf/1908.03957v1.pdf | |
PWC | https://paperswithcode.com/paper/tensor-factorization-with-label-information |
Repo | |
Framework | |
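The method couples a tensor factorization of the post-interaction data with a classification error on the labeled posts. One plausible instantiation of such a joint objective is given below; the notation, CP form, and logistic classifier are assumptions rather than the paper's exact formulation:

$$
\min_{U,V,W,\theta}\;\big\|\mathcal{X}-[\![U,V,W]\!]\big\|_F^{2}\;+\;\lambda\,\mathcal{L}\big(Y,\,\sigma(U\theta)\big)\;+\;\gamma\big(\|U\|_F^{2}+\|V\|_F^{2}+\|W\|_F^{2}\big),
$$

where $\mathcal{X}$ is the posts-by-users-by-users interaction tensor, $[\![U,V,W]\!]$ its low-rank CP reconstruction, $Y$ the fake/real labels of the labeled posts, $\sigma(U\theta)$ a classifier on the latent post factors $U$, and $\mathcal{L}$ a classification loss evaluated only on the labeled samples, so the class labels shape the latent representations during factorization.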
DistInit: Learning Video Representations Without a Single Labeled Video
Title | DistInit: Learning Video Representations Without a Single Labeled Video |
Authors | Rohit Girdhar, Du Tran, Lorenzo Torresani, Deva Ramanan |
Abstract | Video recognition models have progressed significantly over the past few years, evolving from shallow classifiers trained on hand-crafted features to deep spatiotemporal networks. However, labeled video data required to train such models have not been able to keep up with the ever-increasing depth and sophistication of these networks. In this work, we propose an alternative approach to learning video representations that requires no semantically labeled videos and instead leverages the years of effort in collecting and labeling large and clean still-image datasets. We do so by using state-of-the-art models pre-trained on image datasets as “teachers” to train video models in a distillation framework. We demonstrate that our method learns truly spatiotemporal features, despite being trained only using supervision from still-image networks. Moreover, it learns good representations across different input modalities, using completely uncurated raw video data sources and with different 2D teacher models. Our method obtains strong transfer performance, outperforming standard techniques for bootstrapping video architectures with image-based models by 16%. We believe that our approach opens up new approaches for learning spatiotemporal representations from unlabeled video data. |
Tasks | Action Recognition In Videos, Temporal Action Localization, Video Recognition |
Published | 2019-01-26 |
URL | https://arxiv.org/abs/1901.09244v2 |
https://arxiv.org/pdf/1901.09244v2.pdf | |
PWC | https://paperswithcode.com/paper/distinit-learning-video-representations |
Repo | |
Framework | |
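The core mechanism is standard knowledge distillation: the video student is trained to match the softened class distribution produced by an image teacher on the unlabeled frames, so no video labels are needed. The sketch below shows that generic recipe with toy tensors; the temperature, the average-pooling of teacher predictions over frames, and the tensor shapes are assumptions, and the actual 2D/3D architectures are omitted.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """Soft-target distillation: the video student matches the softened
    class distribution of the image teacher (no video labels required)."""
    t = temperature
    soft_teacher = F.softmax(teacher_logits / t, dim=-1)
    log_student = F.log_softmax(student_logits / t, dim=-1)
    return F.kl_div(log_student, soft_teacher, reduction="batchmean") * (t * t)

# toy shapes: an 8-frame clip scored frame-by-frame by a 2D image teacher,
# averaged over time, versus a 3D student that sees the whole clip
teacher_frame_logits = torch.randn(16, 8, 400)           # batch x frames x classes
teacher_clip_logits = teacher_frame_logits.mean(dim=1)   # pool the teacher over time
student_clip_logits = torch.randn(16, 400, requires_grad=True)
loss = distillation_loss(student_clip_logits, teacher_clip_logits)
loss.backward()
```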
Optimizing Ensemble Weights and Hyperparameters of Machine Learning Models for Regression Problems
Title | Optimizing Ensemble Weights and Hyperparameters of Machine Learning Models for Regression Problems |
Authors | Mohsen Shahhosseini, Guiping Hu, Hieu Pham |
Abstract | Aggregating multiple learners through an ensemble of models aims to make better predictions by capturing the underlying distribution of the data more accurately. Different ensembling methods, such as bagging, boosting and stacking/blending, have been studied and adopted extensively in research and practice. While bagging and boosting focus on reducing variance and bias, respectively, stacking approaches target both by finding the optimal way to combine base learners for the best trade-off between bias and variance. In stacking with weighted average, ensembles are created from weighted averages of multiple base learners. In this study, a systematic approach is proposed to find the optimal weights for these ensembles, balancing the bias-variance tradeoff using cross-validation for regression problems (Cross-validated Optimal Weighted Ensemble (COWE)). Furthermore, it is known that tuning the hyperparameters of each base learner inside the ensemble weight optimization process can produce better performing ensembles. To this end, a nested optimization-based algorithm that tunes hyperparameters while finding the optimal weights to combine ensembles (Cross-validated Optimal Weighted Ensemble with Internally Tuned Hyperparameters (COWE-ITH)) is proposed. In addition, two heuristic methods based on Bayesian and random search are designed to speed up the optimization process. The algorithm is shown to be generalizable to real data sets through analyses with ten publicly available data sets. The prediction accuracies of COWE-ITH and COWE have been compared to those of the base learners and state-of-the-art ensemble methods. The results show that COWE-ITH outperforms state-of-the-art benchmarks as well as the base learners on 9 out of 10 data sets. |
Tasks | |
Published | 2019-08-14 |
URL | https://arxiv.org/abs/1908.05287v5 |
https://arxiv.org/pdf/1908.05287v5.pdf | |
PWC | https://paperswithcode.com/paper/optimizing-ensemble-weights-and |
Repo | |
Framework | |
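The weight-optimization idea in COWE can be sketched very compactly: find nonnegative weights summing to one that minimize the cross-validated (out-of-fold) error of the weighted average of base-learner predictions. The sketch below uses an MSE objective and SLSQP as illustrative choices and omits the internal hyperparameter tuning of COWE-ITH.

```python
import numpy as np
from scipy.optimize import minimize

def optimal_ensemble_weights(oof_preds, y):
    """Convex-combination weights for base learners that minimize
    out-of-fold MSE, with w >= 0 and sum(w) = 1.

    oof_preds: (n_samples, n_models) out-of-fold predictions of each base learner
    y:         (n_samples,) true targets
    """
    n_models = oof_preds.shape[1]

    def cv_mse(w):
        return np.mean((oof_preds @ w - y) ** 2)

    result = minimize(
        cv_mse,
        x0=np.full(n_models, 1.0 / n_models),
        method="SLSQP",
        bounds=[(0.0, 1.0)] * n_models,
        constraints=[{"type": "eq", "fun": lambda w: w.sum() - 1.0}],
    )
    return result.x

# toy usage with three pretend base learners of varying quality
rng = np.random.default_rng(0)
y = rng.normal(size=500)
oof = np.column_stack([y + 0.3 * rng.normal(size=500),
                       y + 0.6 * rng.normal(size=500),
                       rng.normal(size=500)])
weights = optimal_ensemble_weights(oof, y)   # should favor the first learner
```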