Paper Group ANR 869
Class Activation Map Generation by Representative Class Selection and Multi-Layer Feature Fusion
Title | Class Activation Map Generation by Representative Class Selection and Multi-Layer Feature Fusion |
Authors | Fanman Meng, Kaixu Huang, Hongliang Li, Qingbo Wu |
Abstract | Existing methods generate a class activation map (CAM) from a fixed set of classes (i.e., using all the classes), and do not consider the discriminative cues between class pairs. Activation maps obtained from different class pairs are complementary, and can therefore provide more discriminative cues to overcome a shortcoming of existing CAM generation: the highlighted regions are usually local part regions rather than global object regions, due to the lack of object cues. In this paper, we generate the CAM using a few representative classes, with the aim of extracting more discriminative cues by considering each class pair to obtain a more global CAM. The advantages are twofold. Firstly, the representative classes yield activation regions that are complementary to each other, and therefore lead to a more accurate activation map. Secondly, we only need to consider a small number of representative classes, making the CAM generation suitable for small networks. We propose a clustering-based method to select the representative classes. Multiple binary classification models, rather than a single multi-class classification model, are used to generate the CAM. Moreover, we propose a multi-layer fusion-based CAM generation method that combines high-level semantic features with low-level detail features. We validate the proposed method on the PASCAL VOC and COCO databases in terms of segmentation ground truth. Various networks, including classical networks (ResNet-50, ResNet-101 and ResNet-152) and small networks (VGG-19, ResNet-18 and MobileNet), are considered. Experimental results show that the proposed method clearly improves CAM generation. |
Tasks | |
Published | 2019-01-23 |
URL | http://arxiv.org/abs/1901.07683v1 |
http://arxiv.org/pdf/1901.07683v1.pdf | |
PWC | https://paperswithcode.com/paper/class-activation-map-generation-by |
Repo | |
Framework | |
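The abstract above does not spell out the computation, so the following is only a minimal sketch of the standard CAM building block it starts from: the map for one class is the weighted sum of the last convolutional feature maps, using that class's weights in the final linear layer. The function names and the toy 2x2 inputs are illustrative assumptions; the paper's representative-class selection and multi-layer fusion are not reproduced here.

```python
def class_activation_map(feature_maps, class_weights):
    """Weighted sum of K feature maps (each H x W) for a single class.

    feature_maps: list of K maps, each a list of H rows with W floats.
    class_weights: list of K floats (hypothetical final-layer weights).
    """
    height = len(feature_maps[0])
    width = len(feature_maps[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for fmap, weight in zip(feature_maps, class_weights):
        for i in range(height):
            for j in range(width):
                cam[i][j] += weight * fmap[i][j]
    return cam


def normalize(cam):
    """Rescale a map to [0, 1]; a constant map becomes all zeros."""
    flat = [v for row in cam for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        return [[0.0 for _ in row] for row in cam]
    return [[(v - lo) / (hi - lo) for v in row] for row in cam]


# Toy example: two 2x2 feature maps and made-up class weights.
toy_maps = [
    [[1.0, 0.0], [0.0, 0.0]],
    [[0.0, 0.0], [0.0, 2.0]],
]
toy_cam = normalize(class_activation_map(toy_maps, [1.0, 0.5]))
```

In a setup with several binary classifiers, one such map could be computed per classifier and then combined; the abstract does not state the combination rule, so none is assumed here.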
Clustering piecewise stationary processes
Title | Clustering piecewise stationary processes |
Authors | Azadeh Khaleghi, Daniil Ryabko |
Abstract | The problem of time-series clustering is considered in the case where each data-point is a sample generated by a piecewise stationary ergodic process. Stationary processes are perhaps the most general class of processes considered in non-parametric statistics and allow for arbitrary long-range dependence between variables. Piecewise stationary processes, studied here for the first time in the context of clustering, relax the last remaining assumption in this model: stationarity. A natural formulation is proposed for this problem, and a notion of consistency is introduced which requires the samples to be placed in the same cluster if and only if the piecewise stationary distributions that generate them have the same set of stationary distributions. Simple, computationally efficient algorithms are proposed and are shown to be consistent without any additional assumptions beyond piecewise stationarity. |
Tasks | Time Series, Time Series Clustering |
Published | 2019-06-26 |
URL | https://arxiv.org/abs/1906.10921v1 |
https://arxiv.org/pdf/1906.10921v1.pdf | |
PWC | https://paperswithcode.com/paper/clustering-piecewise-stationary-processes |
Repo | |
Framework | |
Large-Margin Multiple Kernel Learning for Discriminative Features Selection and Representation Learning
Title | Large-Margin Multiple Kernel Learning for Discriminative Features Selection and Representation Learning |
Authors | Babak Hosseini, Barbara Hammer |
Abstract | Multiple kernel learning (MKL) algorithms combine different base kernels to obtain a more efficient representation in the feature space. Focusing on discriminative tasks, MKL has been used successfully for feature selection and finding the significant modalities of the data. In such applications, each base kernel represents one dimension of the data or is derived from one specific descriptor. Therefore, MKL finds an optimal weighting scheme for the given kernels to increase the classification accuracy. Nevertheless, the majority of the works in this area focus on only binary classification problems or aim for linear separation of the classes in the kernel space, which are not realistic assumptions for many real-world problems. In this paper, we propose a novel multi-class MKL framework which improves the state-of-the-art by enhancing the local separation of the classes in the feature space. Besides, by using a sparsity term, our large-margin multiple kernel algorithm (LMMK) performs discriminative feature selection by aiming to employ a small subset of the base kernels. Based on our empirical evaluations on different real-world datasets, LMMK provides a competitive classification accuracy compared with the state-of-the-art algorithms in MKL. Additionally, it learns a sparse set of non-zero kernel weights which leads to a more interpretable feature selection and representation learning. |
Tasks | Feature Selection, Representation Learning |
Published | 2019-03-08 |
URL | http://arxiv.org/abs/1903.03364v2 |
http://arxiv.org/pdf/1903.03364v2.pdf | |
PWC | https://paperswithcode.com/paper/large-margin-multiple-kernel-learning-for |
Repo | |
Framework | |
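As a rough illustration of the weighting idea in the abstract above, the sketch below builds one base kernel per input dimension and combines them with non-negative weights; a zero weight removes that dimension from the combined kernel, which is how a sparse weight vector acts as feature selection. The RBF base kernel, the `gamma` value and the weight vector are assumptions chosen for the example, and no LMMK training is performed.

```python
import math


def rbf_1d(a, b, gamma=1.0):
    """RBF base kernel on a single feature dimension."""
    return math.exp(-gamma * (a - b) ** 2)


def combined_kernel(x, y, weights, gamma=1.0):
    """Weighted sum of per-dimension base kernels: sum_m w_m * k_m(x_m, y_m)."""
    return sum(
        w * rbf_1d(xm, ym, gamma)
        for xm, ym, w in zip(x, y, weights)
        if w != 0.0  # dimensions with zero weight are effectively dropped
    )


# Hypothetical sparse weights: the second feature is ignored.
sparse_weights = [0.5, 0.0, 0.5]
x = [1.0, 2.0, 3.0]
y = [1.0, 0.0, 3.0]  # differs from x only in the ignored dimension
similarity = combined_kernel(x, y, sparse_weights)
```

Because `x` and `y` differ only in the dimension with zero weight, the combined kernel treats them as identical.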
Importance-Aware Semantic Segmentation with Efficient Pyramidal Context Network for Navigational Assistant Systems
Title | Importance-Aware Semantic Segmentation with Efficient Pyramidal Context Network for Navigational Assistant Systems |
Authors | Kaite Xiang, Kaiwei Wang, Kailun Yang |
Abstract | Semantic Segmentation (SS) is the task of assigning a semantic label to each pixel of an image, which is of immense significance for autonomous vehicles, robotics and assisted navigation of vulnerable road users. In different application scenarios, different objects carry different levels of importance and safety-relevance, but conventional loss functions like cross entropy do not take the differing importance of diverse traffic elements into consideration. To address this, we leverage and re-design an importance-aware loss function, offering hints on how the importance of semantics is assigned for real-world applications. To customize semantic segmentation networks for different navigational tasks, we extend ERF-PSPNet, a real-time segmenter designed for wearable devices aiding visually impaired pedestrians, and propose BiERF-PSPNet, which can yield high-quality segmentation maps with finer spatial details that are particularly suitable for autonomous vehicles. A comprehensive variety of experiments with these efficient pyramidal context networks on the CamVid and Cityscapes datasets demonstrates the effectiveness of our proposal to support diverse navigational assistant systems. |
Tasks | Autonomous Vehicles, Semantic Segmentation |
Published | 2019-07-25 |
URL | https://arxiv.org/abs/1907.11066v2 |
https://arxiv.org/pdf/1907.11066v2.pdf | |
PWC | https://paperswithcode.com/paper/importance-aware-semantic-segmentation-with |
Repo | |
Framework | |
Towards a Flexible Deep Learning Method for Automatic Detection of Clinically Relevant Multi-Modal Events in the Polysomnogram
Title | Towards a Flexible Deep Learning Method for Automatic Detection of Clinically Relevant Multi-Modal Events in the Polysomnogram |
Authors | Alexander Neergaard Olesen, Stanislas Chambon, Valentin Thorey, Poul Jennum, Emmanuel Mignot, Helge B. D. Sorensen |
Abstract | Much attention has been given to automatic sleep staging algorithms in past years, but the detection of discrete events in sleep studies is also crucial for precise characterization of sleep patterns and possible diagnosis of sleep disorders. We propose here a deep learning model for automatic detection and annotation of arousals and leg movements. Both of these are commonly seen during normal sleep, while an excessive amount of either is linked to disrupted sleep patterns, excessive daytime sleepiness impacting quality of life, and various sleep disorders. Our model was trained on 1,485 subjects and tested on 1,000 separate recordings of sleep. We tested two different experimental setups and found that optimal arousal detection was attained by including a recurrent neural network module in our default model with a dynamic default event window (F1 = 0.75), while optimal leg movement detection was attained using a static event window (F1 = 0.65). Our work shows promise while still allowing for improvements. Specifically, future research will explore the proposed model as a general-purpose sleep analysis model. |
Tasks | Multimodal Sleep Stage Detection |
Published | 2019-05-16 |
URL | https://arxiv.org/abs/1905.08059v1 |
https://arxiv.org/pdf/1905.08059v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-a-flexible-deep-learning-method-for |
Repo | |
Framework | |
Progressive transfer learning for low frequency data prediction in full waveform inversion
Title | Progressive transfer learning for low frequency data prediction in full waveform inversion |
Authors | Wenyi Hu, Yuchen Jin, Xuqing Wu, Jiefu Chen |
Abstract | For the purpose of effective suppression of the cycle-skipping phenomenon in full waveform inversion (FWI), we developed a Deep Neural Network (DNN) approach to predict the absent low-frequency components by exploiting the implicit relation connecting the low-frequency and high-frequency data through the subsurface geological and geophysical properties. In order to solve this challenging nonlinear regression problem, two novel strategies were proposed to design the DNN architecture and the learning workflow: 1) Dual Data Feed; 2) Progressive Transfer Learning. With the Dual Data Feed structure, both the high-frequency data and the corresponding Beat Tone data are fed into the DNN to relieve the burden of feature extraction, thus reducing the network complexity and the training cost. The second strategy, Progressive Transfer Learning, enables us to unbiasedly train the DNN using a single training dataset. Unlike most established deep learning approaches where the training datasets are fixed, within the framework of the Progressive Transfer Learning, the training dataset evolves in an iterative manner while gradually absorbing the subsurface information retrieved by the physics-based inversion module, progressively enhancing the prediction accuracy of the DNN and propelling the FWI process out of the local minima. The Progressive Transfer Learning, alternatingly updating the training velocity model and the DNN parameters in a complementary fashion toward convergence, saves us from being overwhelmed by the otherwise tremendous amount of training data, and avoids the underfitting and biased sampling issues. The numerical experiments validated that, without any a priori geological information, the low-frequency data predicted by the Progressive Transfer Learning are sufficiently accurate for an FWI engine to produce reliable subsurface velocity models free of cycle-skipping-induced artifacts. |
Tasks | Transfer Learning |
Published | 2019-12-20 |
URL | https://arxiv.org/abs/1912.09944v1 |
https://arxiv.org/pdf/1912.09944v1.pdf | |
PWC | https://paperswithcode.com/paper/progressive-transfer-learning-for-low |
Repo | |
Framework | |
Game Theoretic Optimization via Gradient-based Nikaido-Isoda Function
Title | Game Theoretic Optimization via Gradient-based Nikaido-Isoda Function |
Authors | Arvind U. Raghunathan, Anoop Cherian, Devesh K. Jha |
Abstract | Computing the Nash equilibrium (NE) of multi-player games has witnessed renewed interest due to recent advances in generative adversarial networks. However, computing an equilibrium efficiently is challenging. To this end, we introduce the Gradient-based Nikaido-Isoda (GNI) function, which (i) serves as a merit function, vanishing only at the first-order stationary points of each player’s optimization problem, and (ii) provides error bounds to a stationary Nash point. Gradient descent is shown to converge sublinearly to a first-order stationary point of the GNI function. For the particular case of bilinear min-max games and multi-player quadratic games, the GNI function is convex. Hence, the application of gradient descent in this case yields linear convergence to an NE (when one exists). In our numerical experiments, we observe that the GNI formulation always converges to the first-order stationary point of each player’s optimization problem. |
Tasks | |
Published | 2019-05-15 |
URL | https://arxiv.org/abs/1905.05927v1 |
https://arxiv.org/pdf/1905.05927v1.pdf | |
PWC | https://paperswithcode.com/paper/game-theoretic-optimization-via-gradient |
Repo | |
Framework | |
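To make the merit-function idea concrete, here is a small sketch on the scalar bilinear min-max game f(x, y) = x * y. It assumes the GNI function takes the form V(x, y) = sum over players of [f_i(current point) - f_i(point after that player takes one gradient step of size eta)]; this form, the step sizes and all function names are assumptions made for illustration and should be checked against the paper. Under that assumption V works out to eta * (x^2 + y^2), which is convex and vanishes only at the equilibrium (0, 0).

```python
def f1(x, y):
    """Cost of player 1, who controls x and minimizes x * y."""
    return x * y


def f2(x, y):
    """Cost of player 2, who controls y and minimizes -x * y."""
    return -x * y


def gni(x, y, eta=0.5):
    """Assumed GNI form: each player's gain from one own-gradient step."""
    grad_x = y    # d f1 / d x
    grad_y = -x   # d f2 / d y
    x_step = x - eta * grad_x
    y_step = y - eta * grad_y
    return (f1(x, y) - f1(x_step, y)) + (f2(x, y) - f2(x, y_step))


def numerical_gradient(func, x, y, h=1e-6):
    """Central finite differences, to avoid hand-deriving the gradient of V."""
    dx = (func(x + h, y) - func(x - h, y)) / (2 * h)
    dy = (func(x, y + h) - func(x, y - h)) / (2 * h)
    return dx, dy


def descend(x, y, eta=0.5, rho=0.5, steps=50):
    """Plain gradient descent on the GNI function with step size rho."""
    for _ in range(steps):
        dx, dy = numerical_gradient(lambda a, b: gni(a, b, eta), x, y)
        x, y = x - rho * dx, y - rho * dy
    return x, y


x_final, y_final = descend(1.0, 2.0)
```

With these particular step sizes each iteration halves both coordinates, which is the kind of linear convergence the abstract describes for the convex bilinear case.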
Adaptive Ensembling: Unsupervised Domain Adaptation for Political Document Analysis
Title | Adaptive Ensembling: Unsupervised Domain Adaptation for Political Document Analysis |
Authors | Shrey Desai, Barea Sinno, Alex Rosenfeld, Junyi Jessy Li |
Abstract | Insightful findings in political science often require researchers to analyze documents of a certain subject or type, yet these documents are usually contained in large corpora that do not distinguish between pertinent and non-pertinent documents. In contrast, we can find corpora that label relevant documents but have limitations (e.g., from a single source or era), preventing their use for political science research. To bridge this gap, we present adaptive ensembling, an unsupervised domain adaptation framework, equipped with a novel text classification model and time-aware training to ensure our methods work well with diachronic corpora. Experiments on an expert-annotated dataset show that our framework outperforms strong benchmarks. Further analysis indicates that our methods are more stable, learn better representations, and extract cleaner corpora for fine-grained analysis. |
Tasks | Domain Adaptation, Text Classification, Unsupervised Domain Adaptation |
Published | 2019-10-28 |
URL | https://arxiv.org/abs/1910.12698v1 |
https://arxiv.org/pdf/1910.12698v1.pdf | |
PWC | https://paperswithcode.com/paper/adaptive-ensembling-unsupervised-domain |
Repo | |
Framework | |
Predicting quantum advantage by quantum walk with convolutional neural networks
Title | Predicting quantum advantage by quantum walk with convolutional neural networks |
Authors | Alexey A. Melnikov, Leonid E. Fedichkin, Alexander Alodjants |
Abstract | Quantum walks are at the heart of modern quantum technologies. They make it possible to deal with quantum transport phenomena and are an advanced tool for constructing novel quantum algorithms. Quantum walks on graphs are fundamentally different from their classical random walk analogs; in particular, they walk faster than classical ones on certain graphs, enabling in these cases quantum algorithmic applications and quantum-enhanced energy transfer. However, little is known about the possible advantages on arbitrary graphs not having explicit symmetries. For these graphs one would need to perform simulations of classical and quantum walk dynamics to check if the speedup occurs, which could take a long computational time. Here we present a new approach for the solution of the quantum speedup problem, which is based on a machine learning algorithm that predicts the quantum advantage by just looking at a graph. The convolutional neural network, which we designed specifically to learn from graphs, observes simulated examples and learns complex features of graphs that lead to a quantum advantage, making it possible to identify graphs that exhibit quantum advantage without performing any quantum walk or random walk simulations. The performance of our approach is evaluated for line and random graphs, where classification was always better than random guess even for the most challenging cases. Our findings pave the way to an automated elaboration of novel large-scale quantum circuits utilizing quantum walk based algorithms, and to simulating high-efficiency energy transfer in biophotonics and material science. |
Tasks | |
Published | 2019-01-30 |
URL | https://arxiv.org/abs/1901.10632v2 |
https://arxiv.org/pdf/1901.10632v2.pdf | |
PWC | https://paperswithcode.com/paper/detecting-quantum-speedup-by-quantum-walk |
Repo | |
Framework | |
Multiple receptive fields and small-object-focusing weakly-supervised segmentation network for fast object detection
Title | Multiple receptive fields and small-object-focusing weakly-supervised segmentation network for fast object detection |
Authors | Siyang Sun, Yingjie Yin, Xingang Wang, De Xu, Yuan Zhao, Haifeng Shen |
Abstract | Object detection plays an important role in various visual applications. However, the precision and speed of a detector are usually at odds. One main reason for the reduced precision of fast detectors is that small objects are hard to detect. To address this problem, we propose a multiple receptive field and small-object-focusing weakly-supervised segmentation network (MRFSWSnet) to achieve fast object detection. In MRFSWSnet, a multiple receptive fields block (MRF) is used to attend to different spatial locations of the object and its adjacent background with different weights, enhancing the discriminability of the features. In addition, a small-object-focusing weakly-supervised segmentation module, which focuses only on small objects rather than all objects, is integrated into the detection network for auxiliary training to improve the precision of small object detection. Extensive experiments show the effectiveness of our method on both PASCAL VOC and MS COCO detection datasets. In particular, with a lower-resolution version of 300x300, MRFSWSnet achieves 80.9% mAP on the VOC2007 test set with an inference speed of 15 milliseconds per frame, which is state of the art among real-time detectors. |
Tasks | Object Detection, Real-Time Object Detection, Small Object Detection |
Published | 2019-04-19 |
URL | https://arxiv.org/abs/1904.12619v2 |
https://arxiv.org/pdf/1904.12619v2.pdf | |
PWC | https://paperswithcode.com/paper/190412619 |
Repo | |
Framework | |
3D Sensing of a Moving Object with a Nodding 2D LIDAR and Reconfigurable Mirrors
Title | 3D Sensing of a Moving Object with a Nodding 2D LIDAR and Reconfigurable Mirrors |
Authors | Anindya Harchowdhury, Lindsay Kleeman, Leena Vachhani |
Abstract | Perception in 3D has become standard practice for a large part of robotics applications. High quality 3D perception is costly. Our previous work on a nodding 2D Lidar provides high quality 3D depth information at low cost, but the sparse data generated by this sensor poses challenges in understanding the characteristics of moving objects within an uncertain environment. This paper proposes a novel design of the nodding Lidar that provides dynamic reconfigurability by limiting the field of view of the sensor using a set of optical mirrors. It not only provides denser scans but also achieves a three times higher scan update rate. Additionally, we propose a novel calibration mechanism for this sensor and prove its effectiveness for dynamic object detection and tracking. |
Tasks | Calibration, Object Detection |
Published | 2019-12-27 |
URL | https://arxiv.org/abs/1912.13461v1 |
https://arxiv.org/pdf/1912.13461v1.pdf | |
PWC | https://paperswithcode.com/paper/3d-sensing-of-a-moving-object-with-a-nodding |
Repo | |
Framework | |
Synthetic Humans for Action Recognition from Unseen Viewpoints
Title | Synthetic Humans for Action Recognition from Unseen Viewpoints |
Authors | Gül Varol, Ivan Laptev, Cordelia Schmid, Andrew Zisserman |
Abstract | Our goal in this work is to improve the performance of human action recognition for viewpoints unseen during training by using synthetic training data. Although synthetic data has been shown to be beneficial for tasks such as human pose estimation, its use for RGB human action recognition is relatively unexplored. We make use of the recent advances in monocular 3D human body reconstruction from real action sequences to automatically render synthetic training videos for the action labels. We make the following contributions: (i) we investigate the extent of variations and augmentations that are beneficial to improving performance at new viewpoints, considering changes in body shape and clothing for individuals, as well as more action-relevant augmentations such as non-uniform frame sampling and interpolating between the motion of individuals performing the same action; (ii) we introduce a new dataset, SURREACT, that allows supervised training of spatio-temporal CNNs for action classification; (iii) we substantially improve the state-of-the-art action recognition performance on the NTU RGB+D and UESTC standard human action multi-view benchmarks; and finally, (iv) we extend the augmentation approach to in-the-wild videos from a subset of the Kinetics dataset to investigate the case when only one-shot training data is available, and demonstrate improvements in this case as well. |
Tasks | Action Classification, Pose Estimation, Temporal Action Localization |
Published | 2019-12-09 |
URL | https://arxiv.org/abs/1912.04070v1 |
https://arxiv.org/pdf/1912.04070v1.pdf | |
PWC | https://paperswithcode.com/paper/synthetic-humans-for-action-recognition-from |
Repo | |
Framework | |
Epistemic Risk-Sensitive Reinforcement Learning
Title | Epistemic Risk-Sensitive Reinforcement Learning |
Authors | Hannes Eriksson, Christos Dimitrakakis |
Abstract | We develop a framework for interacting with uncertain environments in reinforcement learning (RL) by leveraging preferences in the form of utility functions. We claim that there is value in considering different risk measures during learning. In this framework, the preference for risk can be tuned by variation of the parameter $\beta$ and the resulting behavior can be risk-averse, risk-neutral or risk-taking depending on the parameter choice. We evaluate our framework for learning problems with model uncertainty. We measure and control for epistemic risk using dynamic programming (DP) and policy gradient-based algorithms. The risk-averse behavior is then compared with the behavior of the optimal risk-neutral policy in environments with epistemic risk. |
Tasks | |
Published | 2019-06-14 |
URL | https://arxiv.org/abs/1906.06273v1 |
https://arxiv.org/pdf/1906.06273v1.pdf | |
PWC | https://paperswithcode.com/paper/epistemic-risk-sensitive-reinforcement |
Repo | |
Framework | |
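The abstract above says only that a parameter $\beta$ tunes the risk preference. One common utility with that behavior is the exponential (entropic) form (1/$\beta$) log E[exp($\beta$ R)], and the sketch below uses it purely as an assumption; the paper's exact utility may differ. Here the expectation is taken over the returns a fixed policy would obtain under several plausible environment models, which is one way to read "epistemic" risk. The function name and the return values are hypothetical.

```python
import math


def entropic_value(returns, beta):
    """(1/beta) * log(mean(exp(beta * r))), computed with a log-sum-exp shift.

    beta < 0 is risk-averse, beta > 0 is risk-taking, and beta == 0
    falls back to the plain (risk-neutral) mean.
    """
    if beta == 0.0:
        return sum(returns) / len(returns)
    scaled = [beta * r for r in returns]
    shift = max(scaled)  # subtract the max so exp() cannot overflow
    mean_exp = sum(math.exp(s - shift) for s in scaled) / len(scaled)
    return (shift + math.log(mean_exp)) / beta


# Hypothetical returns of one policy under three sampled environment models.
model_returns = [1.0, 2.0, 6.0]
risk_averse = entropic_value(model_returns, beta=-1.0)
risk_neutral = entropic_value(model_returns, beta=0.0)
risk_taking = entropic_value(model_returns, beta=1.0)
```

A risk-averse agent would prefer the policy whose `risk_averse` value is highest, which penalizes policies that do badly under some plausible models even when their average return is good.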
Multi-Person Pose Estimation with Enhanced Channel-wise and Spatial Information
Title | Multi-Person Pose Estimation with Enhanced Channel-wise and Spatial Information |
Authors | Kai Su, Dongdong Yu, Zhenqi Xu, Xin Geng, Changhu Wang |
Abstract | Multi-person pose estimation is an important but challenging problem in computer vision. Although current approaches have achieved significant progress by fusing the multi-scale feature maps, they pay little attention to enhancing the channel-wise and spatial information of the feature maps. In this paper, we propose two novel modules to perform the enhancement of the information for the multi-person pose estimation. First, a Channel Shuffle Module (CSM) is proposed to adopt the channel shuffle operation on the feature maps with different levels, promoting cross-channel information communication among the pyramid feature maps. Second, a Spatial, Channel-wise Attention Residual Bottleneck (SCARB) is designed to boost the original residual unit with attention mechanism, adaptively highlighting the information of the feature maps both in the spatial and channel-wise context. The effectiveness of our proposed modules is evaluated on the COCO keypoint benchmark, and experimental results show that our approach achieves the state-of-the-art results. |
Tasks | Multi-Person Pose Estimation, Pose Estimation |
Published | 2019-05-09 |
URL | https://arxiv.org/abs/1905.03466v1 |
https://arxiv.org/pdf/1905.03466v1.pdf | |
PWC | https://paperswithcode.com/paper/190503466 |
Repo | |
Framework | |
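The channel shuffle operation mentioned in the abstract above is commonly defined as: split the channels into groups, then interleave them so that each output group contains channels from every input group. The sketch below shows that generic operation on a plain list standing in for the channel axis; how the paper's CSM applies it across pyramid levels is not reproduced, and the function name and group count are illustrative assumptions.

```python
def channel_shuffle(channels, groups):
    """Interleave channels across groups.

    Equivalent to reshaping the channel axis to (groups, per_group),
    transposing to (per_group, groups), and flattening again.
    """
    count = len(channels)
    if count % groups != 0:
        raise ValueError("number of channels must be divisible by groups")
    per_group = count // groups
    return [
        channels[g * per_group + i]
        for i in range(per_group)
        for g in range(groups)
    ]


# Six channel indices in two groups: [0, 1, 2] and [3, 4, 5].
shuffled = channel_shuffle([0, 1, 2, 3, 4, 5], groups=2)
```

In a real network the list elements would be feature maps rather than integers, and the shuffle would be applied along the channel axis of a tensor.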
Job Recommendation through Progression of Job Selection
Title | Job Recommendation through Progression of Job Selection |
Authors | Amber Nigam, Aakash Roy, Hartaran Singh, Aabhas Tonwer |
Abstract | Job recommendation has traditionally been treated as a filter-based match or as a recommendation based on the features of jobs and candidates as discrete entities. In this paper, we introduce a methodology where we leverage the progression of job selection by candidates using machine learning. Additionally, our recommendation is composed of several other sub-recommendations that contribute to at least one of a) making recommendations serendipitous for the end user b) overcoming cold-start for both candidates and jobs. One of the unique selling propositions of our methodology is the way we have used skills as embedded features and derived latent competencies from them, thereby attempting to expand the skills of candidates and jobs to achieve more coverage in the skill domain. We have deployed our model in a real-world job recommender system and have achieved the best click-through rate through a blended approach of machine-learned recommendations and other sub-recommendations. For recommending jobs through machine learning that forms a significant part of our recommendation, we achieve the best results through Bi-LSTM with attention. |
Tasks | Recommendation Systems |
Published | 2019-05-28 |
URL | https://arxiv.org/abs/1905.13136v1 |
https://arxiv.org/pdf/1905.13136v1.pdf | |
PWC | https://paperswithcode.com/paper/job-recommendation-through-progression-of-job |
Repo | |
Framework | |