Paper Group ANR 143
What can robotics research learn from computer vision research?. How to choose the most appropriate centrality measure?. Data Parallelism in Training Sparse Neural Networks. Image Embedded Segmentation: Combining Supervised and Unsupervised Objectives through Generative Adversarial Networks. Modeling and Counteracting Exposure Bias in Recommender S …
What can robotics research learn from computer vision research?
Title | What can robotics research learn from computer vision research? |
Authors | Peter Corke, Feras Dayoub, David Hall, John Skinner, Niko Sünderhauf |
Abstract | The computer vision and robotics research communities are each strong. However, progress in computer vision has become turbo-charged in recent years due to big data, GPU computing, novel learning algorithms and a very effective research methodology. By comparison, progress in robotics seems slower. It is true that robotics came later to exploring the potential of learning – the advantages over the well-established body of knowledge in dynamics, kinematics, planning and control are still being debated, although reinforcement learning seems to offer real potential. However, the rapid development of computer vision compared to robotics cannot be attributed only to the former’s adoption of deep learning. In this paper, we argue that the gains in computer vision are due to research methodology – evaluation under strict constraints versus experiments; bold numbers versus videos. |
Tasks | |
Published | 2020-01-08 |
URL | https://arxiv.org/abs/2001.02366v1 |
https://arxiv.org/pdf/2001.02366v1.pdf | |
PWC | https://paperswithcode.com/paper/what-can-robotics-research-learn-from |
Repo | |
Framework | |
How to choose the most appropriate centrality measure?
Title | How to choose the most appropriate centrality measure? |
Authors | Pavel Chebotarev, Dmitry Gubanov |
Abstract | We propose a new method to select the most appropriate network centrality measure based on the user’s opinion on how such a measure should work on a set of simple graphs. The method consists in: (1) forming a set $\cal F$ of candidate measures; (2) generating a sequence of sufficiently simple graphs that distinguish all measures in $\cal F$ on some pairs of nodes; (3) compiling a survey with questions on comparing the centrality of test nodes; (4) completing this survey, which provides a centrality measure consistent with all user responses. The developed algorithms make it possible to implement this approach for any finite set $\cal F$ of measures. This paper presents its realization for a set of 40 centrality measures. The proposed method called culling can be used for rapid analysis or combined with a normative approach by compiling a survey on the subset of measures that satisfy certain normative conditions (axioms). In the present study, the latter was done for the subsets determined by the Self-consistency or Bridge axioms. |
Tasks | |
Published | 2020-03-02 |
URL | https://arxiv.org/abs/2003.01052v3 |
https://arxiv.org/pdf/2003.01052v3.pdf | |
PWC | https://paperswithcode.com/paper/how-to-choose-the-most-appropriate-centrality |
Repo | |
Framework | |
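The culling idea in the abstract above (answer a survey question about two test nodes, then discard every candidate measure that disagrees with the answer) can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes only two candidate measures (degree and closeness) instead of the paper's 40, and the test graph is hand-picked rather than generated. The names `build_graph`, `cull`, and the node labels are hypothetical.

```python
from collections import deque


def build_graph(edges):
    """Undirected adjacency sets from a list of (u, v) edges."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def degree(graph, v):
    """Degree centrality: number of neighbours of v."""
    return len(graph[v])


def closeness(graph, v):
    """(n - 1) / sum of shortest-path distances from v; assumes a connected graph."""
    dist = {v: 0}
    queue = deque([v])
    while queue:  # breadth-first search from v
        u = queue.popleft()
        for w in graph[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return (len(graph) - 1) / sum(dist.values())


def cull(measures, graph, u, v, answer):
    """Keep the measures whose comparison of u and v matches the user's answer.

    answer is '>' (u more central), '<' (v more central) or '=' (equal).
    """
    def compare(a, b):
        return ">" if a > b else "<" if a < b else "="

    return {name: f for name, f in measures.items()
            if compare(f(graph, u), f(graph, v)) == answer}


# Hand-picked test graph (assumption): two stars joined through bridge node "b".
edges = [("h1", "a1"), ("h1", "a2"), ("h1", "a3"),
         ("h2", "c1"), ("h2", "c2"), ("h2", "c3"),
         ("b", "h1"), ("b", "h2")]
graph = build_graph(edges)
measures = {"degree": degree, "closeness": closeness}

# Survey question: "Is b more central than h1?"
# A user answering yes ('>') keeps closeness and culls degree.
survivors = cull(measures, graph, "b", "h1", ">")
print(sorted(survivors))  # ['closeness']
```

On this graph the bridge node `b` has lower degree than the hub `h1` (2 versus 4) but higher closeness (8/14 versus 8/15), so a single answer is enough to separate the two candidates.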
Data Parallelism in Training Sparse Neural Networks
Title | Data Parallelism in Training Sparse Neural Networks |
Authors | Namhoon Lee, Philip H. S. Torr, Martin Jaggi |
Abstract | Network pruning is an effective methodology to compress large neural networks, and sparse neural networks obtained by pruning can benefit from their reduced memory and computational costs at use. Notably, recent advances have found that it is possible to find a trainable sparse neural network even at random initialization prior to training; hence the obtained sparse network only needs to be trained. While this approach of pruning at initialization turned out to be highly effective, little has been studied about the training aspects of these sparse neural networks. In this work, we focus on measuring the effects of data parallelism on training sparse neural networks. As a result, we find that the data parallelism in training sparse neural networks is no worse than that in training densely parameterized neural networks, despite the general difficulty of training sparse neural networks. When training sparse networks using SGD with momentum, the breakdown of the perfect scaling regime occurs even much later than the dense at large batch sizes. |
Tasks | Network Pruning |
Published | 2020-03-25 |
URL | https://arxiv.org/abs/2003.11316v1 |
https://arxiv.org/pdf/2003.11316v1.pdf | |
PWC | https://paperswithcode.com/paper/data-parallelism-in-training-sparse-neural |
Repo | |
Framework | |
Image Embedded Segmentation: Combining Supervised and Unsupervised Objectives through Generative Adversarial Networks
Title | Image Embedded Segmentation: Combining Supervised and Unsupervised Objectives through Generative Adversarial Networks |
Authors | C. T. Sari, G. N. Gunesli, C. Sokmensuer, C. Gunduz-Demir |
Abstract | This paper presents a new regularization method to train a fully convolutional network for semantic tissue segmentation in histopathological images. This method relies on benefiting from unsupervised learning, in the form of image reconstruction, for the network training. To this end, it puts forward the idea of defining a new embedding that allows uniting the main supervised task of semantic segmentation and an auxiliary unsupervised task of image reconstruction into a single task and proposes to learn this united task by a single generative model. This embedding generates a multi-channel output image by superimposing an original input image on its segmentation map. Then, the method learns to translate the input image to this embedded output image using a conditional generative adversarial network, which is known to be quite effective for image-to-image translations. This proposal is different from the existing approach that uses image reconstruction for the same regularization purpose. The existing approach considers segmentation and image reconstruction as two separate tasks in a multi-task network, defines their losses independently, and then combines these losses in a joint loss function. However, the definition of such a function requires externally determining the right contribution amounts of the supervised and unsupervised losses that yield balanced learning between the segmentation and image reconstruction tasks. The proposed approach eliminates this difficulty by uniting these two tasks into a single one, which intrinsically combines their losses. Using histopathological image segmentation as a showcase application, our experiments demonstrate that this proposed approach leads to better segmentation results. |
Tasks | Image Reconstruction, Semantic Segmentation |
Published | 2020-01-30 |
URL | https://arxiv.org/abs/2001.11202v1 |
https://arxiv.org/pdf/2001.11202v1.pdf | |
PWC | https://paperswithcode.com/paper/image-embedded-segmentation-combining |
Repo | |
Framework | |
Modeling and Counteracting Exposure Bias in Recommender Systems
Title | Modeling and Counteracting Exposure Bias in Recommender Systems |
Authors | Sami Khenissi, Olfa Nasraoui |
Abstract | What we discover and see online, and consequently our opinions and decisions, are becoming increasingly affected by automated machine-learned predictions. Similarly, the predictive accuracy of learning machines heavily depends on the feedback data that we provide them. This mutual influence can lead to closed-loop interactions that may cause unknown biases which can be exacerbated after several iterations of machine learning predictions and user feedback. Machine-caused biases risk leading to undesirable social effects ranging from polarization to unfairness and filter bubbles. In this paper, we study the bias inherent in widely used recommendation strategies such as matrix factorization. Then we model the exposure that is born from the interaction between the user and the recommender system and propose new debiasing strategies for these systems. Finally, we try to mitigate the recommendation system bias by engineering solutions for several state-of-the-art recommender system models. Our results show that recommender systems are biased and depend on the prior exposure of the user. We also show that the studied bias iteratively decreases diversity in the output recommendations. Our debiasing method demonstrates the need for alternative recommendation strategies that take into account the exposure process in order to reduce bias. Our research findings show the importance of understanding the nature of and dealing with bias in machine learning models such as recommender systems that interact directly with humans, and are thus causing an increasing influence on human discovery and decision making. |
Tasks | Decision Making, Recommendation Systems |
Published | 2020-01-01 |
URL | https://arxiv.org/abs/2001.04832v1 |
https://arxiv.org/pdf/2001.04832v1.pdf | |
PWC | https://paperswithcode.com/paper/modeling-and-counteracting-exposure-bias-in |
Repo | |
Framework | |
The Direction-Aware, Learnable, Additive Kernels and the Adversarial Network for Deep Floor Plan Recognition
Title | The Direction-Aware, Learnable, Additive Kernels and the Adversarial Network for Deep Floor Plan Recognition |
Authors | Yuli Zhang, Yeyang He, Shaowen Zhu, Xinhan Di |
Abstract | This paper presents a new approach for the recognition of elements in floor plan layouts. Besides elements with common shapes, we aim to recognize elements with irregular shapes such as circular rooms and inclined walls. Furthermore, the reduction of noise in the semantic segmentation of the floor plan is in demand. To this end, we propose direction-aware, learnable, additive kernels and apply them in both the context module and common convolutional blocks, to achieve high performance on elements with both common and irregular shapes. In addition, an adversarial network with two discriminators is proposed to further improve the accuracy of the elements and to reduce the noise of the semantic segmentation. Experimental results demonstrate the superiority and effectiveness of the proposed network over the state-of-the-art methods. |
Tasks | Semantic Segmentation |
Published | 2020-01-30 |
URL | https://arxiv.org/abs/2001.11194v1 |
https://arxiv.org/pdf/2001.11194v1.pdf | |
PWC | https://paperswithcode.com/paper/the-direction-aware-learnable-additive |
Repo | |
Framework | |
Towards Open-Set Semantic Segmentation of Aerial Images
Title | Towards Open-Set Semantic Segmentation of Aerial Images |
Authors | Caio C. V. da Silva, Keiller Nogueira, Hugo N. Oliveira, Jefersson A. dos Santos |
Abstract | Classical and more recently deep computer vision methods are optimized for visible spectrum images, commonly encoded in grayscale or RGB colorspaces acquired from smartphones or cameras. A more uncommon source of images exploited in the remote sensing field are satellite and aerial images. However, the development of pattern recognition approaches for these data is relatively recent, mainly due to the limited availability of this type of images, as until recently they were used exclusively for military purposes. Access to aerial imagery, including spectral information, has been increasing mainly due to the low cost of drones, cheapening of imaging satellite launch costs, and novel public datasets. Usually remote sensing applications employ computer vision techniques strictly modeled for classification tasks in closed set scenarios. However, real-world tasks rarely fit into closed set contexts, frequently presenting previously unknown classes, characterizing them as open set scenarios. Focusing on this problem, this is the first paper to study and develop semantic segmentation techniques for open set scenarios applied to remote sensing images. The main contributions of this paper are: 1) a discussion of related works in open set semantic segmentation, showing evidence that these techniques can be adapted for open set remote sensing tasks; 2) the development and evaluation of a novel approach for open set semantic segmentation. Our method yielded competitive results when compared to closed set methods for the same dataset. |
Tasks | Semantic Segmentation |
Published | 2020-01-27 |
URL | https://arxiv.org/abs/2001.10063v1 |
https://arxiv.org/pdf/2001.10063v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-open-set-semantic-segmentation-of |
Repo | |
Framework | |
Blockchain meets Biometrics: Concepts, Application to Template Protection, and Trends
Title | Blockchain meets Biometrics: Concepts, Application to Template Protection, and Trends |
Authors | Oscar Delgado-Mohatar, Julian Fierrez, Ruben Tolosana, Ruben Vera-Rodriguez |
Abstract | Blockchain technologies provide excellent architectures and practical tools for securing and managing the sensitive and private data stored in biometric templates, but at a cost. We discuss opportunities and challenges in the integration of blockchain and biometrics, with emphasis on biometric template storage and protection, a key problem in biometrics still largely unsolved. Key tradeoffs involved in that integration, namely latency, processing time, economic cost, and biometric performance, are experimentally studied through the implementation of a smart contract on the Ethereum blockchain platform, which is publicly available on GitHub for research purposes. |
Tasks | |
Published | 2020-03-19 |
URL | https://arxiv.org/abs/2003.09262v1 |
https://arxiv.org/pdf/2003.09262v1.pdf | |
PWC | https://paperswithcode.com/paper/blockchain-meets-biometrics-concepts |
Repo | |
Framework | |
Self-Supervised Spatio-Temporal Representation Learning Using Variable Playback Speed Prediction
Title | Self-Supervised Spatio-Temporal Representation Learning Using Variable Playback Speed Prediction |
Authors | Hyeon Cho, Taehoon Kim, Hyung Jin Chang, Wonjun Hwang |
Abstract | We propose a self-supervised learning method by predicting the variable playback speeds of a video. Without semantic labels, we learn the spatio-temporal representation of the video by leveraging the variations in the visual appearance according to different playback speeds under the assumption of temporal coherence. To learn the spatio-temporal variations in the entire video, we have not only predicted a single playback speed but also generated clips of various playback speeds with randomized starting points. We then train a 3D convolutional network by solving the formulation that sorts the shuffled clips by their playback speed. In this case, the playback speed includes both forward and reverse directions; hence the visual representation can be successfully learned from the directional dynamics of the video. We also propose a novel layer-dependable temporal group normalization method that can be applied to 3D convolutional networks to improve the representation learning performance where we divide the temporal features into several groups and normalize each one using the different corresponding parameters. We validate the effectiveness of the proposed method by fine-tuning it to the action recognition task. The experimental results show that the proposed method outperforms state-of-the-art self-supervised learning methods in action recognition. |
Tasks | Representation Learning |
Published | 2020-03-05 |
URL | https://arxiv.org/abs/2003.02692v1 |
https://arxiv.org/pdf/2003.02692v1.pdf | |
PWC | https://paperswithcode.com/paper/self-supervised-spatio-temporal-2 |
Repo | |
Framework | |
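The clip-generation step described in the abstract above (clips at several playback speeds, forward and reverse, with randomized starting points, then sorted by speed) can be sketched at the level of frame indices. The abstract gives no implementation details, so everything here is an assumption: speed is modelled as a signed frame stride, and the function names `sample_clip_indices` and `make_sorting_task` are hypothetical.

```python
import random


def sample_clip_indices(num_frames, clip_len, speed, rng):
    """Frame indices of one clip.

    speed is a non-zero integer: |speed| is the frame stride and a negative
    sign means reverse playback. The starting point is drawn at random.
    """
    stride = abs(speed)
    if stride == 0:
        raise ValueError("speed must be non-zero")
    span = (clip_len - 1) * stride + 1  # frames covered by the clip
    if span > num_frames:
        raise ValueError("video too short for this clip length and speed")
    start = rng.randrange(num_frames - span + 1)
    indices = [start + i * stride for i in range(clip_len)]
    return indices[::-1] if speed < 0 else indices


def make_sorting_task(num_frames, clip_len, speeds, rng):
    """Shuffled clips plus the permutation that sorts them by playback speed."""
    order = list(range(len(speeds)))
    rng.shuffle(order)
    shuffled_speeds = [speeds[j] for j in order]
    clips = [sample_clip_indices(num_frames, clip_len, s, rng)
             for s in shuffled_speeds]
    # target[k] is the position (in the shuffled list) of the k-th slowest clip
    target = sorted(range(len(clips)), key=lambda i: shuffled_speeds[i])
    return clips, shuffled_speeds, target


rng = random.Random(0)
clips, shuffled_speeds, target = make_sorting_task(
    num_frames=120, clip_len=8, speeds=[-4, -2, -1, 1, 2, 4], rng=rng)
```

In a training pipeline the index lists would select frames for a 3D convolutional network and `target` would serve as the label; that part is omitted here.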
Weighted Empirical Risk Minimization: Sample Selection Bias Correction based on Importance Sampling
Title | Weighted Empirical Risk Minimization: Sample Selection Bias Correction based on Importance Sampling |
Authors | Robin Vogel, Mastane Achab, Stéphan Clémençon, Charles Tillier |
Abstract | We consider statistical learning problems in which the distribution $P'$ of the training observations $Z'_1, \ldots, Z'_n$ differs from the distribution $P$ involved in the risk one seeks to minimize (referred to as the test distribution) but is still defined on the same measurable space as $P$ and dominates it. In the unrealistic case where the likelihood ratio $\Phi(z)=dP/dP'(z)$ is known, one may straightforwardly extend the Empirical Risk Minimization (ERM) approach to this specific transfer learning setup using the same idea as that behind Importance Sampling, by minimizing a weighted version of the empirical risk functional computed from the 'biased' training data $Z'_i$ with weights $\Phi(Z'_i)$. Although the importance function $\Phi(z)$ is generally unknown in practice, we show that, in various situations frequently encountered in practice, it takes a simple form and can be directly estimated from the $Z'_i$'s and some auxiliary information on the statistical population $P$. By means of linearization techniques, we then prove that the generalization capacity of the aforementioned approach is preserved when plugging the resulting estimates of the $\Phi(Z'_i)$'s into the weighted empirical risk. Beyond these theoretical guarantees, numerical results provide strong empirical evidence of the relevance of the approach promoted in this article. |
Tasks | Transfer Learning |
Published | 2020-02-12 |
URL | https://arxiv.org/abs/2002.05145v2 |
https://arxiv.org/pdf/2002.05145v2.pdf | |
PWC | https://paperswithcode.com/paper/weighted-empirical-risk-minimization-sample |
Repo | |
Framework | |
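The weighted empirical risk from the abstract above, $\frac{1}{n}\sum_{i} \Phi(Z'_i)\,\ell(\theta, Z'_i)$, can be illustrated on a toy problem. This sketch assumes the likelihood ratio $\Phi$ is known exactly (the case the abstract itself calls unrealistic), uses a squared loss to estimate a mean, and takes a two-point sample space; the distributions and all function names are hypothetical choices for illustration.

```python
def weighted_empirical_risk(loss, theta, samples, weights):
    """Average of weights[i] * loss(theta, samples[i]) over the biased sample."""
    n = len(samples)
    return sum(w * loss(theta, z) for z, w in zip(samples, weights)) / n


def squared_loss(theta, z):
    return (theta - z) ** 2


# Toy setup (assumption): the test distribution P is uniform on {0, 1},
# while the training distribution P' puts mass 0.8 on 0 and 0.2 on 1.
p_test = {0: 0.5, 1: 0.5}
p_train = {0: 0.8, 1: 0.2}
phi = {z: p_test[z] / p_train[z] for z in p_test}  # likelihood ratio dP/dP'

# A biased training sample whose empirical frequencies match P' exactly.
train = [0] * 8 + [1] * 2
weights = [phi[z] for z in train]

# For the squared loss, the weighted risk is minimized by the weighted mean.
theta_weighted = sum(w * z for z, w in zip(train, weights)) / sum(weights)
theta_unweighted = sum(train) / len(train)

print(theta_weighted, theta_unweighted)
```

The unweighted minimizer recovers the mean under $P'$ (0.2), while the importance-weighted one recovers the mean under $P$ (0.5), which is the quantity the test risk cares about.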
Siamese Graph Neural Networks for Data Integration
Title | Siamese Graph Neural Networks for Data Integration |
Authors | Evgeny Krivosheev, Mattia Atzeni, Katsiaryna Mirylenka, Paolo Scotton, Fabio Casati |
Abstract | Data integration has been studied extensively for decades and approached from different angles. However, this domain still remains largely rule-driven and lacks universal automation. Recent development in machine learning and in particular deep learning has opened the way to more general and more efficient solutions to data integration problems. In this work, we propose a general approach to modeling and integrating entities from structured data, such as relational databases, as well as unstructured sources, such as free text from news articles. Our approach is designed to explicitly model and leverage relations between entities, thereby using all available information and preserving as much context as possible. This is achieved by combining siamese and graph neural networks to propagate information between connected entities and support high scalability. We evaluate our method on the task of integrating data about business entities, and we demonstrate that it outperforms standard rule-based systems, as well as other deep learning approaches that do not use graph-based representations. |
Tasks | |
Published | 2020-01-17 |
URL | https://arxiv.org/abs/2001.06543v1 |
https://arxiv.org/pdf/2001.06543v1.pdf | |
PWC | https://paperswithcode.com/paper/siamese-graph-neural-networks-for-data |
Repo | |
Framework | |
Aleatoric and Epistemic Uncertainty with Random Forests
Title | Aleatoric and Epistemic Uncertainty with Random Forests |
Authors | Mohammad Hossein Shaker, Eyke Hüllermeier |
Abstract | Due to the steadily increasing relevance of machine learning for practical applications, many of which come with safety requirements, the notion of uncertainty has received increasing attention in machine learning research in the last couple of years. In particular, the idea of distinguishing between two important types of uncertainty, often referred to as aleatoric and epistemic, has recently been studied in the setting of supervised learning. In this paper, we propose to quantify these uncertainties with random forests. More specifically, we show how two general approaches for measuring the learner’s aleatoric and epistemic uncertainty in a prediction can be instantiated with decision trees and random forests as learning algorithms in a classification setting. In this regard, we also compare random forests with deep neural networks, which have been used for a similar purpose. |
Tasks | |
Published | 2020-01-03 |
URL | https://arxiv.org/abs/2001.00893v1 |
https://arxiv.org/pdf/2001.00893v1.pdf | |
PWC | https://paperswithcode.com/paper/aleatoric-and-epistemic-uncertainty-with |
Repo | |
Framework | |
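The abstract above does not spell out its two approaches. As an illustration only, the sketch below uses one widely used entropy-based decomposition for ensembles, which may or may not match the paper's exact formulation: total uncertainty is the entropy of the averaged class distribution, aleatoric uncertainty is the average entropy of the individual trees' distributions, and epistemic uncertainty is the difference. The per-tree probabilities are made-up inputs and the function names are hypothetical.

```python
import math


def entropy(p):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(q * math.log2(q) for q in p if q > 0)


def decompose(tree_probs):
    """Split ensemble uncertainty into (total, aleatoric, epistemic).

    tree_probs holds one class-probability vector per tree for a single query.
    """
    m = len(tree_probs)
    k = len(tree_probs[0])
    mean = [sum(p[c] for p in tree_probs) / m for c in range(k)]
    total = entropy(mean)                                # entropy of the average
    aleatoric = sum(entropy(p) for p in tree_probs) / m  # average of entropies
    return total, aleatoric, total - aleatoric


# Every tree says "50/50": the uncertainty is irreducible noise (aleatoric).
agree = decompose([[0.5, 0.5], [0.5, 0.5]])

# Trees are individually certain but contradict each other: the uncertainty
# comes from lack of knowledge about the right model (epistemic).
disagree = decompose([[1.0, 0.0], [0.0, 1.0]])
```

Both queries have the same total uncertainty of one bit, yet the split between the two components is opposite, which is the distinction the abstract is concerned with.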
An empirical study of Conv-TasNet
Title | An empirical study of Conv-TasNet |
Authors | Berkan Kadioglu, Michael Horgan, Xiaoyu Liu, Jordi Pons, Dan Darcy, Vivek Kumar |
Abstract | Conv-TasNet is a recently proposed waveform-based deep neural network that achieves state-of-the-art performance in speech source separation. Its architecture consists of a learnable encoder/decoder and a separator that operates on top of this learned space. Various improvements have been proposed to Conv-TasNet. However, they mostly focus on the separator, leaving its encoder/decoder as a (shallow) linear operator. In this paper, we conduct an empirical study of Conv-TasNet and propose an enhancement to the encoder/decoder that is based on a (deep) non-linear variant of it. In addition, we experiment with the larger and more diverse LibriTTS dataset and investigate the generalization capabilities of the studied models when trained on a much larger dataset. We propose cross-dataset evaluation that includes assessing separations from the WSJ0-2mix, LibriTTS and VCTK databases. Our results show that enhancements to the encoder/decoder can improve average SI-SNR performance by more than 1 dB. Furthermore, we offer insights into the generalization capabilities of Conv-TasNet and the potential value of improvements to the encoder/decoder. |
Tasks | |
Published | 2020-02-20 |
URL | https://arxiv.org/abs/2002.08688v2 |
https://arxiv.org/pdf/2002.08688v2.pdf | |
PWC | https://paperswithcode.com/paper/an-empirical-study-of-conv-tasnet |
Repo | |
Framework | |
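The SI-SNR figure quoted in the abstract above is the scale-invariant signal-to-noise ratio. The abstract does not define it, so the sketch below follows the definition commonly used in the source separation literature, as an assumption: both signals are zero-centred, the estimate is projected onto the target, and the energy ratio between the projection and the residual is reported in dB. The `eps` stabiliser and the toy signals are illustrative choices.

```python
import math


def si_snr(estimate, target, eps=1e-8):
    """Scale-invariant SNR in dB between two equal-length signals."""
    n = len(target)
    mean_e = sum(estimate) / n
    mean_t = sum(target) / n
    e = [x - mean_e for x in estimate]  # zero-centre both signals
    t = [x - mean_t for x in target]

    # Project the estimate onto the target direction.
    scale = sum(a * b for a, b in zip(e, t)) / (sum(b * b for b in t) + eps)
    projection = [scale * b for b in t]
    residual = [a - p for a, p in zip(e, projection)]

    signal_energy = sum(p * p for p in projection)
    noise_energy = sum(r * r for r in residual)
    return 10 * math.log10((signal_energy + eps) / (noise_energy + eps))


target = [1.0, -1.0, 1.0, -1.0]
estimate = [1.1, -0.9, 0.9, -1.1]

score = si_snr(estimate, target)
# Rescaling the estimate leaves the score essentially unchanged.
score_rescaled = si_snr([3.0 * x for x in estimate], target)
```

For these toy signals the projection has energy 4 and the residual 0.04, so the score is about 20 dB; an improvement of 1 dB corresponds to roughly a 26% increase in that energy ratio.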
CNN-based Driver Drowsiness Detection
Title | CNN-based Driver Drowsiness Detection |
Authors | Maryam Hashemi, Alireza Mirrashid, Aliasghar Beheshti Shirazi |
Abstract | This paper presents a novel system for driver drowsiness detection. In this system, Convolutional Neural Networks (CNNs) are used for driver eye monitoring with two goals of real-time application in mind, high accuracy and speed, and a new dataset for eye closure detection is introduced. Three networks are introduced as candidates for eye status classification: one is a fully designed neural network (FD-NN), and the others use transfer learning with VGG16 and VGG19 with extra designed layers (TL-VGG). The lack of an available and accurate eye dataset is strongly felt in the area of eye closure detection. Therefore, a new comprehensive dataset is proposed. The experimental results show the high accuracy and low computational complexity of the estimations and the ability of the proposed framework to detect drowsiness. |
Tasks | Transfer Learning |
Published | 2020-01-15 |
URL | https://arxiv.org/abs/2001.05137v2 |
https://arxiv.org/pdf/2001.05137v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-based-driver-distraction-and |
Repo | |
Framework | |
Fisheye Distortion Rectification from Deep Straight Lines
Title | Fisheye Distortion Rectification from Deep Straight Lines |
Authors | Zhu-Cun Xue, Nan Xue, Gui-Song Xia |
Abstract | This paper presents a novel line-aware rectification network (LaRecNet) to address the problem of fisheye distortion rectification based on the classical observation that straight lines in 3D space should be still straight in image planes. Specifically, the proposed LaRecNet contains three sequential modules to (1) learn the distorted straight lines from fisheye images; (2) estimate the distortion parameters from the learned heatmaps and the image appearance; and (3) rectify the input images via a proposed differentiable rectification layer. To better train and evaluate the proposed model, we create a synthetic line-rich fisheye (SLF) dataset that contains the distortion parameters and well-annotated distorted straight lines of fisheye images. The proposed method enables us to simultaneously calibrate the geometric distortion parameters and rectify fisheye images. Extensive experiments demonstrate that our model achieves state-of-the-art performance in terms of both geometric accuracy and image quality on several evaluation metrics. In particular, the images rectified by LaRecNet achieve an average reprojection error of 0.33 pixels on the SLF dataset and produce the highest peak signal-to-noise ratio (PSNR) and structure similarity index (SSIM) compared with the groundtruth. |
Tasks | |
Published | 2020-03-25 |
URL | https://arxiv.org/abs/2003.11386v1 |
https://arxiv.org/pdf/2003.11386v1.pdf | |
PWC | https://paperswithcode.com/paper/fisheye-distortion-rectification-from-deep |
Repo | |
Framework | |