Paper Group ANR 575
Nonlinear Markov Clustering by Minimum Curvilinear Sparse Similarity
Title | Nonlinear Markov Clustering by Minimum Curvilinear Sparse Similarity |
Authors | C. Duran, A. Acevedo, S. Ciucci, A. Muscoloni, CV. Cannistraci |
Abstract | The development of algorithms for unsupervised pattern recognition by nonlinear clustering is a notable problem in data science. Markov clustering (MCL) is a renowned algorithm that simulates stochastic flows on a network of sample similarities to detect the structural organization of clusters in the data, but it has never been generalized to deal with data nonlinearity. Minimum Curvilinearity (MC) is a principle that approximates nonlinear sample distances in the high-dimensional feature space by curvilinear distances, which are computed as transversal paths over their minimum spanning tree, and then stored in a kernel. Here we propose MC-MCL, which is the first nonlinear kernel extension of MCL and exploits Minimum Curvilinearity to enhance the performance of MCL in real and synthetic data with underlying nonlinear patterns. MC-MCL is compared with baseline clustering methods, including DBSCAN, K-means and affinity propagation. We find that Minimum Curvilinearity provides a valuable framework to estimate nonlinear distances also when its kernel is applied in combination with MCL. Indeed, MC-MCL outperforms classical MCL and even baseline clustering algorithms on different nonlinear datasets. |
Tasks | |
Published | 2019-12-27 |
URL | https://arxiv.org/abs/1912.12211v1 |
https://arxiv.org/pdf/1912.12211v1.pdf | |
PWC | https://paperswithcode.com/paper/nonlinear-markov-clustering-by-minimum |
Repo | |
Framework | |
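The Minimum Curvilinearity idea described in the abstract above (distances measured along paths of the minimum spanning tree) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `minimum_spanning_tree` and `mst_path_distances` are hypothetical, Euclidean distance is assumed as the base metric, and any kernel centering or distance transformation used in MC-MCL is omitted.

```python
import math
from collections import deque


def euclidean(a, b):
    """Straight-line distance between two points (assumed base metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def minimum_spanning_tree(points):
    """Prim's algorithm on the complete Euclidean graph; returns an adjacency list."""
    n = len(points)
    adjacency = {i: [] for i in range(n)}
    if n == 0:
        return adjacency
    in_tree = [False] * n
    best_cost = [math.inf] * n   # cheapest known edge linking each node to the tree
    best_parent = [-1] * n       # tree endpoint of that cheapest edge
    best_cost[0] = 0.0
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best_cost[i])
        in_tree[u] = True
        if best_parent[u] >= 0:
            adjacency[u].append((best_parent[u], best_cost[u]))
            adjacency[best_parent[u]].append((u, best_cost[u]))
        for v in range(n):
            if not in_tree[v]:
                d = euclidean(points[u], points[v])
                if d < best_cost[v]:
                    best_cost[v] = d
                    best_parent[v] = u
    return adjacency


def mst_path_distances(points):
    """Pairwise distances measured along the unique MST path between samples."""
    adjacency = minimum_spanning_tree(points)
    n = len(points)
    dist = [[0.0] * n for _ in range(n)]
    for source in range(n):
        # A tree has exactly one path between two nodes, so a plain traversal suffices.
        seen = {source}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v, w in adjacency[u]:
                if v not in seen:
                    seen.add(v)
                    dist[source][v] = dist[source][u] + w
                    queue.append(v)
    return dist
```

For the three points `(0, 0)`, `(1, 0)`, `(1, 1)`, the tree path from the first to the last point has length 2.0, whereas the straight-line distance is about 1.41. The tree-based distance follows the shape of the data instead of cutting across it.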
Learning better generative models for dexterous, single-view grasping of novel objects
Title | Learning better generative models for dexterous, single-view grasping of novel objects |
Authors | Marek Kopicki, Dominik Belter, Jeremy L. Wyatt |
Abstract | This paper concerns the problem of how to learn to grasp dexterously, so as to be able to then grasp novel objects seen only from a single view-point. Recently, progress has been made in data-efficient learning of generative grasp models which transfer well to novel objects. These generative grasp models are learned from demonstration (LfD). One weakness is that, as this paper shall show, grasp transfer under challenging single view conditions is unreliable. Second, the number of generative model elements rises linearly in the number of training examples. This, in turn, limits the potential of these generative models for generalisation and continual improvement. In this paper, it is shown how to address these problems. Several technical contributions are made: (i) a view-based model of a grasp; (ii) a method for combining and compressing multiple grasp models; (iii) a new way of evaluating contacts that is used both to generate and to score grasps. These, together, both improve grasp performance and reduce the number of models learned for grasp transfer. These advances, in turn, also allow the introduction of autonomous training, in which the robot learns from self-generated grasps. Evaluation on a challenging test set shows that, with innovations (i)-(iii) deployed, grasp transfer success rises from 55.1% to 81.6%. By adding autonomous training this rises to 87.8%. These differences are statistically significant. In total, across all experiments, 539 test grasps were executed on real objects. |
Tasks | |
Published | 2019-07-13 |
URL | https://arxiv.org/abs/1907.06053v1 |
https://arxiv.org/pdf/1907.06053v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-better-generative-models-for |
Repo | |
Framework | |
Algorithmic Discrimination: Formulation and Exploration in Deep Learning-based Face Biometrics
Title | Algorithmic Discrimination: Formulation and Exploration in Deep Learning-based Face Biometrics |
Authors | Ignacio Serna, Aythami Morales, Julian Fierrez, Manuel Cebrian, Nick Obradovich, Iyad Rahwan |
Abstract | The most popular face recognition benchmarks assume a distribution of subjects without much attention to their demographic attributes. In this work, we perform a comprehensive discrimination-aware experimentation of deep learning-based face recognition. The main aim of this study is a better understanding of the feature space generated by deep models, and the performance achieved over different demographic groups. We also propose a general formulation of algorithmic discrimination with application to face biometrics. The experiments are conducted over the new DiveFace database composed of 24K identities from six different demographic groups. Two popular face recognition models are considered in the experimental framework: ResNet-50 and VGG-Face. We experimentally show that demographic groups highly represented in popular face databases have led to popular pre-trained deep face models presenting strong algorithmic discrimination. That discrimination can be observed both qualitatively at the feature space of the deep models and quantitatively in large performance differences when applying those models to different demographic groups, e.g. for face biometrics. |
Tasks | Face Recognition |
Published | 2019-12-04 |
URL | https://arxiv.org/abs/1912.01842v1 |
https://arxiv.org/pdf/1912.01842v1.pdf | |
PWC | https://paperswithcode.com/paper/algorithmic-discrimination-formulation-and |
Repo | |
Framework | |
Robust Morph-Detection at Automated Border Control Gate using Deep Decomposed 3D Shape and Diffuse Reflectance
Title | Robust Morph-Detection at Automated Border Control Gate using Deep Decomposed 3D Shape and Diffuse Reflectance |
Authors | Jag Mohan Singh, Raghavendra Ramachandra, Kiran B. Raja, Christoph Busch |
Abstract | Face recognition is widely employed in Automated Border Control (ABC) gates, which verify the face image on a passport or electronic Machine Readable Travel Document (eMRTD) against the captured image to confirm the identity of the passport holder. In this paper, we present a robust morph detection algorithm that is based on differential morph detection. The proposed method decomposes the bona fide image captured from the ABC gate and the digital face image extracted from the eMRTD into the diffuse reconstructed image and a quantized normal map. The extracted features are further used to learn a linear classifier (SVM) to detect a morphing attack based on the assessment of differences between the bona fide image from the ABC gate and the digital face image extracted from the passport. Owing to the availability of multiple cameras within an ABC gate, we extend the proposed method to fuse the classification scores to generate the final decision on morph-attack-detection. To validate our proposed algorithm, we create a morph attack database with 588 images overall, where bona fide images are captured in an indoor lighting environment with a Canon DSLR Camera with one sample per subject, together with corresponding images from ABC gates. We benchmark our proposed method with the existing state-of-the-art and can state that the new approach significantly outperforms previous approaches in the ABC gate scenario. |
Tasks | Face Recognition |
Published | 2019-12-03 |
URL | https://arxiv.org/abs/1912.01372v1 |
https://arxiv.org/pdf/1912.01372v1.pdf | |
PWC | https://paperswithcode.com/paper/robust-morph-detection-at-automated-border |
Repo | |
Framework | |
Temporal Logics Over Finite Traces with Uncertainty (Technical Report)
Title | Temporal Logics Over Finite Traces with Uncertainty (Technical Report) |
Authors | Fabrizio M. Maggi, Marco Montali, Rafael Peñaloza |
Abstract | Temporal logics over finite traces have recently seen wide application in a number of areas, from business process modelling, monitoring, and mining to planning and decision making. However, real-life dynamic systems contain a degree of uncertainty which cannot be handled with classical logics. We thus propose a new probabilistic temporal logic over finite traces using superposition semantics, where all possible evolutions are possible, until observed. We study the properties of the logic and provide automata-based mechanisms for deriving probabilistic inferences from its formulas. We then study a fragment of the logic with better computational properties. Notably, formulas in this fragment can be discovered from event log data using off-the-shelf existing declarative process discovery techniques. |
Tasks | Decision Making |
Published | 2019-03-12 |
URL | https://arxiv.org/abs/1903.04940v2 |
https://arxiv.org/pdf/1903.04940v2.pdf | |
PWC | https://paperswithcode.com/paper/probabilistic-temporal-logic-over-finite |
Repo | |
Framework | |
Tighter Problem-Dependent Regret Bounds in Reinforcement Learning without Domain Knowledge using Value Function Bounds
Title | Tighter Problem-Dependent Regret Bounds in Reinforcement Learning without Domain Knowledge using Value Function Bounds |
Authors | Andrea Zanette, Emma Brunskill |
Abstract | Strong worst-case performance bounds for episodic reinforcement learning exist but fortunately in practice RL algorithms perform much better than such bounds would predict. Algorithms and theory that provide strong problem-dependent bounds could help illuminate the key features of what makes an RL problem hard and reduce the barrier to using RL algorithms in practice. As a step towards this we derive an algorithm for finite horizon discrete MDPs and associated analysis that both yields state-of-the-art worst-case regret bounds in the dominant terms and yields substantially tighter bounds if the RL environment has small environmental norm, which is a function of the variance of the next-state value functions. An important benefit of our algorithm is that it does not require a priori knowledge of a bound on the environmental norm. As a result of our analysis, we also help address an open learning theory question (Jiang and Agarwal, 2018) about episodic MDPs with a constant upper-bound on the sum of rewards, providing a regret bound with no $H$-dependence in the leading term that scales as a polynomial function of the number of episodes. |
Tasks | |
Published | 2019-01-01 |
URL | https://arxiv.org/abs/1901.00210v4 |
https://arxiv.org/pdf/1901.00210v4.pdf | |
PWC | https://paperswithcode.com/paper/tighter-problem-dependent-regret-bounds-in |
Repo | |
Framework | |
Modeling and Interpreting Real-world Human Risk Decision Making with Inverse Reinforcement Learning
Title | Modeling and Interpreting Real-world Human Risk Decision Making with Inverse Reinforcement Learning |
Authors | Quanying Liu, Haiyan Wu, Anqi Liu |
Abstract | We model human decision-making behaviors in a risk-taking task using inverse reinforcement learning (IRL) for the purposes of understanding real human decision making under risk. To the best of our knowledge, this is the first work applying IRL to reveal the implicit reward function in human risk-taking decision making and to interpret risk-prone and risk-averse decision-making policies. We hypothesize that the state history (e.g. rewards and decisions in previous trials) is related to the human reward function, which leads to risk-averse and risk-prone decisions. We design features that reflect these factors in the reward function of IRL and learn the corresponding weights, which are interpretable as the importance of features. The results confirm the sub-optimal risk-related decisions of humans, driven by the personalized reward function. In particular, the risk-prone person tends to decide based on the current pump number, while the risk-averse person relies on burst information from the previous trial and the average end status. Our results demonstrate that IRL is an effective tool to model human decision-making behavior, as well as to help interpret the human psychological process in risk decision-making. |
Tasks | Decision Making |
Published | 2019-06-13 |
URL | https://arxiv.org/abs/1906.05803v1 |
https://arxiv.org/pdf/1906.05803v1.pdf | |
PWC | https://paperswithcode.com/paper/modeling-and-interpreting-real-world-human |
Repo | |
Framework | |
Public Sphere 2.0: Targeted Commenting in Online News Media
Title | Public Sphere 2.0: Targeted Commenting in Online News Media |
Authors | Ankan Mullick, Sayan Ghosh, Ritam Dutt, Avijit Ghosh, Abhijnan Chakraborty |
Abstract | With the increase in online news consumption, to maximize advertisement revenue, news media websites try to attract and retain their readers on their sites. One of the most effective tools for reader engagement is commenting, where news readers post their views as comments against the news articles. Traditionally, it has been assumed that the comments are mostly made against the full article. In this work, we show that the present commenting landscape is far from this assumption. Because the readers lack the time to go over an entire article, most of the comments are relevant to only particular sections of an article. In this paper, we build a system which can automatically classify comments against relevant sections of an article. To implement that, we develop a deep neural network-based mechanism to find comments relevant to any section and a paragraph-wise commenting interface to showcase them. We believe that such a data-driven commenting system can help news websites to further increase reader engagement. |
Tasks | |
Published | 2019-02-21 |
URL | http://arxiv.org/abs/1902.07946v1 |
http://arxiv.org/pdf/1902.07946v1.pdf | |
PWC | https://paperswithcode.com/paper/public-sphere-20-targeted-commenting-in |
Repo | |
Framework | |
Lattice-based lightly-supervised acoustic model training
Title | Lattice-based lightly-supervised acoustic model training |
Authors | Joachim Fainberg, Ondřej Klejch, Steve Renals, Peter Bell |
Abstract | In the broadcast domain there is an abundance of related text data and partial transcriptions, such as closed captions and subtitles. This text data can be used for lightly supervised training, in which text matching the audio is selected using an existing speech recognition model. Current approaches to light supervision typically filter the data based on matching error rates between the transcriptions and biased decoding hypotheses. In contrast, semi-supervised training does not require matching text data, instead generating a hypothesis using a background language model. State-of-the-art semi-supervised training uses lattice-based supervision with the lattice-free MMI (LF-MMI) objective function. We propose a technique to combine inaccurate transcriptions with the lattices generated for semi-supervised training, thus preserving uncertainty in the lattice where appropriate. We demonstrate that this combined approach reduces the expected error rates over the lattices, and reduces the word error rate (WER) on a broadcast task. |
Tasks | Language Modelling, Speech Recognition, Text Matching |
Published | 2019-05-30 |
URL | https://arxiv.org/abs/1905.13150v2 |
https://arxiv.org/pdf/1905.13150v2.pdf | |
PWC | https://paperswithcode.com/paper/lattice-based-lightly-supervised-acoustic |
Repo | |
Framework | |
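The word error rate (WER) reported in the abstract above is conventionally computed as the word-level edit distance between a reference transcript and a hypothesis, divided by the number of reference words. A minimal sketch, assuming whitespace tokenisation and equal cost for substitutions, insertions and deletions (the function name `word_error_rate` is hypothetical, and the paper's own scoring setup may differ):

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # previous[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]  # turning i reference words into an empty hypothesis costs i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)
```

For the reference "the cat sat on the mat" and the hypothesis "the cat sit on mat", there is one substitution and one deletion, so the function returns 2/6.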
Neural Machine Translation: A Review
Title | Neural Machine Translation: A Review |
Authors | Felix Stahlberg |
Abstract | The field of machine translation (MT), the automatic translation of written text from one natural language into another, has experienced a major paradigm shift in recent years. Statistical MT, which mainly relies on various count-based models and which used to dominate MT research for decades, has largely been superseded by neural machine translation (NMT), which tackles translation with a single neural network. In this work we will trace back the origins of modern NMT architectures to word and sentence embeddings and earlier examples of the encoder-decoder network family. We will conclude with a survey of recent trends in the field. |
Tasks | Machine Translation, Sentence Embeddings |
Published | 2019-12-04 |
URL | https://arxiv.org/abs/1912.02047v1 |
https://arxiv.org/pdf/1912.02047v1.pdf | |
PWC | https://paperswithcode.com/paper/neural-machine-translation-a-review |
Repo | |
Framework | |
Dense 3D Face Decoding over 2500FPS: Joint Texture & Shape Convolutional Mesh Decoders
Title | Dense 3D Face Decoding over 2500FPS: Joint Texture & Shape Convolutional Mesh Decoders |
Authors | Yuxiang Zhou, Jiankang Deng, Irene Kotsia, Stefanos Zafeiriou |
Abstract | 3D Morphable Models (3DMMs) are statistical models that represent facial texture and shape variations using a set of linear bases and, more particularly, Principal Component Analysis (PCA). 3DMMs were used as statistical priors for reconstructing 3D faces from images by solving non-linear least squares optimization problems. Recently, 3DMMs were used as generative models for training non-linear mappings (i.e., regressors) from image to the parameters of the models via Deep Convolutional Neural Networks (DCNNs). Nevertheless, all of the above methods use either fully connected layers or 2D convolutions on parametric unwrapped UV spaces leading to large networks with many parameters. In this paper, we present the first, to the best of our knowledge, non-linear 3DMMs by learning joint texture and shape auto-encoders using direct mesh convolutions. We demonstrate how these auto-encoders can be used to train very light-weight models that perform Coloured Mesh Decoding (CMD) in-the-wild at a speed of over 2500 FPS. |
Tasks | |
Published | 2019-04-06 |
URL | http://arxiv.org/abs/1904.03525v1 |
http://arxiv.org/pdf/1904.03525v1.pdf | |
PWC | https://paperswithcode.com/paper/dense-3d-face-decoding-over-2500fps-joint |
Repo | |
Framework | |
Demystifying Multi-Faceted Video Summarization: Tradeoff Between Diversity, Representation, Coverage and Importance
Title | Demystifying Multi-Faceted Video Summarization: Tradeoff Between Diversity, Representation, Coverage and Importance |
Authors | Vishal Kaushal, Rishabh Iyer, Khoshrav Doctor, Anurag Sahoo, Pratik Dubal, Suraj Kothawade, Rohan Mahadev, Kunal Dargan, Ganesh Ramakrishnan |
Abstract | This paper addresses automatic summarization of videos in a unified manner. In particular, we propose a framework for multi-faceted summarization for extractive, query-based and entity summarization (summarization at the level of entities like objects, scenes, humans and faces in the video). We investigate several summarization models which capture notions of diversity, coverage, representation and importance, and argue the utility of these different models depending on the application. While most of the prior work on submodular summarization approaches has focused on combining several models and learning weighted mixtures, we focus on the explainability of different models and featurizations, and how they apply to different domains. We also provide implementation details on summarization systems and the different modalities involved. We hope that the study from this paper will give practitioners insights for choosing the right summarization models for the problems at hand. |
Tasks | Video Summarization |
Published | 2019-01-03 |
URL | http://arxiv.org/abs/1901.01153v1 |
http://arxiv.org/pdf/1901.01153v1.pdf | |
PWC | https://paperswithcode.com/paper/demystifying-multi-faceted-video |
Repo | |
Framework | |
Learning More From Less: Towards Strengthening Weak Supervision for Ad-Hoc Retrieval
Title | Learning More From Less: Towards Strengthening Weak Supervision for Ad-Hoc Retrieval |
Authors | Dany Haddad, Joydeep Ghosh |
Abstract | The limited availability of ground truth relevance labels has been a major impediment to the application of supervised methods to ad-hoc retrieval. As a result, unsupervised scoring methods, such as BM25, remain strong competitors to deep learning techniques which have brought on dramatic improvements in other domains, such as computer vision and natural language processing. Recent works have shown that it is possible to take advantage of the performance of these unsupervised methods to generate training data for learning-to-rank models. The key limitation to this line of work is the size of the training set required to surpass the performance of the original unsupervised method, which can be as large as $10^{13}$ training examples. Building on these insights, we propose two methods to reduce the amount of training data required. The first method takes inspiration from crowdsourcing, and leverages multiple unsupervised rankers to generate soft, or noise-aware, training labels. The second identifies harmful, or mislabeled, training examples and removes them from the training set. We show that our methods allow us to surpass the performance of the unsupervised baseline with far fewer training examples than previous works. |
Tasks | Learning-To-Rank |
Published | 2019-07-19 |
URL | https://arxiv.org/abs/1907.08657v1 |
https://arxiv.org/pdf/1907.08657v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-more-from-less-towards-strengthening |
Repo | |
Framework | |
Unbiased Learning to Rank: Counterfactual and Online Approaches
Title | Unbiased Learning to Rank: Counterfactual and Online Approaches |
Authors | Harrie Oosterhuis, Rolf Jagerman, Maarten de Rijke |
Abstract | This tutorial covers and contrasts the two main methodologies in unbiased Learning to Rank (LTR): Counterfactual LTR and Online LTR. There has long been an interest in LTR from user interactions; however, this form of implicit feedback is very biased. In recent years, unbiased LTR methods have been introduced to remove the effect of different types of bias caused by user behavior in search. For instance, a well-addressed type of bias is position bias: the rank at which a document is displayed heavily affects the interactions it receives. Counterfactual LTR methods deal with such types of bias by learning from historical interactions while correcting for the effect of the explicitly modelled biases. Online LTR does not use an explicit user model; in contrast, it learns through an interactive process where randomized results are displayed to the user. Through randomization the effect of different types of bias can be removed from the learning process. Though both methodologies lead to unbiased LTR, their approaches differ considerably, as do their theoretical guarantees, empirical results, effects on the user experience during learning, and applicability. Consequently, for practitioners the choice between the two is very substantial. By providing an overview of both approaches and contrasting them, we aim to provide an essential guide to unbiased LTR so as to aid in understanding and choosing between methodologies. |
Tasks | Learning-To-Rank |
Published | 2019-07-16 |
URL | https://arxiv.org/abs/1907.07260v1 |
https://arxiv.org/pdf/1907.07260v1.pdf | |
PWC | https://paperswithcode.com/paper/unbiased-learning-to-rank-counterfactual-and |
Repo | |
Framework | |
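One common way to correct logged clicks for position bias, as the abstract above describes for counterfactual LTR, is inverse propensity scoring: each click is weighted by the inverse of the probability that its rank was examined. The abstract does not name this estimator, so the following is only an illustrative sketch. The function name `ips_click_estimates`, the log format and the examination propensities are all assumptions.

```python
def ips_click_estimates(logged_impressions, propensity_by_rank):
    """Sum clicks per document, weighting each click by 1 / P(rank is examined).

    logged_impressions: iterable of (doc_id, rank, clicked) tuples (assumed log format).
    propensity_by_rank: dict mapping rank -> examination probability in (0, 1].
    """
    estimates = {}
    for doc_id, rank, clicked in logged_impressions:
        estimates.setdefault(doc_id, 0.0)
        if clicked:
            # A click at a rarely examined rank counts for more than one at the top.
            estimates[doc_id] += 1.0 / propensity_by_rank[rank]
    return estimates


# Hypothetical log: document "b" is always shown at rank 2, which is examined half as often.
logs = [("a", 1, True), ("a", 1, False), ("b", 2, True), ("b", 2, False)]
propensities = {1: 1.0, 2: 0.5}
scores = ips_click_estimates(logs, propensities)
```

With these made-up numbers, both documents received one raw click, but the weighted estimate for "b" is twice that of "a", reflecting that "b" was shown where users look less often.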
Generative adversarial network for segmentation of motion affected neonatal brain MRI
Title | Generative adversarial network for segmentation of motion affected neonatal brain MRI |
Authors | N. Khalili, E. Turk, M. Zreik, M. A. Viergever, M. J. N. L. Benders, I. Isgum |
Abstract | Automatic neonatal brain tissue segmentation in preterm born infants is a prerequisite for evaluation of brain development. However, automatic segmentation is often hampered by motion artifacts caused by infant head movements during image acquisition. Methods have been developed to remove or minimize these artifacts during image reconstruction using frequency domain data. However, frequency domain data might not always be available. Hence, in this study we propose a method for removing motion artifacts from the already reconstructed MR scans. The method employs a generative adversarial network trained with a cycle consistency loss to transform slices affected by motion into slices without motion artifacts, and vice versa. In the experiments 40 T2-weighted coronal MR scans of preterm born infants imaged at 30 weeks postmenstrual age were used. All images contained slices affected by motion artifacts hampering automatic tissue segmentation. To evaluate whether correction allows more accurate image segmentation, the images were segmented into 8 tissue classes: cerebellum, myelinated white matter, basal ganglia and thalami, ventricular cerebrospinal fluid, white matter, brain stem, cortical gray matter, and extracerebral cerebrospinal fluid. Images corrected for motion and corresponding segmentations were qualitatively evaluated using a 5-point Likert scale. Before the correction of motion artifacts, median image quality and quality of corresponding automatic segmentations were assigned grade 2 (poor) and 3 (moderate), respectively. After correction of motion artifacts, they improved to grades 3 and 4, respectively. The results indicate that correction of motion artifacts in the image space using the proposed approach allows accurate segmentation of brain tissue classes in slices affected by motion artifacts. |
Tasks | Image Reconstruction, Semantic Segmentation |
Published | 2019-06-11 |
URL | https://arxiv.org/abs/1906.04704v1 |
https://arxiv.org/pdf/1906.04704v1.pdf | |
PWC | https://paperswithcode.com/paper/generative-adversarial-network-for-1 |
Repo | |
Framework | |