January 29, 2020

3345 words 16 mins read

Paper Group ANR 575

Nonlinear Markov Clustering by Minimum Curvilinear Sparse Similarity

Title Nonlinear Markov Clustering by Minimum Curvilinear Sparse Similarity
Authors C. Duran, A. Acevedo, S. Ciucci, A. Muscoloni, CV. Cannistraci
Abstract The development of algorithms for unsupervised pattern recognition by nonlinear clustering is a notable problem in data science. Markov clustering (MCL) is a renowned algorithm that simulates stochastic flows on a network of sample similarities to detect the structural organization of clusters in the data, but it has never been generalized to deal with data nonlinearity. Minimum Curvilinearity (MC) is a principle that approximates nonlinear sample distances in the high-dimensional feature space by curvilinear distances, which are computed as transversal paths over their minimum spanning tree and then stored in a kernel. Here we propose MC-MCL, the first nonlinear kernel extension of MCL, which exploits Minimum Curvilinearity to enhance the performance of MCL on real and synthetic data with underlying nonlinear patterns. MC-MCL is compared with baseline clustering methods, including DBSCAN, K-means and affinity propagation. We find that Minimum Curvilinearity provides a valuable framework for estimating nonlinear distances even when its kernel is applied in combination with MCL. Indeed, MC-MCL outperforms classical MCL and even the baseline clustering algorithms on different nonlinear datasets.
Tasks
Published 2019-12-27
URL https://arxiv.org/abs/1912.12211v1
PDF https://arxiv.org/pdf/1912.12211v1.pdf
PWC https://paperswithcode.com/paper/nonlinear-markov-clustering-by-minimum
Repo
Framework
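The two ingredients of MC-MCL can be sketched compactly: curvilinear distances as path lengths over the minimum spanning tree, and plain Markov clustering alternating expansion and inflation. The kernel choice and parameter values below are illustrative assumptions, not the authors' exact settings.

```python
import numpy as np

def curvilinear_distances(X):
    """Minimum Curvilinearity: approximate nonlinear sample distances by
    path lengths over the minimum spanning tree (MST) of the Euclidean graph."""
    n = len(X)
    D = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
    # Prim's algorithm for the MST.
    in_tree = np.zeros(n, dtype=bool)
    in_tree[0] = True
    best, parent = D[0].copy(), np.zeros(n, dtype=int)
    edges = []
    for _ in range(n - 1):
        j = int(np.argmin(np.where(in_tree, np.inf, best)))
        edges.append((int(parent[j]), j, D[parent[j], j]))
        in_tree[j] = True
        upd = ~in_tree & (D[j] < best)
        parent[upd], best[upd] = j, D[j][upd]
    # Accumulate path lengths over the tree from every source node.
    adj = {i: [] for i in range(n)}
    for a, b, w in edges:
        adj[a].append((b, w))
        adj[b].append((a, w))
    MC = np.zeros((n, n))
    for s in range(n):
        stack, seen = [(s, 0.0)], {s}
        while stack:
            u, d = stack.pop()
            MC[s, u] = d
            for v, w in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append((v, d + w))
    return MC

def markov_cluster(S, inflation=2.0, iters=50):
    """Plain MCL: make S column-stochastic, then alternate expansion
    (matrix squaring) and inflation (elementwise power + renormalisation)."""
    M = S / S.sum(axis=0, keepdims=True)
    for _ in range(iters):
        M = M @ M
        M = M ** inflation
        M = M / M.sum(axis=0, keepdims=True)
    return np.argmax(M, axis=0)  # attractor index serves as cluster label
```

Feeding `markov_cluster` a kernel of the MST-path distances, e.g. `np.exp(-MC / MC.mean())`, is one way to combine the two pieces in the spirit of the paper.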

Learning better generative models for dexterous, single-view grasping of novel objects

Title Learning better generative models for dexterous, single-view grasping of novel objects
Authors Marek Kopicki, Dominik Belter, Jeremy L. Wyatt
Abstract This paper concerns the problem of how to learn to grasp dexterously, so as to be able to grasp novel objects seen only from a single viewpoint. Recently, progress has been made in data-efficient learning of generative grasp models which transfer well to novel objects. These generative grasp models are learned from demonstration (LfD). One weakness is that, as this paper shall show, grasp transfer under challenging single-view conditions is unreliable. Second, the number of generative model elements rises linearly with the number of training examples. This, in turn, limits the potential of these generative models for generalisation and continual improvement. In this paper, it is shown how to address these problems. Several technical contributions are made: (i) a view-based model of a grasp; (ii) a method for combining and compressing multiple grasp models; (iii) a new way of evaluating contacts that is used both to generate and to score grasps. Together, these both improve grasp performance and reduce the number of models learned for grasp transfer. These advances, in turn, also allow the introduction of autonomous training, in which the robot learns from self-generated grasps. Evaluation on a challenging test set shows that, with innovations (i)-(iii) deployed, grasp transfer success rises from 55.1% to 81.6%. Adding autonomous training raises this to 87.8%. These differences are statistically significant. In total, across all experiments, 539 test grasps were executed on real objects.
Tasks
Published 2019-07-13
URL https://arxiv.org/abs/1907.06053v1
PDF https://arxiv.org/pdf/1907.06053v1.pdf
PWC https://paperswithcode.com/paper/learning-better-generative-models-for
Repo
Framework

Algorithmic Discrimination: Formulation and Exploration in Deep Learning-based Face Biometrics

Title Algorithmic Discrimination: Formulation and Exploration in Deep Learning-based Face Biometrics
Authors Ignacio Serna, Aythami Morales, Julian Fierrez, Manuel Cebrian, Nick Obradovich, Iyad Rahwan
Abstract The most popular face recognition benchmarks assume a distribution of subjects without much attention to their demographic attributes. In this work, we perform a comprehensive discrimination-aware experimentation of deep learning-based face recognition. The main aim of this study is a better understanding of the feature space generated by deep models and of the performance achieved over different demographic groups. We also propose a general formulation of algorithmic discrimination with application to face biometrics. The experiments are conducted on the new DiveFace database, composed of 24K identities from six different demographic groups. Two popular face recognition models are considered in the experimental framework: ResNet-50 and VGG-Face. We experimentally show that demographic groups highly represented in popular face databases have led popular pre-trained deep face models to present strong algorithmic discrimination. That discrimination can be observed both qualitatively in the feature space of the deep models and quantitatively in the large performance differences when applying those models to different demographic groups, e.g. in face biometrics.
Tasks Face Recognition
Published 2019-12-04
URL https://arxiv.org/abs/1912.01842v1
PDF https://arxiv.org/pdf/1912.01842v1.pdf
PWC https://paperswithcode.com/paper/algorithmic-discrimination-formulation-and
Repo
Framework
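The quantitative side of the study, i.e. performance measured separately per demographic group, reduces to a stratified metric. A minimal sketch, where the matcher scores and group labels are synthetic placeholders rather than the paper's data:

```python
import numpy as np

def per_group_accuracy(scores, labels, groups, threshold=0.5):
    """Verification accuracy of a face matcher, stratified by demographic
    group; large gaps between groups indicate algorithmic discrimination."""
    preds = scores >= threshold
    return {g: float(np.mean(preds[groups == g] == labels[groups == g]))
            for g in np.unique(groups)}
```

The gap between the best- and worst-performing group is the kind of quantity a discrimination-aware evaluation reports.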

Robust Morph-Detection at Automated Border Control Gate using Deep Decomposed 3D Shape and Diffuse Reflectance

Title Robust Morph-Detection at Automated Border Control Gate using Deep Decomposed 3D Shape and Diffuse Reflectance
Authors Jag Mohan Singh, Raghavendra Ramachandra, Kiran B. Raja, Christoph Busch
Abstract Face recognition is widely employed in Automated Border Control (ABC) gates, which verify the face image on the passport or electronic Machine Readable Travel Document (eMRTD) against the captured image to confirm the identity of the passport holder. In this paper, we present a robust morph detection algorithm that is based on differential morph detection. The proposed method decomposes the bona fide image captured from the ABC gate and the digital face image extracted from the eMRTD into a diffuse reconstructed image and a quantized normal map. The extracted features are used to learn a linear classifier (SVM) to detect a morphing attack based on the assessment of differences between the bona fide image from the ABC gate and the digital face image extracted from the passport. Owing to the availability of multiple cameras within an ABC gate, we extend the proposed method to fuse the classification scores to generate the final morph-attack-detection decision. To validate our proposed algorithm, we create a morph attack database with 588 images overall, where the bona fide images are captured in an indoor lighting environment with a Canon DSLR camera, with one sample per subject, together with the corresponding images from ABC gates. We benchmark our proposed method against the existing state-of-the-art and show that the new approach significantly outperforms previous approaches in the ABC gate scenario.
Tasks Face Recognition
Published 2019-12-03
URL https://arxiv.org/abs/1912.01372v1
PDF https://arxiv.org/pdf/1912.01372v1.pdf
PWC https://paperswithcode.com/paper/robust-morph-detection-at-automated-border
Repo
Framework
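The multi-camera extension boils down to score-level fusion of the per-camera classifier outputs. Sum-rule (averaging) fusion and a zero threshold are assumptions here, standing in for whatever rule the paper actually uses:

```python
def fuse_morph_scores(camera_scores, threshold=0.0):
    """Average the per-camera SVM decision scores; a fused score above the
    threshold is declared a morphing attack. Sum-rule fusion is an assumed
    stand-in for the paper's score-level fusion."""
    fused = sum(camera_scores) / len(camera_scores)
    return fused, fused > threshold
```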

Temporal Logics Over Finite Traces with Uncertainty (Technical Report)

Title Temporal Logics Over Finite Traces with Uncertainty (Technical Report)
Authors Fabrizio M. Maggi, Marco Montali, Rafael Peñaloza
Abstract Temporal logics over finite traces have recently seen wide application in a number of areas, from business process modelling, monitoring, and mining to planning and decision making. However, real-life dynamic systems contain a degree of uncertainty which cannot be handled with classical logics. We thus propose a new probabilistic temporal logic over finite traces using superposition semantics, in which all possible evolutions coexist until observed. We study the properties of the logic and provide automata-based mechanisms for deriving probabilistic inferences from its formulas. We then study a fragment of the logic with better computational properties. Notably, formulas in this fragment can be discovered from event log data using off-the-shelf existing declarative process discovery techniques.
Tasks Decision Making
Published 2019-03-12
URL https://arxiv.org/abs/1903.04940v2
PDF https://arxiv.org/pdf/1903.04940v2.pdf
PWC https://paperswithcode.com/paper/probabilistic-temporal-logic-over-finite
Repo
Framework

Tighter Problem-Dependent Regret Bounds in Reinforcement Learning without Domain Knowledge using Value Function Bounds

Title Tighter Problem-Dependent Regret Bounds in Reinforcement Learning without Domain Knowledge using Value Function Bounds
Authors Andrea Zanette, Emma Brunskill
Abstract Strong worst-case performance bounds for episodic reinforcement learning exist, but fortunately in practice RL algorithms perform much better than such bounds would predict. Algorithms and theory that provide strong problem-dependent bounds could help illuminate the key features of what makes an RL problem hard and reduce the barrier to using RL algorithms in practice. As a step towards this, we derive an algorithm for finite-horizon discrete MDPs and an associated analysis that both yields state-of-the-art worst-case regret bounds in the dominant terms and yields substantially tighter bounds if the RL environment has small environmental norm, which is a function of the variance of the next-state value functions. An important benefit of our algorithm is that it does not require a priori knowledge of a bound on the environmental norm. As a result of our analysis, we also help address an open learning theory question (Jiang and Agarwal, 2018) about episodic MDPs with a constant upper bound on the sum of rewards, providing a regret bound with no $H$-dependence in the leading term that scales as a polynomial function of the number of episodes.
Tasks
Published 2019-01-01
URL https://arxiv.org/abs/1901.00210v4
PDF https://arxiv.org/pdf/1901.00210v4.pdf
PWC https://paperswithcode.com/paper/tighter-problem-dependent-regret-bounds-in
Repo
Framework
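The environmental norm is described only as a function of the variance of the next-state value functions. One plausible reading, for illustration rather than the paper's exact definition, is the largest next-state-value standard deviation over state-action pairs:

```python
import numpy as np

def environmental_norm(P, V):
    """One plausible reading of the 'environmental norm': the largest
    standard deviation, over state-action pairs, of the next-state value.
    P has shape (S, A, S) (transition probabilities); V has shape (S,).
    This illustrates the quantity's flavour, not the paper's definition."""
    ev = P @ V             # (S, A) expected next-state value
    ev2 = P @ (V ** 2)
    var = np.maximum(ev2 - ev ** 2, 0.0)
    return float(np.sqrt(var).max())
```

Deterministic environments have norm zero, which is why the problem-dependent bound can be much tighter than the worst case.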

Modeling and Interpreting Real-world Human Risk Decision Making with Inverse Reinforcement Learning

Title Modeling and Interpreting Real-world Human Risk Decision Making with Inverse Reinforcement Learning
Authors Quanying Liu, Haiyan Wu, Anqi Liu
Abstract We model human decision-making behaviors in a risk-taking task using inverse reinforcement learning (IRL) for the purpose of understanding real human decision making under risk. To the best of our knowledge, this is the first work applying IRL to reveal the implicit reward function in human risk-taking decision making and to interpret risk-prone and risk-averse decision-making policies. We hypothesize that the state history (e.g. rewards and decisions in previous trials) is related to the human reward function, which leads to risk-averse and risk-prone decisions. We design features that reflect these factors in the reward function of IRL and learn the corresponding weights, which are interpretable as the importance of the features. The results confirm that humans' sub-optimal risk-related decisions are driven by personalized reward functions. In particular, the risk-prone person tends to decide based on the current pump number, while the risk-averse person relies on burst information from the previous trial and the average end status. Our results demonstrate that IRL is an effective tool to model human decision-making behavior, as well as to help interpret the human psychological process in risk decision-making.
Tasks Decision Making
Published 2019-06-13
URL https://arxiv.org/abs/1906.05803v1
PDF https://arxiv.org/pdf/1906.05803v1.pdf
PWC https://paperswithcode.com/paper/modeling-and-interpreting-real-world-human
Repo
Framework
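The linear-reward setup, hand-designed features weighted by an interpretable weight vector, can be illustrated with a feature-expectation-matching update. This is a generic linear IRL sketch, not the authors' exact estimator:

```python
import numpy as np

def feature_matching_step(w, expert_feats, policy_feats, lr=0.1):
    """One update of linear-reward IRL by feature-expectation matching:
    increase the weight of features the human visits more often than the
    current policy does. Rows are visited feature vectors phi(s)."""
    return w + lr * (expert_feats.mean(axis=0) - policy_feats.mean(axis=0))
```

The learned `w` is what makes the reward interpretable: each entry says how much a feature (e.g. current pump number, previous-trial burst) matters.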

Public Sphere 2.0: Targeted Commenting in Online News Media

Title Public Sphere 2.0: Targeted Commenting in Online News Media
Authors Ankan Mullick, Sayan Ghosh, Ritam Dutt, Avijit Ghosh, Abhijnan Chakraborty
Abstract With the increase in online news consumption, news media websites try to attract and retain readers on their sites to maximize advertisement revenue. One of the most effective tools for reader engagement is commenting, where news readers post their views as comments against the news articles. Traditionally, it has been assumed that comments are mostly made against the full article. In this work, we show that the present commenting landscape is far from this assumption. Because readers lack the time to go over an entire article, most comments are relevant to only particular sections of an article. In this paper, we build a system which can automatically classify comments against the relevant sections of an article. To implement it, we develop a deep neural network-based mechanism to find comments relevant to any section, and a paragraph-wise commenting interface to showcase them. We believe that such a data-driven commenting system can help news websites further increase reader engagement.
Tasks
Published 2019-02-21
URL http://arxiv.org/abs/1902.07946v1
PDF http://arxiv.org/pdf/1902.07946v1.pdf
PWC https://paperswithcode.com/paper/public-sphere-20-targeted-commenting-in
Repo
Framework
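The core task, assigning a comment to its most relevant paragraph, can be sketched with a bag-of-words cosine similarity standing in for the paper's deep relevance model:

```python
from collections import Counter
import math

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def match_comment(comment, paragraphs):
    """Index of the paragraph most relevant to the comment; a bag-of-words
    stand-in for the deep neural relevance mechanism described above."""
    c = Counter(comment.lower().split())
    sims = [cosine(c, Counter(p.lower().split())) for p in paragraphs]
    return max(range(len(paragraphs)), key=sims.__getitem__)
```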

Lattice-based lightly-supervised acoustic model training

Title Lattice-based lightly-supervised acoustic model training
Authors Joachim Fainberg, Ondřej Klejch, Steve Renals, Peter Bell
Abstract In the broadcast domain there is an abundance of related text data and partial transcriptions, such as closed captions and subtitles. This text data can be used for lightly supervised training, in which text matching the audio is selected using an existing speech recognition model. Current approaches to light supervision typically filter the data based on matching error rates between the transcriptions and biased decoding hypotheses. In contrast, semi-supervised training does not require matching text data, instead generating a hypothesis using a background language model. State-of-the-art semi-supervised training uses lattice-based supervision with the lattice-free MMI (LF-MMI) objective function. We propose a technique to combine inaccurate transcriptions with the lattices generated for semi-supervised training, thus preserving uncertainty in the lattice where appropriate. We demonstrate that this combined approach reduces the expected error rates over the lattices, and reduces the word error rate (WER) on a broadcast task.
Tasks Language Modelling, Speech Recognition, Text Matching
Published 2019-05-30
URL https://arxiv.org/abs/1905.13150v2
PDF https://arxiv.org/pdf/1905.13150v2.pdf
PWC https://paperswithcode.com/paper/lattice-based-lightly-supervised-acoustic
Repo
Framework

Neural Machine Translation: A Review

Title Neural Machine Translation: A Review
Authors Felix Stahlberg
Abstract The field of machine translation (MT), the automatic translation of written text from one natural language into another, has experienced a major paradigm shift in recent years. Statistical MT, which mainly relies on various count-based models and which used to dominate MT research for decades, has largely been superseded by neural machine translation (NMT), which tackles translation with a single neural network. In this work we will trace back the origins of modern NMT architectures to word and sentence embeddings and earlier examples of the encoder-decoder network family. We will conclude with a survey of recent trends in the field.
Tasks Machine Translation, Sentence Embeddings
Published 2019-12-04
URL https://arxiv.org/abs/1912.02047v1
PDF https://arxiv.org/pdf/1912.02047v1.pdf
PWC https://paperswithcode.com/paper/neural-machine-translation-a-review
Repo
Framework

Dense 3D Face Decoding over 2500FPS: Joint Texture & Shape Convolutional Mesh Decoders

Title Dense 3D Face Decoding over 2500FPS: Joint Texture & Shape Convolutional Mesh Decoders
Authors Yuxiang Zhou, Jiankang Deng, Irene Kotsia, Stefanos Zafeiriou
Abstract 3D Morphable Models (3DMMs) are statistical models that represent facial texture and shape variations using a set of linear bases, in particular Principal Component Analysis (PCA). 3DMMs were used as statistical priors for reconstructing 3D faces from images by solving non-linear least-squares optimization problems. Recently, 3DMMs were used as generative models for training non-linear mappings (i.e., regressors) from image to the parameters of the models via Deep Convolutional Neural Networks (DCNNs). Nevertheless, all of the above methods use either fully connected layers or 2D convolutions on parametric unwrapped UV spaces, leading to large networks with many parameters. In this paper, we present the first, to the best of our knowledge, non-linear 3DMMs by learning joint texture and shape auto-encoders using direct mesh convolutions. We demonstrate how these auto-encoders can be used to train very lightweight models that perform Coloured Mesh Decoding (CMD) in-the-wild at a speed of over 2500 FPS.
Tasks
Published 2019-04-06
URL http://arxiv.org/abs/1904.03525v1
PDF http://arxiv.org/pdf/1904.03525v1.pdf
PWC https://paperswithcode.com/paper/dense-3d-face-decoding-over-2500fps-joint
Repo
Framework

Demystifying Multi-Faceted Video Summarization: Tradeoff Between Diversity, Representation, Coverage and Importance

Title Demystifying Multi-Faceted Video Summarization: Tradeoff Between Diversity, Representation, Coverage and Importance
Authors Vishal Kaushal, Rishabh Iyer, Khoshrav Doctor, Anurag Sahoo, Pratik Dubal, Suraj Kothawade, Rohan Mahadev, Kunal Dargan, Ganesh Ramakrishnan
Abstract This paper addresses automatic summarization of videos in a unified manner. In particular, we propose a framework for multi-faceted summarization covering extractive, query-based and entity summarization (summarization at the level of entities like objects, scenes, humans and faces in the video). We investigate several summarization models which capture notions of diversity, coverage, representation and importance, and argue the utility of these different models depending on the application. While most prior work on submodular summarization has focused on combining several models and learning weighted mixtures, we focus on the explainability of different models and featurizations, and how they apply to different domains. We also provide implementation details on summarization systems and the different modalities involved. We hope that this study will give practitioners insights into appropriately choosing the right summarization models for the problems at hand.
Tasks Video Summarization
Published 2019-01-03
URL http://arxiv.org/abs/1901.01153v1
PDF http://arxiv.org/pdf/1901.01153v1.pdf
PWC https://paperswithcode.com/paper/demystifying-multi-faceted-video
Repo
Framework
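The diversity, coverage, representation and importance models discussed above are submodular, and their standard optimiser is the greedy algorithm. A sketch for the facility-location ("representation") objective, with an illustrative frame-similarity matrix:

```python
import numpy as np

def greedy_facility_location(sim, k):
    """Greedily maximise f(S) = sum_i max_{j in S} sim[i, j], the
    facility-location ('representation') objective; for monotone submodular
    f the greedy solution is within 1 - 1/e of optimal."""
    n = sim.shape[0]
    selected, cover = [], np.zeros(n)
    for _ in range(k):
        # Marginal gain of adding each candidate j to the current summary.
        gains = np.maximum(sim, cover[:, None]).sum(axis=0) - cover.sum()
        gains[selected] = -np.inf
        j = int(np.argmax(gains))
        selected.append(j)
        cover = np.maximum(cover, sim[:, j])
    return selected
```

Swapping the objective (e.g. a coverage or diversity function) changes which frames the same greedy loop picks, which is exactly the tradeoff the paper studies.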

Learning More From Less: Towards Strengthening Weak Supervision for Ad-Hoc Retrieval

Title Learning More From Less: Towards Strengthening Weak Supervision for Ad-Hoc Retrieval
Authors Dany Haddad, Joydeep Ghosh
Abstract The limited availability of ground truth relevance labels has been a major impediment to the application of supervised methods to ad-hoc retrieval. As a result, unsupervised scoring methods, such as BM25, remain strong competitors to deep learning techniques which have brought on dramatic improvements in other domains, such as computer vision and natural language processing. Recent works have shown that it is possible to take advantage of the performance of these unsupervised methods to generate training data for learning-to-rank models. The key limitation to this line of work is the size of the training set required to surpass the performance of the original unsupervised method, which can be as large as $10^{13}$ training examples. Building on these insights, we propose two methods to reduce the amount of training data required. The first method takes inspiration from crowdsourcing, and leverages multiple unsupervised rankers to generate soft, or noise-aware, training labels. The second identifies harmful, or mislabeled, training examples and removes them from the training set. We show that our methods allow us to surpass the performance of the unsupervised baseline with far fewer training examples than previous works.
Tasks Learning-To-Rank
Published 2019-07-19
URL https://arxiv.org/abs/1907.08657v1
PDF https://arxiv.org/pdf/1907.08657v1.pdf
PWC https://paperswithcode.com/paper/learning-more-from-less-towards-strengthening
Repo
Framework
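The first proposed method, crowdsourcing-style soft labels from multiple unsupervised rankers, can be sketched as score normalisation followed by averaging. Averaging is one plausible aggregation, standing in for the paper's exact scheme:

```python
import numpy as np

def soft_labels(ranker_scores):
    """Noise-aware soft labels: min-max normalise each unsupervised ranker's
    scores for a query's documents, then average across rankers."""
    norm = []
    for s in ranker_scores:
        s = np.asarray(s, dtype=float)
        span = s.max() - s.min()
        norm.append((s - s.min()) / span if span else np.zeros_like(s))
    return np.mean(norm, axis=0)
```

Documents where the rankers disagree end up with intermediate labels, so a learning-to-rank model trained on them is penalised less for uncertain examples.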

Unbiased Learning to Rank: Counterfactual and Online Approaches

Title Unbiased Learning to Rank: Counterfactual and Online Approaches
Authors Harrie Oosterhuis, Rolf Jagerman, Maarten de Rijke
Abstract This tutorial covers and contrasts the two main methodologies in unbiased Learning to Rank (LTR): Counterfactual LTR and Online LTR. There has long been an interest in LTR from user interactions; however, this form of implicit feedback is very biased. In recent years, unbiased LTR methods have been introduced to remove the effect of different types of bias caused by user behavior in search. For instance, a well-addressed type of bias is position bias: the rank at which a document is displayed heavily affects the interactions it receives. Counterfactual LTR methods deal with such types of bias by learning from historical interactions while correcting for the effect of the explicitly modelled biases. Online LTR, in contrast, does not use an explicit user model; it learns through an interactive process where randomized results are displayed to the user. Through randomization, the effect of different types of bias can be removed from the learning process. Though both methodologies lead to unbiased LTR, their approaches differ considerably; furthermore, so do their theoretical guarantees, empirical results, effects on the user experience during learning, and applicability. Consequently, for practitioners the choice between the two is very substantial. By providing an overview of both approaches and contrasting them, we aim to provide an essential guide to unbiased LTR, so as to aid in understanding and choosing between the methodologies.
Tasks Learning-To-Rank
Published 2019-07-16
URL https://arxiv.org/abs/1907.07260v1
PDF https://arxiv.org/pdf/1907.07260v1.pdf
PWC https://paperswithcode.com/paper/unbiased-learning-to-rank-counterfactual-and
Repo
Framework
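The core of the counterfactual branch is inverse propensity scoring for position bias: each click is reweighted by the inverse of the probability that its rank was examined. A minimal sketch, where the examination propensities are assumed to be known:

```python
import numpy as np

def ips_click_estimate(clicks, ranks, propensity):
    """Counterfactual LTR in one line: reweight each click by the inverse of
    the probability the user examined that rank, which makes the click-based
    relevance estimate unbiased with respect to position bias in expectation.
    propensity[r] is the assumed examination probability of rank r."""
    clicks = np.asarray(clicks, dtype=float)
    return float(np.sum(clicks / propensity[np.asarray(ranks)]))
```

A click received at a rarely examined rank counts for more, compensating for all the impressions where the document was never seen.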

Generative adversarial network for segmentation of motion affected neonatal brain MRI

Title Generative adversarial network for segmentation of motion affected neonatal brain MRI
Authors N. Khalili, E. Turk, M. Zreik, M. A. Viergever, M. J. N. L. Benders, I. Isgum
Abstract Automatic neonatal brain tissue segmentation in preterm born infants is a prerequisite for the evaluation of brain development. However, automatic segmentation is often hampered by motion artifacts caused by infant head movements during image acquisition. Methods have been developed to remove or minimize these artifacts during image reconstruction using frequency domain data. However, frequency domain data might not always be available. Hence, in this study we propose a method for removing motion artifacts from already reconstructed MR scans. The method employs a generative adversarial network trained with a cycle consistency loss to transform slices affected by motion into slices without motion artifacts, and vice versa. In the experiments, 40 T2-weighted coronal MR scans of preterm born infants imaged at 30 weeks postmenstrual age were used. All images contained slices affected by motion artifacts hampering automatic tissue segmentation. To evaluate whether correction allows more accurate image segmentation, the images were segmented into 8 tissue classes: cerebellum, myelinated white matter, basal ganglia and thalami, ventricular cerebrospinal fluid, white matter, brain stem, cortical gray matter, and extracerebral cerebrospinal fluid. Images corrected for motion and the corresponding segmentations were qualitatively evaluated using a 5-point Likert scale. Before the correction of motion artifacts, the median image quality and the quality of the corresponding automatic segmentations were assigned grade 2 (poor) and 3 (moderate), respectively. After correction of motion artifacts, both improved to grades 3 and 4, respectively. The results indicate that correction of motion artifacts in the image space using the proposed approach allows accurate segmentation of brain tissue classes in slices affected by motion artifacts.
Tasks Image Reconstruction, Semantic Segmentation
Published 2019-06-11
URL https://arxiv.org/abs/1906.04704v1
PDF https://arxiv.org/pdf/1906.04704v1.pdf
PWC https://paperswithcode.com/paper/generative-adversarial-network-for-1
Repo
Framework
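The cycle consistency loss mentioned above, as in CycleGAN, asks that translating a motion-affected slice to a clean one and back should reconstruct the input, and vice versa. A sketch with stand-in generator functions (the networks and weight `lam` are assumptions, not the paper's configuration):

```python
import numpy as np

def cycle_consistency_loss(G, F, x, y, lam=10.0):
    """CycleGAN-style cycle loss: G maps motion-affected -> clean,
    F maps clean -> motion-affected. Both round trips should be identities.
    G and F here are stand-ins for the two generator networks."""
    loss_x = np.abs(F(G(x)) - x).mean()   # motion -> clean -> motion
    loss_y = np.abs(G(F(y)) - y).mean()   # clean -> motion -> clean
    return lam * (loss_x + loss_y)
```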