April 2, 2020

Paper Group ANR 117

Challenges in Supporting Exploratory Search through Voice Assistants. Interpretability of Blackbox Machine Learning Models through Dataview Extraction and Shadow Model creation. Unsupervised Pre-trained, Texture Aware And Lightweight Model for Deep Learning-Based Iris Recognition Under Limited Annotated Data. Deep Learning for Automatic Tracking of …

Challenges in Supporting Exploratory Search through Voice Assistants

Title Challenges in Supporting Exploratory Search through Voice Assistants
Authors Xiao Ma, Ariel Liu
Abstract Voice assistants have been successfully adopted for simple, routine tasks, such as asking for the weather or setting an alarm. However, as people get more familiar with voice assistants, they may increase their expectations for more complex tasks, such as exploratory search, e.g., “What should I do when I visit Paris with kids? Oh, and ideally not too expensive.” Compared to simple search tasks such as “How tall is the Eiffel Tower?”, which can be answered with a single-shot answer, the response to exploratory search is more nuanced, especially through voice-based assistants. In this paper, we outline four challenges in designing voice assistants that can better support exploratory search: addressing situationally induced impairments; working with mixed-modal interactions; designing for diverse populations; and meeting users’ expectations and gaining their trust. Addressing these challenges is important for developing more “intelligent” voice-based personal assistants.
Published 2020-03-06
URL https://arxiv.org/abs/2003.02986v1
PDF https://arxiv.org/pdf/2003.02986v1.pdf
PWC https://paperswithcode.com/paper/challenges-in-supporting-exploratory-search

Interpretability of Blackbox Machine Learning Models through Dataview Extraction and Shadow Model creation

Title Interpretability of Blackbox Machine Learning Models through Dataview Extraction and Shadow Model creation
Authors Rupam Patir, Shubham Singhal, C. Anantaram, Vikram Goyal
Abstract Deep learning models trained using massive amounts of data tend to capture one view of the data and its associated mapping. Different deep learning models built on the same training data may capture different views of the data based on the underlying techniques used. For explaining the decisions arrived at by blackbox deep learning models, we argue that it is essential to reproduce that model’s view of the training data faithfully. This faithful reproduction can then be used for explanation generation. We investigate two methods for data view extraction: a hill-climbing approach and a GAN-driven approach. We then use this synthesized data for creating shadow models for explanation generation: a Decision-Tree model and a Formal Concept Analysis based model. We evaluate these approaches on a Blackbox model trained on public datasets and show their usefulness in explanation generation.
Published 2020-02-02
URL https://arxiv.org/abs/2002.00372v1
PDF https://arxiv.org/pdf/2002.00372v1.pdf
PWC https://paperswithcode.com/paper/interpretability-of-blackbox-machine-learning

Unsupervised Pre-trained, Texture Aware And Lightweight Model for Deep Learning-Based Iris Recognition Under Limited Annotated Data

Title Unsupervised Pre-trained, Texture Aware And Lightweight Model for Deep Learning-Based Iris Recognition Under Limited Annotated Data
Authors Manashi Chakraborty, Mayukh Roy, Prabir Kumar Biswas, Pabitra Mitra
Abstract In this paper, we present a texture aware lightweight deep learning framework for iris recognition. Our contributions are primarily threefold. Firstly, to address the dearth of labelled iris data, we propose a reconstruction loss guided unsupervised pre-training stage followed by supervised refinement. This drives the network weights to focus on discriminative iris texture patterns. Next, we propose several texture aware improvisations inside a Convolutional Neural Network to better leverage iris textures. Finally, we show that our systematic training and architectural choices enable us to design an efficient framework with up to 100X fewer parameters than contemporary deep learning baselines yet achieve better recognition performance for within and cross dataset evaluations.
Tasks Iris Recognition
Published 2020-02-20
URL https://arxiv.org/abs/2002.09048v1
PDF https://arxiv.org/pdf/2002.09048v1.pdf
PWC https://paperswithcode.com/paper/unsupervised-pre-trained-texture-aware-and

Deep Learning for Automatic Tracking of Tongue Surface in Real-time Ultrasound Videos, Landmarks instead of Contours

Title Deep Learning for Automatic Tracking of Tongue Surface in Real-time Ultrasound Videos, Landmarks instead of Contours
Authors M. Hamed Mozaffari, Won-Sook Lee
Abstract One use of medical ultrasound imaging is to visualize and characterize human tongue shape and motion during real-time speech in order to study healthy or impaired speech production. Due to the low-contrast characteristic and noisy nature of ultrasound images, non-expert users may find it difficult to recognize tongue gestures in applications such as visual training of a second language. Moreover, quantitative analysis of tongue motion needs the tongue dorsum contour to be extracted, tracked, and visualized. Manual tongue contour extraction is a cumbersome, subjective, and error-prone task. Furthermore, it is not a feasible solution for real-time applications. The growth of deep learning has been vigorously exploited in various computer vision tasks, including ultrasound tongue contour tracking. In the current methods, the process of tongue contour extraction comprises two steps: image segmentation and post-processing. This paper presents a novel approach to automatic and real-time tongue contour tracking using deep neural networks. In the proposed method, instead of the two-step procedure, landmarks of the tongue surface are tracked. This novel idea enables researchers in this field to benefit from available previously annotated databases to achieve high accuracy results. Our experiment disclosed the outstanding performance of the proposed technique in terms of generalization, performance, and accuracy.
Tasks Semantic Segmentation
Published 2020-03-16
URL https://arxiv.org/abs/2003.08808v1
PDF https://arxiv.org/pdf/2003.08808v1.pdf
PWC https://paperswithcode.com/paper/deep-learning-for-automatic-tracking-of

MBGD-RDA Training and Rule Pruning for Concise TSK Fuzzy Regression Models

Title MBGD-RDA Training and Rule Pruning for Concise TSK Fuzzy Regression Models
Authors Dongrui Wu
Abstract To effectively train Takagi-Sugeno-Kang (TSK) fuzzy systems for regression problems, a Mini-Batch Gradient Descent with Regularization, DropRule, and AdaBound (MBGD-RDA) algorithm was recently proposed. It has demonstrated superior performance; however, there are also some limitations, e.g., it does not allow the user to specify the number of rules directly, and only Gaussian MFs can be used. This paper proposes two variants of MBGD-RDA to remedy these limitations, and shows that they outperform the original MBGD-RDA and the classical ANFIS algorithms with the same number of rules. Furthermore, we also propose a rule pruning algorithm for TSK fuzzy systems, which can reduce the number of rules without significantly sacrificing the regression performance. Experiments showed that the rules obtained from pruning are generally better than training them from scratch directly, especially when Gaussian MFs are used.
Published 2020-03-01
URL https://arxiv.org/abs/2003.00608v2
PDF https://arxiv.org/pdf/2003.00608v2.pdf
PWC https://paperswithcode.com/paper/mbgd-rda-training-and-similarity-based-rule
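
For readers unfamiliar with TSK fuzzy systems, the following is a minimal sketch of the forward pass of a generic first-order TSK regressor with Gaussian membership functions (MFs). It is not the MBGD-RDA training algorithm or the pruning method from the paper. The function names, the rule dictionary layout, and the product t-norm used to combine memberships are assumptions made for illustration.

```python
import math

def gaussian_mf(x, center, sigma):
    """Gaussian membership value of scalar x for one fuzzy set."""
    return math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))

def tsk_predict(x, rules):
    """Output of a first-order TSK system for one input vector x.

    Each rule (hypothetical layout) holds per-feature Gaussian MF
    parameters and a linear consequent: bias + weights . x
    """
    weighted_sum = 0.0
    firing_total = 0.0
    for rule in rules:
        # Firing level: product of the per-feature memberships.
        firing = 1.0
        for xi, c, s in zip(x, rule["centers"], rule["sigmas"]):
            firing *= gaussian_mf(xi, c, s)
        consequent = rule["bias"] + sum(w * xi for w, xi in zip(rule["weights"], x))
        weighted_sum += firing * consequent
        firing_total += firing
    # Firing-level-weighted average of the rule consequents.
    # Note: this divides by zero if every rule's firing level underflows.
    return weighted_sum / firing_total

# Two rules on one feature, centered at -1 and +1, with constant consequents.
example_rules = [
    {"centers": [-1.0], "sigmas": [1.0], "bias": 0.0, "weights": [0.0]},
    {"centers": [1.0], "sigmas": [1.0], "bias": 10.0, "weights": [0.0]},
]
print(tsk_predict([0.0], example_rules))  # midway between the rules
```

In this picture, pruning a rule simply removes one entry from the rule list, which is why the number of rules is a natural measure of model size.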

Dimension Independent Generalization Error with Regularized Online Optimization

Title Dimension Independent Generalization Error with Regularized Online Optimization
Authors Xi Chen, Qiang Liu, Xin T. Tong
Abstract One classical canon of statistics is that large models are prone to overfitting and model selection procedures are necessary for high-dimensional data. However, many overparameterized models such as neural networks, which are often trained with simple online methods and regularization, perform very well in practice. The empirical success of overparameterized models, which is often known as benign overfitting, motivates us to have a new look at the statistical generalization theory for online optimization. In particular, we present a general theory on the generalization error of stochastic gradient descent (SGD) for both convex and non-convex loss functions. We further provide the definition of “low effective dimension” so that the generalization error either does not depend on the ambient dimension $p$ or depends on $p$ via a poly-logarithmic factor. We also demonstrate on several widely used statistical models that the “low effective dimension” arises naturally in overparameterized settings. The studied statistical applications include both convex models such as linear regression and logistic regression, and non-convex models such as $M$-estimator and two-layer neural networks.
Tasks Model Selection
Published 2020-03-25
URL https://arxiv.org/abs/2003.11196v1
PDF https://arxiv.org/pdf/2003.11196v1.pdf
PWC https://paperswithcode.com/paper/dimension-independent-generalization-error

A study on the role of subsidiary information in replay attack spoofing detection

Title A study on the role of subsidiary information in replay attack spoofing detection
Authors Jee-weon Jung, Hye-jin Shim, Hee-Soo Heo, Ha-Jin Yu
Abstract In this study, we analyze the role of various categories of subsidiary information in conducting replay attack spoofing detection: ‘Room Size’, ‘Reverberation’, ‘Speaker-to-ASV distance’, ‘Attacker-to-Speaker distance’, and ‘Replay Device Quality’. As a means of analyzing subsidiary information, we use two frameworks to either subtract a category of subsidiary information from, or include it in, the code extracted from a deep neural network. For subtraction, we utilize an adversarial process framework which makes the code orthogonal to the basis vectors of the subsidiary information. For addition, we utilize the multi-task learning framework to include subsidiary information in the code. All experiments are conducted using the ASVspoof 2019 physical access scenario with the provided meta data. Through the analysis of the results of the two approaches, we conclude that various categories of subsidiary information do not sufficiently reside in the code when the deep neural network is trained for binary classification. Explicitly including various categories of subsidiary information through the multi-task learning framework can help improve performance in the closed set condition.
Tasks Multi-Task Learning
Published 2020-01-31
URL https://arxiv.org/abs/2001.11688v1
PDF https://arxiv.org/pdf/2001.11688v1.pdf
PWC https://paperswithcode.com/paper/a-study-on-the-role-of-subsidiary-information

Complete Hierarchy of Relaxation for Constrained Signomial Positivity

Title Complete Hierarchy of Relaxation for Constrained Signomial Positivity
Authors Allen Houze Wang, Priyank Jaini, Yaoliang Yu, Pascal Poupart
Abstract In this article, we prove that the Sums-of-AM/GM Exponential (SAGE) relaxation generalized to signomials over a constrained set is complete, with a compactness assumption. The high-level structure of the proof is as follows. We first apply a change of variables to convert a set of rational exponents to polynomial equations. In addition, we make the observation that linear constraints of the variables may also be converted to polynomial equations after the change of variables. Note that any convex set may be expressed as a set of linear constraints. Further, we use redundant constraints to find a reduction to the Positivstellensatz. We rely on Positivstellensatz results from algebraic geometry to obtain a decomposition of positive polynomials. Lastly, we explicitly show that the decomposition is of a form certifiable by SAGE.
Published 2020-03-08
URL https://arxiv.org/abs/2003.03731v1
PDF https://arxiv.org/pdf/2003.03731v1.pdf
PWC https://paperswithcode.com/paper/complete-hierarchy-of-relaxation-for

Multi Type Mean Field Reinforcement Learning

Title Multi Type Mean Field Reinforcement Learning
Authors Sriram Ganapathi Subramanian, Pascal Poupart, Matthew E. Taylor, Nidhi Hegde
Abstract Mean field theory provides an effective way of scaling multiagent reinforcement learning algorithms to environments with many agents that can be abstracted by a virtual mean agent. In this paper, we extend mean field multiagent algorithms to multiple types. The types enable the relaxation of a core assumption in mean field games, which is that all agents in the environment are playing almost similar strategies and have the same goal. We conduct experiments on three different testbeds for the field of many-agent reinforcement learning, based on the standard MAgent framework. We consider two different kinds of mean field games: a) Games where agents belong to predefined types that are known a priori and b) Games where the type of each agent is unknown and therefore must be learned based on observations. We introduce new algorithms for each type of game and demonstrate their superior performance over state-of-the-art algorithms that assume that all agents belong to the same type and other baseline algorithms in the MAgent framework.
Published 2020-02-06
URL https://arxiv.org/abs/2002.02513v3
PDF https://arxiv.org/pdf/2002.02513v3.pdf
PWC https://paperswithcode.com/paper/multi-type-mean-field-reinforcement-learning
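
One way to picture the "virtual mean agent" with multiple types is to summarize an agent's neighbors by one mean action per type instead of a single overall mean. The sketch below illustrates only that summary statistic, under the assumption that actions are discrete and one-hot encoded before averaging. The function name and data layout are hypothetical, and this is not the learning algorithm from the paper.

```python
from collections import defaultdict

def mean_actions_by_type(neighbor_actions, neighbor_types, num_actions):
    """Average the one-hot encoded actions of neighbors, separately per type.

    neighbor_actions: list of integer action indices in [0, num_actions)
    neighbor_types:   list of hashable type labels, same length
    Returns a dict mapping type label -> empirical action distribution.
    """
    sums = defaultdict(lambda: [0.0] * num_actions)
    counts = defaultdict(int)
    for action, agent_type in zip(neighbor_actions, neighbor_types):
        sums[agent_type][action] += 1.0
        counts[agent_type] += 1
    return {t: [s / counts[t] for s in sums[t]] for t in sums}

# Four neighbors of two types choosing among three actions.
summary = mean_actions_by_type([0, 1, 1, 2], ["a", "a", "b", "b"], 3)
print(summary)
```

With a single type, the dictionary collapses to one entry, which corresponds to the usual single mean action.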

Phase-based Information for Voice Pathology Detection

Title Phase-based Information for Voice Pathology Detection
Authors Thomas Drugman, Thomas Dubuisson, Thierry Dutoit
Abstract In most current approaches to speech processing, information is extracted from the magnitude spectrum. However, recent perceptual studies have underlined the importance of the phase component. The goal of this paper is to investigate the potential of using phase-based features for automatically detecting voice disorders. It is shown that group delay functions are appropriate for characterizing irregularities in the phonation. In addition, adherence to the mixed-phase model of speech is discussed. The proposed phase-based features are evaluated and compared to other parameters derived from the magnitude spectrum. Both streams are shown to be interestingly complementary. Furthermore, phase-based features turn out to convey a great amount of relevant information, leading to high discrimination performance.
Published 2020-01-02
URL https://arxiv.org/abs/2001.00372v1
PDF https://arxiv.org/pdf/2001.00372v1.pdf
PWC https://paperswithcode.com/paper/phase-based-information-for-voice-pathology
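
As background on the group delay function mentioned in the abstract: it is the negative derivative of the phase spectrum with respect to frequency, and it can be computed without phase unwrapping as the real part of Y(k)/X(k), where X is the DFT of the signal x[n] and Y is the DFT of n·x[n]. The sketch below uses a plain O(N²) DFT for self-containment. The function names are illustrative, and the features actually used in the paper may be computed differently.

```python
import cmath

def dft(x):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def group_delay(x, eps=1e-12):
    """Group delay (in samples) at each DFT bin: Re(Y[k] / X[k]).

    Bins where |X[k]| is numerically zero are returned as NaN because
    the phase, and hence the group delay, is undefined there.
    """
    X = dft(x)
    Y = dft([t * v for t, v in enumerate(x)])
    return [
        (Y[k] / X[k]).real if abs(X[k]) > eps else float("nan")
        for k in range(len(x))
    ]

# A unit impulse delayed by 3 samples has a constant group delay of 3.
impulse = [0.0] * 8
impulse[3] = 1.0
print(group_delay(impulse))
```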

Faster Projection-free Online Learning

Title Faster Projection-free Online Learning
Authors Elad Hazan, Edgar Minasyan
Abstract In many online learning problems the computational bottleneck for gradient-based methods is the projection operation. For this reason, in many problems the most efficient algorithms are based on the Frank-Wolfe method, which replaces projections by linear optimization. In the general case, however, online projection-free methods require more iterations than projection-based methods: the best known regret bound scales as $T^{3/4}$. Despite significant work on various variants of the Frank-Wolfe method, this bound has remained unchanged for a decade. In this paper we give an efficient projection-free algorithm that guarantees $T^{2/3}$ regret for general online convex optimization with smooth cost functions and one linear optimization computation per iteration. As opposed to previous Frank-Wolfe approaches, our algorithm is derived using the Follow-the-Perturbed-Leader method and is analyzed using an online primal-dual framework.
Published 2020-01-30
URL https://arxiv.org/abs/2001.11568v2
PDF https://arxiv.org/pdf/2001.11568v2.pdf
PWC https://paperswithcode.com/paper/faster-projection-free-online-learning
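
To illustrate what "replacing projections by linear optimization" means, here is a minimal sketch of the classical offline Frank-Wolfe method over the probability simplex. It is not the online algorithm from the paper, which the abstract says is derived from Follow-the-Perturbed-Leader. The step size 2/(t+2), the function names, and the quadratic test objective are standard textbook choices assumed here for illustration.

```python
def linear_oracle_simplex(grad):
    """Return the simplex vertex v minimizing the inner product <grad, v>."""
    best = min(range(len(grad)), key=lambda i: grad[i])
    return [1.0 if i == best else 0.0 for i in range(len(grad))]

def frank_wolfe(grad_fn, x0, steps):
    """Minimize a smooth convex function over the simplex without projections.

    Each iterate is a convex combination of feasible points, so it stays
    feasible and only one linear optimization is needed per step.
    """
    x = list(x0)
    for t in range(steps):
        v = linear_oracle_simplex(grad_fn(x))
        gamma = 2.0 / (t + 2)  # standard diminishing step size
        x = [(1.0 - gamma) * xi + gamma * vi for xi, vi in zip(x, v)]
    return x

# Example objective: squared distance to a point inside the simplex.
target = [0.2, 0.5, 0.3]

def grad_sq_dist(x):
    return [2.0 * (xi - ti) for xi, ti in zip(x, target)]

solution = frank_wolfe(grad_sq_dist, [1.0, 0.0, 0.0], 2000)
print(solution)
```

On the simplex the linear oracle is a single argmin over coordinates, whereas a Euclidean projection requires sorting. That gap grows on harder feasible sets, which is the computational bottleneck the abstract refers to.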

Competence Assessment as an Expert System for Human Resource Management: A Mathematical Approach

Title Competence Assessment as an Expert System for Human Resource Management: A Mathematical Approach
Authors Mahdi Bohlouli, Nikolaos Mittas, George Kakarontzas, Theodosios Theodosiou, Lefteris Angelis, Madjid Fathi
Abstract Efficient human resource management needs accurate assessment and representation of available competences as well as effective mapping of required competences for specific jobs and positions. In this regard, appropriate definition and identification of competence gaps express differences between acquired and required competences. Using a detailed quantification scheme together with a mathematical approach is a way to support accurate competence analytics, which can be applied in a wide variety of sectors and fields. This article describes the combined use of software technologies and mathematical and statistical methods for assessing and analyzing competences in human resource information systems. Based on a standard competence model, which is called a Professional, Innovative and Social competence tree, the proposed framework offers flexible tools to experts in real enterprise environments, either for evaluation of employees towards an optimal job assignment and vocational training or for recruitment processes. The system has been tested with real human resource data sets in the frame of the European project called ComProFITS.
Published 2020-01-16
URL https://arxiv.org/abs/2001.09797v1
PDF https://arxiv.org/pdf/2001.09797v1.pdf
PWC https://paperswithcode.com/paper/competence-assessment-as-an-expert-system-for

Joint Face Completion and Super-resolution using Multi-scale Feature Relation Learning

Title Joint Face Completion and Super-resolution using Multi-scale Feature Relation Learning
Authors Zhilei Liu, Yunpeng Wu, Le Li, Cuicui Zhang, Baoyuan Wu
Abstract Previous research on face restoration often focused on repairing a specific type of low-quality facial images such as low-resolution (LR) or occluded facial images. However, in the real world, both the above-mentioned forms of image degradation often coexist. Therefore, it is important to design a model that can repair LR occluded images simultaneously. This paper proposes a multi-scale feature graph generative adversarial network (MFG-GAN) to implement the face restoration of images in which both degradation modes coexist, and also to repair images with a single type of degradation. Based on the GAN, the MFG-GAN integrates the graph convolution and feature pyramid network to restore occluded low-resolution face images to non-occluded high-resolution face images. The MFG-GAN uses a set of customized losses to ensure that high-quality images are generated. In addition, we designed the network in an end-to-end format. Experimental results on the public-domain CelebA and Helen databases show that the proposed approach outperforms state-of-the-art methods in performing face super-resolution (up to 4x or 8x) and face completion simultaneously. Cross-database testing also revealed that the proposed approach has good generalizability.
Tasks Facial Inpainting, Super-Resolution
Published 2020-02-29
URL https://arxiv.org/abs/2003.00255v1
PDF https://arxiv.org/pdf/2003.00255v1.pdf
PWC https://paperswithcode.com/paper/joint-face-completion-and-super-resolution

Dialogue history integration into end-to-end signal-to-concept spoken language understanding systems

Title Dialogue history integration into end-to-end signal-to-concept spoken language understanding systems
Authors Natalia Tomashenko, Christian Raymond, Antoine Caubriere, Renato De Mori, Yannick Esteve
Abstract This work investigates the embeddings for representing dialog history in spoken language understanding (SLU) systems. We focus on the scenario when the semantic information is extracted directly from the speech signal by means of a single end-to-end neural network model. We propose to integrate dialogue history into an end-to-end signal-to-concept SLU system. The dialog history is represented in the form of dialog history embedding vectors (so-called h-vectors) and is provided as additional information to end-to-end SLU models in order to improve the system performance. The following three types of h-vectors are proposed and experimentally evaluated in this paper: (1) supervised-all embeddings predicting bag-of-concepts expected in the answer of the user from the last dialog system response; (2) supervised-freq embeddings focusing on predicting only a selected set of semantic concepts (corresponding to the most frequent errors in our experiments); and (3) unsupervised embeddings. Experiments on the MEDIA corpus for the semantic slot filling task demonstrate that the proposed h-vectors improve the model performance.
Tasks Slot Filling, Spoken Language Understanding
Published 2020-02-14
URL https://arxiv.org/abs/2002.06012v1
PDF https://arxiv.org/pdf/2002.06012v1.pdf
PWC https://paperswithcode.com/paper/dialogue-history-integration-into-end-to-end

A Probabilistic Simulator of Spatial Demand for Product Allocation

Title A Probabilistic Simulator of Spatial Demand for Product Allocation
Authors Porter Jenkins, Hua Wei, J. Stockton Jenkins, Zhenhui Li
Abstract Connecting consumers with relevant products is a very important problem in both online and offline commerce. In physical retail, product placement is an effective way to connect consumers with products. However, selecting product locations within a store can be a tedious process. Moreover, learning important spatial patterns in offline retail is challenging due to the scarcity of data and the high cost of exploration and experimentation in the physical world. To address these challenges, we propose a stochastic model of spatial demand in physical retail. We show that the proposed model is more predictive of demand than existing baselines. We also perform a preliminary study into different automation techniques and show that an optimal product allocation policy can be learned through Deep Q-Learning.
Tasks Q-Learning
Published 2020-01-09
URL https://arxiv.org/abs/2001.03210v1
PDF https://arxiv.org/pdf/2001.03210v1.pdf
PWC https://paperswithcode.com/paper/a-probabilistic-simulator-of-spatial-demand