Paper Group ANR 960
Glottal Source Processing: from Analysis to Applications
Title | Glottal Source Processing: from Analysis to Applications |
Authors | Thomas Drugman, Paavo Alku, Abeer Alwan, Bayya Yegnanarayana |
Abstract | The great majority of current voice technology applications rely on acoustic features characterizing the vocal tract response, such as the widely used MFCC or LPC parameters. Nonetheless, the airflow passing through the vocal folds, called the glottal flow, is expected to carry relevant complementary information. Unfortunately, glottal analysis from speech recordings requires specific and more complex processing operations, which explains why it has generally been avoided. This review gives a general overview of techniques that have been designed for glottal source processing. Starting from the fundamental analysis tools of pitch tracking, glottal closure instant detection, and glottal flow estimation and modelling, this paper then highlights how these solutions can be properly integrated within various voice technology applications. |
Tasks | |
Published | 2019-12-29 |
URL | https://arxiv.org/abs/1912.12604v1 |
https://arxiv.org/pdf/1912.12604v1.pdf | |
PWC | https://paperswithcode.com/paper/glottal-source-processing-from-analysis-to |
Repo | |
Framework | |
Attention Interpretability Across NLP Tasks
Title | Attention Interpretability Across NLP Tasks |
Authors | Shikhar Vashishth, Shyam Upadhyay, Gaurav Singh Tomar, Manaal Faruqui |
Abstract | The attention layer in a neural network model provides insights into the reasoning behind the model’s predictions, which are usually criticized for being opaque. Recently, seemingly contradictory viewpoints have emerged about the interpretability of attention weights (Jain & Wallace, 2019; Vig & Belinkov, 2019). Amid such confusion arises the need to understand the attention mechanism more systematically. In this work, we attempt to fill this gap by giving a comprehensive explanation which justifies both kinds of observations (i.e., when attention is interpretable and when it is not). Through a series of experiments on diverse NLP tasks, we validate our observations and reinforce our claim of interpretability of attention through manual evaluation. |
Tasks | |
Published | 2019-09-24 |
URL | https://arxiv.org/abs/1909.11218v1 |
https://arxiv.org/pdf/1909.11218v1.pdf | |
PWC | https://paperswithcode.com/paper/attention-interpretability-across-nlp-tasks |
Repo | |
Framework | |
NODE: Extreme Low Light Raw Image Denoising using a Noise Decomposition Network
Title | NODE: Extreme Low Light Raw Image Denoising using a Noise Decomposition Network |
Authors | Hao Guan, Liu Liu, Sean Moran, Fenglong Song, Gregory Slabaugh |
Abstract | Denoising extreme low light images is a challenging task due to the high noise level. When the illumination is low, digital cameras increase the ISO (electronic gain) to amplify the brightness of captured data. However, this in turn amplifies the noise arising from read, shot, and defective pixel sources. In the raw domain, read and shot noise are effectively modeled using Gaussian and Poisson distributions respectively, whereas defective pixels can be modeled with impulsive noise. In extreme low light imaging, noise removal becomes a critical challenge to produce a high quality, detailed image with low noise. In this paper, we propose a multi-task deep neural network called Noise Decomposition (NODE) that explicitly and separately estimates defective pixel noise, in conjunction with Gaussian and Poisson noise, to denoise an extreme low light image. Our network is purposely designed to work with raw data, for which the noise is more easily modeled before going through non-linear transformations in the image signal processing (ISP) pipeline. Quantitative and qualitative evaluations show the proposed method to be more effective at denoising real raw images than state-of-the-art techniques. |
Tasks | Denoising, Image Denoising |
Published | 2019-09-11 |
URL | https://arxiv.org/abs/1909.05249v1 |
https://arxiv.org/pdf/1909.05249v1.pdf | |
PWC | https://paperswithcode.com/paper/node-extreme-low-light-raw-image-denoising |
Repo | |
Framework | |
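The noise model described in the NODE abstract (Poisson shot noise, Gaussian read noise, impulsive defective pixels) can be illustrated with a small simulation. This is only a sketch of the noise sources, not the NODE network itself; the function name, the photon scale, and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np

def simulate_raw_noise(clean, gain=4.0, read_sigma=0.02, defect_prob=0.001, seed=0):
    """Add shot (Poisson), read (Gaussian) and defective-pixel (impulse) noise.

    `clean` is a float array with values in [0, 1]. All defaults are
    illustrative assumptions, not values taken from the paper.
    """
    rng = np.random.default_rng(seed)
    # Hypothetical scale: a higher gain stands in for fewer collected photons.
    photons_per_unit = 100.0 / gain
    # Shot noise: photon counts follow a Poisson distribution.
    shot = rng.poisson(clean * photons_per_unit) / photons_per_unit
    # Read noise: additive zero-mean Gaussian.
    read = rng.normal(0.0, read_sigma, size=clean.shape)
    noisy = shot + read
    # Defective pixels: impulsive noise, stuck at either 0 or 1.
    defect_mask = rng.random(clean.shape) < defect_prob
    noisy[defect_mask] = rng.choice([0.0, 1.0], size=int(defect_mask.sum()))
    return np.clip(noisy, 0.0, 1.0), defect_mask

clean = np.full((64, 64), 0.1)  # a dim, flat test image
noisy, mask = simulate_raw_noise(clean)
```

Returning the defect mask alongside the noisy image mirrors the idea of treating defective pixels as a separately estimated component.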
Video Summarization via Actionness Ranking
Title | Video Summarization via Actionness Ranking |
Authors | Mohamed Elfeki, Ali Borji |
Abstract | To automatically produce a brief yet expressive summary of a long video, an automatic algorithm should start by resembling the human process of summary generation. Prior work proposed supervised and unsupervised algorithms to train models for learning the underlying behavior of humans by increasing modeling complexity or craft-designing better heuristics to simulate the human summary generation process. In this work, we take a different approach by analyzing a major cue that humans exploit for summary generation: the nature and intensity of actions. We empirically observed that a frame is more likely to be included in human-generated summaries if it contains a substantial amount of deliberate motion performed by an agent, which is referred to as actionness. Therefore, we hypothesize that learning to automatically generate summaries involves an implicit knowledge of actionness estimation and ranking. We validate our hypothesis by running a user study that explores the correlation between human-generated summaries and actionness ranks. We also run a consensus and behavioral analysis between human subjects to ensure reliable and consistent results. The analysis exhibits a considerable degree of agreement among subjects within the obtained data and verifies our initial hypothesis. Based on the study findings, we develop a method to incorporate actionness data to explicitly regulate a learning algorithm that is trained for summary generation. We assess the performance of our approach on four summarization benchmark datasets and demonstrate an evident advantage compared to state-of-the-art summarization methods. |
Tasks | Video Summarization |
Published | 2019-03-01 |
URL | http://arxiv.org/abs/1903.00110v1 |
http://arxiv.org/pdf/1903.00110v1.pdf | |
PWC | https://paperswithcode.com/paper/video-summarization-via-actionness-ranking |
Repo | |
Framework | |
Widely Linear Kernels for Complex-Valued Kernel Activation Functions
Title | Widely Linear Kernels for Complex-Valued Kernel Activation Functions |
Authors | Simone Scardapane, Steven Van Vaerenbergh, Danilo Comminiello, Aurelio Uncini |
Abstract | Complex-valued neural networks (CVNNs) have been shown to be powerful nonlinear approximators when the input data can be properly modeled in the complex domain. One of the major challenges in scaling up CVNNs in practice is the design of complex activation functions. Recently, we proposed a novel framework for learning these activation functions neuron-wise in a data-dependent fashion, based on a cheap one-dimensional kernel expansion and the idea of kernel activation functions (KAFs). In this paper we argue that, despite its flexibility, this framework is still limited in the class of functions that can be modeled in the complex domain. We leverage the idea of widely linear complex kernels to extend the formulation, allowing for a richer expressiveness without an increase in the number of adaptable parameters. We test the resulting model on a set of complex-valued image classification benchmarks. Experimental results show that the resulting CVNNs can achieve higher accuracy while at the same time converging faster. |
Tasks | Image Classification |
Published | 2019-02-06 |
URL | http://arxiv.org/abs/1902.02085v1 |
http://arxiv.org/pdf/1902.02085v1.pdf | |
PWC | https://paperswithcode.com/paper/widely-linear-kernels-for-complex-valued |
Repo | |
Framework | |
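The one-dimensional kernel expansion mentioned in the abstract can be sketched for the simpler real-valued case. The snippet assumes the usual KAF form, an activation written as a weighted sum of Gaussian kernels over a fixed dictionary; it does not implement the complex-valued or widely linear extension proposed in the paper, and the names `kaf`, `alpha`, `dictionary`, and `gamma` are illustrative.

```python
import numpy as np

def kaf(s, alpha, dictionary, gamma=1.0):
    """Real-valued kernel activation: f(s) = sum_i alpha[i] * exp(-gamma * (s - d_i)^2).

    `dictionary` holds fixed points d_i; `alpha` holds the mixing
    coefficients that would be learned per neuron.
    """
    s = np.asarray(s, dtype=float)
    # Broadcast each activation value against every dictionary element.
    kernel = np.exp(-gamma * (s[..., None] - dictionary) ** 2)
    return kernel @ alpha

dictionary = np.linspace(-2.0, 2.0, 5)       # fixed grid around zero
alpha = np.array([0.0, 0.0, 1.0, 0.0, 0.0])  # only the element at d = 0 is active
values = kaf(np.array([-1.0, 0.0, 1.0]), alpha, dictionary)
```

Because only `alpha` is adapted while the dictionary stays fixed, the number of trainable parameters per neuron equals the dictionary size.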
Improving Human Text Comprehension through Semi-Markov CRF-based Neural Section Title Generation
Title | Improving Human Text Comprehension through Semi-Markov CRF-based Neural Section Title Generation |
Authors | Sebastian Gehrmann, Steven Layne, Franck Dernoncourt |
Abstract | Titles of short sections within long documents support readers by guiding their focus towards relevant passages and by providing anchor-points that help to understand the progression of the document. The positive effects of section titles are even more pronounced when measured on readers with less developed reading abilities, for example in communities with limited labeled text resources. We, therefore, aim to develop techniques to generate section titles in low-resource environments. In particular, we present an extractive pipeline for section title generation by first selecting the most salient sentence and then applying deletion-based compression. Our compression approach is based on a Semi-Markov Conditional Random Field that leverages unsupervised word-representations such as ELMo or BERT, eliminating the need for a complex encoder-decoder architecture. The results show that this approach leads to competitive performance with sequence-to-sequence models with high resources, while strongly outperforming them with low resources. In a human-subject study across subjects with varying reading abilities, we find that our section titles improve the speed of completing comprehension tasks while retaining similar accuracy. |
Tasks | Reading Comprehension |
Published | 2019-04-15 |
URL | http://arxiv.org/abs/1904.07142v1 |
http://arxiv.org/pdf/1904.07142v1.pdf | |
PWC | https://paperswithcode.com/paper/improving-human-text-comprehension-through |
Repo | |
Framework | |
Estimating Certain Integral Probability Metric (IPM) is as Hard as Estimating under the IPM
Title | Estimating Certain Integral Probability Metric (IPM) is as Hard as Estimating under the IPM |
Authors | Tengyuan Liang |
Abstract | We study the minimax optimal rates for estimating a range of Integral Probability Metrics (IPMs) between two unknown probability measures, based on $n$ independent samples from them. Curiously, we show that estimating the IPM itself between probability measures is not significantly easier than estimating the probability measures under the IPM. We prove that the minimax optimal rates for these two problems are multiplicatively equivalent, up to a $\log \log (n)/\log (n)$ factor. |
Tasks | |
Published | 2019-11-02 |
URL | https://arxiv.org/abs/1911.00730v1 |
https://arxiv.org/pdf/1911.00730v1.pdf | |
PWC | https://paperswithcode.com/paper/estimating-certain-integral-probability |
Repo | |
Framework | |
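For readers unfamiliar with the term, the standard definition of an integral probability metric is given below. The symbols $\mathcal{F}$ (a class of test functions), $\mu$ and $\nu$ (the two probability measures), and the estimators $\widehat{T}_n$ and $\widehat{\mu}_n$ are notation chosen here for illustration; the abstract does not fix any notation, so the way the two problems are written out is a reading of the abstract rather than the paper's own formulation.

```latex
% Integral probability metric indexed by a function class F
d_{\mathcal{F}}(\mu, \nu)
  = \sup_{f \in \mathcal{F}}
    \left| \int f \, d\mu - \int f \, d\nu \right|

% Problem 1: estimate the scalar d_F(mu, nu) from n samples of each measure
\inf_{\widehat{T}_n} \sup_{\mu, \nu}
  \mathbb{E} \left| \widehat{T}_n - d_{\mathcal{F}}(\mu, \nu) \right|

% Problem 2: estimate the measure mu itself, with loss measured by d_F
\inf_{\widehat{\mu}_n} \sup_{\mu}
  \mathbb{E} \, d_{\mathcal{F}}(\widehat{\mu}_n, \mu)
```

The abstract's claim is that the optimal rates of the first and second problems agree up to the stated $\log \log (n)/\log (n)$ factor.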
2019 Evolutionary Algorithms Review
Title | 2019 Evolutionary Algorithms Review |
Authors | Andrew N. Sloss, Steven Gustafson |
Abstract | Evolutionary algorithm research and applications began over 50 years ago. Like other artificial intelligence techniques, evolutionary algorithms will likely see increased use and development due to the increased availability of computation, more robust and available open source software libraries, and the increasing demand for artificial intelligence techniques. As these techniques become more adopted and capable, it is the right time to take a perspective of their ability to integrate into society and the human processes they intend to augment. In this review, we explore a new taxonomy of evolutionary algorithms and resulting classifications that look at five main areas: the ability to manage the control of the environment with limiters, the ability to explain and repeat the search process, the ability to understand input and output causality within a solution, the ability to manage algorithm bias due to data or user design, and lastly, the ability to add corrective measures. These areas are motivated by today’s pressures on industry to conform to both society’s concerns and new government regulatory rules. As many reviews of evolutionary algorithms exist, after motivating this new taxonomy, we briefly classify a broad range of algorithms and identify areas of future research. |
Tasks | |
Published | 2019-06-03 |
URL | https://arxiv.org/abs/1906.08870v1 |
https://arxiv.org/pdf/1906.08870v1.pdf | |
PWC | https://paperswithcode.com/paper/2019-evolutionary-algorithms-review |
Repo | |
Framework | |
Calibrating Wayfinding Decisions in Pedestrian Simulation Models: The Entropy Map
Title | Calibrating Wayfinding Decisions in Pedestrian Simulation Models: The Entropy Map |
Authors | Luca Crociani, Giuseppe Vizzari, Stefania Bandini |
Abstract | This paper presents entropy maps, an approach to describing and visualising uncertainty among alternative potential movement intentions in pedestrian simulation models. In particular, entropy maps use a heatmap approach to show the instantaneous level of randomness in the decisions of a pedestrian agent situated at a specific point of the simulated environment. Experimental results highlighting the relevance of this tool in supporting modelers are provided and discussed. |
Tasks | |
Published | 2019-09-06 |
URL | https://arxiv.org/abs/1909.03054v1 |
https://arxiv.org/pdf/1909.03054v1.pdf | |
PWC | https://paperswithcode.com/paper/calibrating-wayfinding-decisions-in |
Repo | |
Framework | |
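One plausible reading of an entropy map is the Shannon entropy of an agent's choice distribution, computed per cell of the environment. The sketch below follows that reading; the abstract does not state the exact formula, so the use of Shannon entropy in bits, the function names, and the toy grid are all assumptions.

```python
import math

def decision_entropy(probabilities):
    """Shannon entropy (in bits) of one agent's choice distribution."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

def entropy_map(choice_probabilities):
    """Turn a grid of per-cell choice distributions into a grid of entropies."""
    return [[decision_entropy(cell) for cell in row] for row in choice_probabilities]

# Hypothetical 2x2 environment with two alternative targets per cell.
grid = [
    [[1.0, 0.0], [0.5, 0.5]],
    [[0.9, 0.1], [0.25, 0.75]],
]
heat = entropy_map(grid)
```

Cells where one target dominates map to values near zero, while cells with evenly split choices map to the maximum, which is what a heatmap rendering would highlight.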
Spatial-Temporal Self-Attention Network for Flow Prediction
Title | Spatial-Temporal Self-Attention Network for Flow Prediction |
Authors | Haoxing Lin, Weijia Jia, Yiping Sun, Yongjian You |
Abstract | Flow prediction (e.g., crowd flow, traffic flow) with spatial-temporal features is increasingly investigated in the AI research field. It is very challenging due to the complicated spatial dependencies between different locations and dynamic temporal dependencies among different time intervals. Although measurements of both dependencies are employed, existing methods suffer from the following two problems. First, the temporal dependencies are measured either uniformly or with a bias against long-term dependencies, which overlooks the distinctive impacts of short-term and long-term temporal dependencies. Second, the existing methods capture spatial and temporal dependencies independently, which wrongly assumes that the correlations between these dependencies are weak and ignores the complicated mutual influences between them. To address these issues, we propose a Spatial-Temporal Self-Attention Network (ST-SAN). As the path length for attending to long-term dependencies is shorter in the self-attention mechanism, the vanishing of long-term temporal dependencies is prevented. In addition, since our model relies solely on attention mechanisms, the spatial and temporal dependencies can be measured simultaneously. Experimental results on real-world data demonstrate that, in comparison with state-of-the-art methods, our model reduces the root mean square error by 9% in inflow prediction and 4% in outflow prediction on Taxi-NYC data, which is very significant compared to the previous improvement. |
Tasks | |
Published | 2019-12-13 |
URL | https://arxiv.org/abs/1912.07663v2 |
https://arxiv.org/pdf/1912.07663v2.pdf | |
PWC | https://paperswithcode.com/paper/spatial-temporal-self-attention-network-for |
Repo | |
Framework | |
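The short path length the abstract attributes to self-attention comes from every position attending to every other position in a single step. A generic scaled dot-product self-attention layer, sketched below, shows this; it is not the ST-SAN architecture, and the shapes, weight matrices, and the interpretation of rows as time intervals are illustrative assumptions.

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over the rows of `x`."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    # Every row scores every other row directly, whatever their distance.
    scores = q @ k.T / np.sqrt(k.shape[-1])
    # Numerically stable softmax over each row of scores.
    scores -= scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

rng = np.random.default_rng(0)
x = rng.normal(size=(6, 8))  # e.g. 6 time intervals with 8 features each
w_q, w_k, w_v = (rng.normal(size=(8, 4)) for _ in range(3))
out, attn = self_attention(x, w_q, w_k, w_v)
```

Row `i` of `attn` shows how much interval `i` draws on each other interval, including distant ones, without passing through intermediate steps.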
Automatic Playtesting for Game Parameter Tuning via Active Learning
Title | Automatic Playtesting for Game Parameter Tuning via Active Learning |
Authors | Alexander Zook, Eric Fruchter, Mark O. Riedl |
Abstract | Game designers use human playtesting to gather feedback about game design elements when iteratively improving a game. Playtesting, however, is expensive: human testers must be recruited, playtest results must be aggregated and interpreted, and changes to game designs must be extrapolated from these results. Can automated methods reduce this expense? We show how active learning techniques can formalize and automate a subset of playtesting goals. Specifically, we focus on the low-level parameter tuning required to balance a game once the mechanics have been chosen. Through a case study on a shoot-'em-up game we demonstrate the efficacy of active learning to reduce the amount of playtesting needed to choose the optimal set of game parameters for two classes of (formal) design objectives. This work opens the potential for additional methods to reduce the human burden of performing playtesting for a variety of relevant design concerns. |
Tasks | Active Learning |
Published | 2019-08-04 |
URL | https://arxiv.org/abs/1908.01417v1 |
https://arxiv.org/pdf/1908.01417v1.pdf | |
PWC | https://paperswithcode.com/paper/automatic-playtesting-for-game-parameter |
Repo | |
Framework | |
What makes a good conversation? How controllable attributes affect human judgments
Title | What makes a good conversation? How controllable attributes affect human judgments |
Authors | Abigail See, Stephen Roller, Douwe Kiela, Jason Weston |
Abstract | A good conversation requires balance – between simplicity and detail; staying on topic and changing it; asking questions and answering them. Although dialogue agents are commonly evaluated via human judgments of overall quality, the relationship between quality and these individual factors is less well-studied. In this work, we examine two controllable neural text generation methods, conditional training and weighted decoding, in order to control four important attributes for chitchat dialogue: repetition, specificity, response-relatedness and question-asking. We conduct a large-scale human evaluation to measure the effect of these control parameters on multi-turn interactive conversations on the PersonaChat task. We provide a detailed analysis of their relationship to high-level aspects of conversation, and show that by controlling combinations of these variables our models obtain clear improvements in human quality judgments. |
Tasks | Text Generation |
Published | 2019-02-22 |
URL | http://arxiv.org/abs/1902.08654v2 |
http://arxiv.org/pdf/1902.08654v2.pdf | |
PWC | https://paperswithcode.com/paper/what-makes-a-good-conversation-how |
Repo | |
Framework | |
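Weighted decoding, one of the two control methods named in the abstract, can be sketched as adding weighted feature scores to the model's log-probabilities at each decoding step. The snippet assumes that additive form; the vocabulary, the probabilities, and the `is_question_word` feature are invented for illustration and are not taken from the paper.

```python
import math

def weighted_decoding_step(log_probs, features, weights):
    """Pick the next word by log-probability plus weighted feature bonuses."""
    scores = {}
    for word, log_prob in log_probs.items():
        bonus = sum(weights[name] * feature(word) for name, feature in features.items())
        scores[word] = log_prob + bonus
    return max(scores, key=scores.get)

# Hypothetical next-word distribution from a dialogue model.
log_probs = {"yes": math.log(0.5), "what": math.log(0.3), "pizza": math.log(0.2)}
# Hypothetical feature that encourages question-asking.
features = {"is_question_word": lambda w: 1.0 if w in {"what", "why", "how"} else 0.0}

plain = weighted_decoding_step(log_probs, features, {"is_question_word": 0.0})
steered = weighted_decoding_step(log_probs, features, {"is_question_word": 2.0})
```

With a zero weight the most probable word wins; raising the weight shifts the choice toward words that fire the feature, which is how a single scalar can act as a control knob at inference time.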
Graph Input Representations for Machine Learning Applications in Urban Network Analysis
Title | Graph Input Representations for Machine Learning Applications in Urban Network Analysis |
Authors | Alessio Pagani, Abhinav Mehrotra, Mirco Musolesi |
Abstract | Understanding and learning the characteristics of network paths has been of particular interest for decades and has led to several successful applications. Such analysis becomes challenging for urban networks as their size and complexity are significantly higher compared to other networks. State-of-the-art machine learning (ML) techniques allow us to detect hidden patterns and, thus, infer the features associated with them. However, very little is known about how the use of different input representations affects the performance of such predictive models. In this paper, we design and evaluate six different graph input representations (i.e., representations of the network paths), considering the network’s topological and temporal characteristics, for use as inputs to machine learning models that learn the behavior of urban network paths. The representations are validated and then tested with a real-world taxi journeys dataset, predicting tips using a road network of New York. Our results demonstrate that the input representations that use temporal information help the model to achieve the highest accuracy (RMSE of $1.42). |
Tasks | |
Published | 2019-12-11 |
URL | https://arxiv.org/abs/1912.07662v1 |
https://arxiv.org/pdf/1912.07662v1.pdf | |
PWC | https://paperswithcode.com/paper/graph-input-representations-for-machine |
Repo | |
Framework | |
Continual Learning for Infinite Hierarchical Change-Point Detection
Title | Continual Learning for Infinite Hierarchical Change-Point Detection |
Authors | Pablo Moreno-Muñoz, David Ramírez, Antonio Artés-Rodríguez |
Abstract | Change-point detection (CPD) aims to locate abrupt transitions in the generative model of a sequence of observations. When Bayesian methods are considered, the standard practice is to infer the posterior distribution of the change-point locations. However, for complex models (high-dimensional or heterogeneous), it is not possible to perform reliable detection. To circumvent this problem, we propose to use a hierarchical model, which yields observations that belong to a lower-dimensional manifold. Concretely, we consider a latent-class model with an unbounded number of categories, which is based on the Chinese restaurant process (CRP). For this model we derive a continual learning mechanism that is based on the sequential construction of the CRP and the expectation-maximization (EM) algorithm with a stochastic maximization step. Our results show that the proposed method is able to recursively infer the number of underlying latent classes and perform CPD in a reliable manner. |
Tasks | Change Point Detection, Continual Learning |
Published | 2019-10-22 |
URL | https://arxiv.org/abs/1910.10087v1 |
https://arxiv.org/pdf/1910.10087v1.pdf | |
PWC | https://paperswithcode.com/paper/continual-learning-for-infinite-hierarchical |
Repo | |
Framework | |
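The sequential construction of the Chinese restaurant process that the abstract builds on can be sketched as follows: each new observation joins an existing class with probability proportional to that class's size, or opens a new class with probability proportional to a concentration parameter. The snippet samples from the CRP prior only; it does not implement the paper's EM-based inference or change-point detection, and `alpha` and the function name are illustrative.

```python
import random

def sample_crp(n_customers, alpha, seed=0):
    """Draw class assignments from a Chinese restaurant process prior."""
    rng = random.Random(seed)
    assignments = []
    table_sizes = []
    for _ in range(n_customers):
        # Existing tables are weighted by size; the last slot is a new table.
        weights = table_sizes + [alpha]
        table = rng.choices(range(len(weights)), weights=weights)[0]
        if table == len(table_sizes):
            table_sizes.append(1)  # open a new class
        else:
            table_sizes[table] += 1
        assignments.append(table)
    return assignments, table_sizes

assignments, table_sizes = sample_crp(n_customers=50, alpha=1.5)
```

Because a new table can always be opened, the number of classes is not fixed in advance and grows with the data, which is the property that supports an unbounded number of latent categories.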
Privacy-Preserving Public Release of Datasets for Support Vector Machine Classification
Title | Privacy-Preserving Public Release of Datasets for Support Vector Machine Classification |
Authors | Farhad Farokhi |
Abstract | We consider the problem of publicly releasing a dataset for support vector machine classification while not infringing on the privacy of data subjects (i.e., individuals whose private information is stored in the dataset). The dataset is systematically obfuscated using additive noise for privacy protection. Motivated by the Cramer-Rao bound, the inverse of the trace of the Fisher information matrix is used as a measure of privacy. Conditions are established for ensuring that the classifiers extracted from the original dataset and the obfuscated one are close to each other (capturing the utility). The optimal noise distribution is determined by maximizing a weighted sum of the measures of privacy and utility. The optimal privacy-preserving noise is proved to achieve local differential privacy. The results are generalized to a broader class of optimization-based supervised machine learning algorithms. Applicability of the methodology is demonstrated on multiple datasets. |
Tasks | |
Published | 2019-12-29 |
URL | https://arxiv.org/abs/1912.12576v1 |
https://arxiv.org/pdf/1912.12576v1.pdf | |
PWC | https://paperswithcode.com/paper/privacy-preserving-public-release-of-datasets |
Repo | |
Framework | |