January 29, 2020

Paper Group ANR 608

Snore-GANs: Improving Automatic Snore Sound Classification with Synthesized Data. Learning Where to See: A Novel Attention Model for Automated Immunohistochemical Scoring. RobustTrend: A Huber Loss with a Combined First and Second Order Difference Regularization for Time Series Trend Filtering. Model Slicing for Supporting Complex Analytics with El …

Snore-GANs: Improving Automatic Snore Sound Classification with Synthesized Data


Title	Snore-GANs: Improving Automatic Snore Sound Classification with Synthesized Data
Authors	Zixing Zhang, Jing Han, Kun Qian, Christoph Janott, Yanan Guo, Bjoern Schuller
Abstract	One of the frontier issues that severely hamper the development of automatic snore sound classification (ASSC) is the lack of sufficient supervised training data. To cope with this problem, we propose a novel data augmentation approach based on semi-supervised conditional Generative Adversarial Networks (scGANs), which aims to automatically learn a mapping strategy from a random noise space to the original data distribution. The proposed approach is capable of synthesizing ‘realistic’ high-dimensional data while requiring no additional annotation process. To handle the mode collapse problem of GANs, we further introduce an ensemble strategy to enhance the diversity of the generated data. The systematic experiments conducted on the widely used Munich-Passau snore sound corpus demonstrate that the scGANs-based systems can remarkably outperform other classic data augmentation systems, and are also competitive with other recently reported systems for ASSC.
Tasks	Data Augmentation
Published	2019-03-29
URL	http://arxiv.org/abs/1903.12422v1
PDF	http://arxiv.org/pdf/1903.12422v1.pdf
PWC	https://paperswithcode.com/paper/snore-gans-improving-automatic-snore-sound
Repo
Framework

Learning Where to See: A Novel Attention Model for Automated Immunohistochemical Scoring


Title	Learning Where to See: A Novel Attention Model for Automated Immunohistochemical Scoring
Authors	Talha Qaiser, Nasir M. Rajpoot
Abstract	Estimating over-amplification of human epidermal growth factor receptor 2 (HER2) on invasive breast cancer (BC) is regarded as a significant predictive and prognostic marker. We propose a novel deep reinforcement learning (DRL) based model that treats immunohistochemical (IHC) scoring of HER2 as a sequential learning task. For a given image tile sampled from a multi-resolution giga-pixel whole slide image (WSI), the model learns to sequentially identify some of the diagnostically relevant regions of interest (ROIs) by following a parameterized policy. The selected ROIs are processed by recurrent and residual convolution networks to learn the discriminative features for different HER2 scores and predict the next location, without needing to process all the sub-image patches of a given tile to predict the HER2 score. This mimics the histopathologist, who would not usually analyze every part of the slide at the highest magnification. The proposed model incorporates a task-specific regularization term and an inhibition-of-return mechanism to prevent the model from revisiting previously attended locations. We evaluated our model on two IHC datasets: a publicly available dataset from the HER2 scoring challenge contest and another dataset consisting of WSIs of gastroenteropancreatic neuroendocrine tumor sections stained with Glo1 marker. We demonstrate that the proposed model outperforms other methods based on state-of-the-art deep convolutional networks. To the best of our knowledge, this is the first study using DRL for IHC scoring, and it could potentially lead to wider use of DRL in the domain of computational pathology, reducing the computational burden of analyzing large multigigapixel histology images.
Tasks
Published	2019-03-26
URL	http://arxiv.org/abs/1903.10762v1
PDF	http://arxiv.org/pdf/1903.10762v1.pdf
PWC	https://paperswithcode.com/paper/learning-where-to-see-a-novel-attention-model
Repo
Framework

RobustTrend: A Huber Loss with a Combined First and Second Order Difference Regularization for Time Series Trend Filtering


Title	RobustTrend: A Huber Loss with a Combined First and Second Order Difference Regularization for Time Series Trend Filtering
Authors	Qingsong Wen, Jingkun Gao, Xiaomin Song, Liang Sun, Jian Tan
Abstract	Extracting the underlying trend signal is a crucial step to facilitate time series analysis like forecasting and anomaly detection. Besides noise, time series in real-world scenarios can contain not only outliers but also abrupt trend changes. To deal with these challenges, we propose a robust trend filtering algorithm based on robust statistics and sparse learning. Specifically, we adopt the Huber loss to suppress outliers, and utilize a combination of the first order and second order difference on the trend component as regularization to capture both slow and abrupt trend changes. Furthermore, an efficient method is designed to solve the proposed robust trend filtering based on majorization minimization (MM) and the alternating direction method of multipliers (ADMM). We compared our proposed robust trend filter with nine other state-of-the-art trend filtering algorithms on both synthetic and real-world datasets. The experiments demonstrate that our algorithm outperforms existing methods.
Tasks	Anomaly Detection, Sparse Learning, Time Series, Time Series Analysis
Published	2019-06-10
URL	https://arxiv.org/abs/1906.03751v2
PDF	https://arxiv.org/pdf/1906.03751v2.pdf
PWC	https://paperswithcode.com/paper/robusttrend-a-huber-loss-with-a-combined
Repo
Framework
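
The abstract above describes a Huber data-fit term combined with first and second order difference penalties on the trend. The following is a minimal sketch of what such an objective might look like, not the authors' formulation or their MM/ADMM solver. The L1 form of the penalties, the weights `lam1` and `lam2`, and the Huber threshold `delta` are all assumptions made for illustration; the abstract does not give these details.

```python
def huber(residual, delta=1.0):
    """Huber loss: quadratic near zero, linear in the tails (limits outlier influence)."""
    a = abs(residual)
    if a <= delta:
        return 0.5 * residual * residual
    return delta * (a - 0.5 * delta)


def diff(values):
    """First-order difference of a sequence."""
    return [b - a for a, b in zip(values, values[1:])]


def robust_trend_objective(y, trend, lam1=1.0, lam2=1.0, delta=1.0):
    """Evaluate a Huber fit term plus L1 penalties on first and second differences.

    lam1, lam2 and delta are hypothetical hyperparameters chosen for this sketch.
    """
    fit = sum(huber(yi - ti, delta) for yi, ti in zip(y, trend))
    d1 = diff(trend)   # penalizing this favors piecewise-constant trends (abrupt level shifts)
    d2 = diff(d1)      # penalizing this favors piecewise-linear trends (slow changes)
    return fit + lam1 * sum(abs(v) for v in d1) + lam2 * sum(abs(v) for v in d2)
```

A solver would minimize this value over `trend` for a fixed series `y`; the function here only evaluates the objective for a candidate trend.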

Model Slicing for Supporting Complex Analytics with Elastic Inference Cost and Resource Constraints


Title	Model Slicing for Supporting Complex Analytics with Elastic Inference Cost and Resource Constraints
Authors	Shaofeng Cai, Gang Chen, Beng Chin Ooi, Jinyang Gao
Abstract	Deep learning models have been used to support analytics beyond simple aggregation, where deeper and wider models have been shown to yield great results. These models consume a huge amount of memory and computational operations. However, most large-scale industrial applications are constrained by their computational budget. In practice, the peak workload of an inference service could be 10x higher than the average case, with the presence of unpredictable extreme cases. Lots of computational resources could be wasted during off-peak hours, and the system may crash when the workload exceeds system capacity. How to support deep learning services with dynamic workload cost-efficiently remains a challenging problem. In this paper, we address the challenge with a general and novel training scheme called model slicing, which enables deep learning models to provide predictions within the prescribed computational resource budget dynamically. Model slicing could be viewed as an elastic computation solution without requiring more computational resources. Succinctly, each layer in the model is divided into groups of contiguous blocks of basic components (i.e., neurons in dense layers and channels in convolutional layers), and then a partially ordered relation is introduced to these groups by enforcing that the groups participating in each forward pass always run from the first group to the dynamically determined rightmost group. Trained by dynamically indexing the rightmost group with a single parameter, the slice rate, the network is driven to build up group-wise and residual representations. Then, during inference, a sub-model with fewer groups, whose computation is roughly quadratic in the width controlled by the slice rate, can be readily deployed for efficiency. Extensive experiments show that models trained with model slicing can effectively support on-demand workload with elastic inference cost.
Tasks	Model Compression
Published	2019-04-03
URL	https://arxiv.org/abs/1904.01831v2
PDF	https://arxiv.org/pdf/1904.01831v2.pdf
PWC	https://paperswithcode.com/paper/model-slicing-for-supporting-complex
Repo
Framework
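
The abstract above says each layer is split into contiguous groups and a forward pass uses only the groups from the first up to a chosen rightmost one. The sketch below illustrates that idea for a single dense layer; it is an illustration under assumptions, not the authors' implementation. The function names, the integer `active_groups` parameter (standing in for the slice rate as `active_groups / num_groups`), and the requirement that the width divides evenly into groups are all hypothetical.

```python
def active_width(full_width, num_groups, active_groups):
    """Number of leading units kept when only the first `active_groups` groups are used.

    Assumes full_width is divisible by num_groups (an assumption of this sketch).
    """
    group_size = full_width // num_groups
    return group_size * active_groups


def sliced_dense(x, weights, bias, in_width, out_width):
    """Dense layer restricted to the first `in_width` inputs and `out_width` outputs.

    weights[o][i] is the weight from input i to output o.
    """
    return [
        sum(weights[o][i] * x[i] for i in range(in_width)) + bias[o]
        for o in range(out_width)
    ]


def mac_count(in_width, out_width):
    """Multiply-accumulate operations for one sliced dense layer."""
    return in_width * out_width
```

When both the input and output widths are halved, `mac_count` drops to a quarter, which is one way to read the abstract's statement that computation is roughly quadratic in the width.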

Divide-and-Conquer Adversarial Learning for High-Resolution Image and Video Enhancement


Title	Divide-and-Conquer Adversarial Learning for High-Resolution Image and Video Enhancement
Authors	Zhiwu Huang, Danda Pani Paudel, Guanju Li, Jiqing Wu, Radu Timofte, Luc Van Gool
Abstract	This paper introduces a divide-and-conquer inspired adversarial learning (DACAL) approach for photo enhancement. The key idea is to decompose the photo enhancement process into multiple hierarchical sub-problems, which can be better conquered from the bottom up. On the top level, we propose a perception-based division to learn the additive and multiplicative components required to translate a low-quality image or video into its high-quality counterpart. On the intermediate level, we use a frequency-based division with a generative adversarial network (GAN) to weakly supervise the photo enhancement process. On the lower level, we design a dimension-based division that enables the GAN model to better approximate the distribution distance on multiple independent one-dimensional data to train the GAN model. While considering all three hierarchies, we develop multiscale and recurrent training approaches to optimize the image and video enhancement process in a weakly-supervised manner. Both quantitative and qualitative results clearly demonstrate that the proposed DACAL achieves state-of-the-art performance for high-resolution image and video enhancement.
Tasks
Published	2019-10-23
URL	https://arxiv.org/abs/1910.10455v1
PDF	https://arxiv.org/pdf/1910.10455v1.pdf
PWC	https://paperswithcode.com/paper/divide-and-conquer-adversarial-learning-for-1
Repo
Framework

AppStreamer: Reducing Storage Requirements of Mobile Games through Predictive Streaming


Title	AppStreamer: Reducing Storage Requirements of Mobile Games through Predictive Streaming
Authors	Nawanol Theera-Ampornpunt, Shikhar Suryavansh, Sameer Manchanda, Rajesh Panta, Kaustubh Joshi, Mostafa Ammar, Mung Chiang, Saurabh Bagchi
Abstract	Storage has become a constrained resource on smartphones. Gaming is a popular activity on mobile devices and the explosive growth in the number of games coupled with their growing size contributes to the storage crunch. Even where storage is plentiful, it takes a long time to download and install a heavy app before it can be launched. This paper presents AppStreamer, a novel technique for reducing the storage requirements or startup delay of mobile games, and heavy mobile apps in general. AppStreamer is based on the intuition that most apps do not need the entirety of their files (images, audio and video clips, etc.) at any one time. AppStreamer can, therefore, keep only a small part of the files on the device, akin to a “cache”, and download the remainder from a cloud storage server or a nearby edge server when it predicts that the app will need them in the near future. AppStreamer continuously predicts file blocks for the near future as the user uses the app, and fetches them from the storage server before the user sees a stall due to missing resources. We implement AppStreamer at the Android file system layer. This ensures that the apps require no source code or modification, and the approach generalizes across apps. We evaluate AppStreamer using two popular games: Dead Effect 2, a 3D first-person shooter, and Fire Emblem Heroes, a 2D turn-based strategy role-playing game. Through a user study, 75% and 87% of the users respectively find that AppStreamer provides the same quality of user experience as the baseline where all files are stored on the device. AppStreamer cuts down the storage requirement by 87% for Dead Effect 2 and 86% for Fire Emblem Heroes.
Tasks
Published	2019-12-16
URL	https://arxiv.org/abs/2001.08169v1
PDF	https://arxiv.org/pdf/2001.08169v1.pdf
PWC	https://paperswithcode.com/paper/appstreamer-reducing-storage-requirements-of
Repo
Framework

The Eighth Dialog System Technology Challenge


Title	The Eighth Dialog System Technology Challenge
Authors	Seokhwan Kim, Michel Galley, Chulaka Gunasekara, Sungjin Lee, Adam Atkinson, Baolin Peng, Hannes Schulz, Jianfeng Gao, Jinchao Li, Mahmoud Adada, Minlie Huang, Luis Lastras, Jonathan K. Kummerfeld, Walter S. Lasecki, Chiori Hori, Anoop Cherian, Tim K. Marks, Abhinav Rastogi, Xiaoxue Zang, Srinivas Sunkara, Raghav Gupta
Abstract	This paper introduces the Eighth Dialog System Technology Challenge. In line with recent challenges, the eighth edition focuses on applying end-to-end dialog technologies in a pragmatic way for multi-domain task-completion, noetic response selection, audio visual scene-aware dialog, and schema-guided dialog state tracking tasks. This paper describes the task definition, provided datasets, and evaluation set-up for each track. We also summarize the results of the submitted systems to highlight the overall trends of the state-of-the-art technologies for the tasks.
Tasks
Published	2019-11-14
URL	https://arxiv.org/abs/1911.06394v1
PDF	https://arxiv.org/pdf/1911.06394v1.pdf
PWC	https://paperswithcode.com/paper/the-eighth-dialog-system-technology-challenge
Repo
Framework

Analyzing Knowledge Graph Embedding Methods from a Multi-Embedding Interaction Perspective


Title	Analyzing Knowledge Graph Embedding Methods from a Multi-Embedding Interaction Perspective
Authors	Hung Nghiep Tran, Atsuhiro Takasu
Abstract	Knowledge graphs are a popular format for representing knowledge, with many applications to semantic search engines, question-answering systems, and recommender systems. Real-world knowledge graphs are usually incomplete, so knowledge graph embedding methods, such as Canonical decomposition/Parallel factorization (CP), DistMult, and ComplEx, have been proposed to address this issue. These methods represent entities and relations as embedding vectors in semantic space and predict the links between them. The embedding vectors themselves contain rich semantic information and can be used in other applications such as data analysis. However, mechanisms in these models and the embedding vectors themselves vary greatly, making it difficult to understand and compare them. Given this lack of understanding, we risk using them ineffectively or incorrectly, particularly for complicated models, such as CP, with two role-based embedding vectors, or the state-of-the-art ComplEx model, with complex-valued embedding vectors. In this paper, we propose a multi-embedding interaction mechanism as a new approach to uniting and generalizing these models. We derive them theoretically via this mechanism and provide empirical analyses and comparisons between them. We also propose a new multi-embedding model based on quaternion algebra and show that it achieves promising results using popular benchmarks.
Tasks	Graph Embedding, Knowledge Graph Embedding, Knowledge Graphs, Question Answering, Recommendation Systems
Published	2019-03-27
URL	https://arxiv.org/abs/1903.11406v2
PDF	https://arxiv.org/pdf/1903.11406v2.pdf
PWC	https://paperswithcode.com/paper/analyzing-knowledge-graph-embedding-methods
Repo
Framework

Leveraging Hierarchical Representations for Preserving Privacy and Utility in Text


Title	Leveraging Hierarchical Representations for Preserving Privacy and Utility in Text
Authors	Oluwaseyi Feyisetan, Tom Diethe, Thomas Drake
Abstract	Guaranteeing a certain level of user privacy in an arbitrary piece of text is a challenging issue. However, with this challenge comes the potential of unlocking access to vast data stores for training machine learning models and supporting data-driven decisions. We address this problem through the lens of dx-privacy, a generalization of Differential Privacy to non-Hamming distance metrics. In this work, we explore word representations in Hyperbolic space as a means of preserving privacy in text. We provide a proof satisfying dx-privacy, then we define a probability distribution in Hyperbolic space and describe a way to sample from it in high dimensions. Privacy is provided by perturbing vector representations of words in high-dimensional Hyperbolic space to obtain a semantic generalization. We conduct a series of experiments to demonstrate the tradeoff between privacy and utility. Our privacy experiments illustrate protections against an authorship attribution algorithm, while our utility experiments highlight the minimal impact of our perturbations on several downstream machine learning models. Compared to the Euclidean baseline, we observe > 20x greater guarantees on expected privacy against comparable worst case statistics.
Tasks
Published	2019-10-20
URL	https://arxiv.org/abs/1910.08917v1
PDF	https://arxiv.org/pdf/1910.08917v1.pdf
PWC	https://paperswithcode.com/paper/leveraging-hierarchical-representations-for
Repo
Framework

SpecAugment on Large Scale Datasets


Title	SpecAugment on Large Scale Datasets
Authors	Daniel S. Park, Yu Zhang, Chung-Cheng Chiu, Youzheng Chen, Bo Li, William Chan, Quoc V. Le, Yonghui Wu
Abstract	Recently, SpecAugment, an augmentation scheme for automatic speech recognition that acts directly on the spectrogram of input utterances, has been shown to be highly effective in enhancing the performance of end-to-end networks on public datasets. In this paper, we demonstrate its effectiveness on tasks with large scale datasets by investigating its application to the Google Multidomain Dataset (Narayanan et al., 2018). We achieve improvement across all test domains by mixing raw training data augmented with SpecAugment and noise-perturbed training data when training the acoustic model. We also introduce a modification of SpecAugment that adapts the time mask size and/or multiplicity depending on the length of the utterance, which can potentially benefit large scale tasks. By using adaptive masking, we are able to further improve the performance of the Listen, Attend and Spell model on LibriSpeech to 2.2% WER on test-clean and 5.2% WER on test-other.
Tasks	Speech Recognition
Published	2019-12-11
URL	https://arxiv.org/abs/1912.05533v1
PDF	https://arxiv.org/pdf/1912.05533v1.pdf
PWC	https://paperswithcode.com/paper/specaugment-on-large-scale-datasets
Repo
Framework
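
The abstract above mentions adapting the time mask size and/or multiplicity to the length of the utterance. The sketch below shows one way such length-dependent time masking could be written; it is an illustration under assumptions, not the authors' code. The function name, the ratios `mult_ratio` and `size_ratio`, the cap `max_masks`, and zero-filling of masked frames are all hypothetical choices not stated in the abstract.

```python
import random


def adaptive_time_mask(spectrogram, mult_ratio=0.04, size_ratio=0.04,
                       max_masks=20, rng=None):
    """Mask contiguous runs of time frames, scaling mask count and size with length.

    spectrogram: list of frames, each frame a list of feature values.
    mult_ratio, size_ratio, max_masks: hypothetical settings for this sketch.
    Returns a masked copy; the input is left unchanged.
    """
    rng = rng or random.Random()
    num_frames = len(spectrogram)
    # Longer utterances get more masks (up to a cap) and wider masks.
    num_masks = min(max_masks, int(mult_ratio * num_frames))
    max_size = int(size_ratio * num_frames)
    masked = [list(frame) for frame in spectrogram]
    for _ in range(num_masks):
        width = rng.randint(0, max_size)              # inclusive bounds
        start = rng.randint(0, num_frames - width)    # keeps the mask inside the utterance
        for t in range(start, start + width):
            masked[t] = [0.0] * len(masked[t])
    return masked
```

Passing a seeded `random.Random` as `rng` makes the masking reproducible, which is convenient when comparing augmentation settings.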

Sequential Computer Experimental Design for Estimating an Extreme Probability or Quantile


Title	Sequential Computer Experimental Design for Estimating an Extreme Probability or Quantile
Authors	Hao Chen, William J. Welch
Abstract	A computer code can simulate a system’s propagation of variation from random inputs to output measures of quality. Our aim here is to estimate a critical output tail probability or quantile without a large Monte Carlo experiment. Instead, we build a statistical surrogate for the input-output relationship with a modest number of evaluations and then sequentially add further runs, guided by a criterion to improve the estimate. We compare two criteria in the literature. Moreover, we investigate two practical questions: how to design the initial code runs and how to model the input distribution. Hence, we close the gap between the theory of sequential design and its application.
Tasks
Published	2019-08-14
URL	https://arxiv.org/abs/1908.05357v1
PDF	https://arxiv.org/pdf/1908.05357v1.pdf
PWC	https://paperswithcode.com/paper/sequential-computer-experimental-design-for
Repo
Framework

Asymptotic Distributions and Rates of Convergence for Random Forests via Generalized U-statistics


Title	Asymptotic Distributions and Rates of Convergence for Random Forests via Generalized U-statistics
Authors	Wei Peng, Tim Coleman, Lucas Mentch
Abstract	Random forests remain among the most popular off-the-shelf supervised learning algorithms. Despite their well-documented empirical success, however, until recently, few theoretical results were available to describe their performance and behavior. In this work we push beyond recent work on consistency and asymptotic normality by establishing rates of convergence for random forests and other supervised learning ensembles. We develop the notion of generalized U-statistics and show that within this framework, random forest predictions can potentially remain asymptotically normal for larger subsample sizes than previously established. We also provide Berry-Esseen bounds in order to quantify the rate at which this convergence occurs, making explicit the roles of the subsample size and the number of trees in determining the distribution of random forest predictions.
Tasks
Published	2019-05-25
URL	https://arxiv.org/abs/1905.10651v2
PDF	https://arxiv.org/pdf/1905.10651v2.pdf
PWC	https://paperswithcode.com/paper/asymptotic-distributions-and-rates-of
Repo
Framework

Answer Extraction for Why Arabic Questions Answering Systems: EWAQ


Title	Answer Extraction for Why Arabic Questions Answering Systems: EWAQ
Authors	Fatima T. AL-Khawaldeh
Abstract	With the increasing amount of web information, question answering systems have become very important for giving users direct answers to their requests. This paper presents an Arabic question answering system based on entailment metrics. The paper focuses on why questions. Two reasons led us to develop this system: the general lack of Arabic question answering systems, and the scarcity of Arabic question answering systems that focus on why questions. The goal of the proposed system is to extract answers from re-ranked passages retrieved by search engines. The system extracts answers only to why questions. It is called EWAQ: Entailment based Why Arabic Questions Answering. Each answer is scored with entailment metrics, and the answers are ranked by score to determine the most likely correct answer. EWAQ is compared with the search engines Yahoo, Google, and ask.com, the well-established web-based question answering systems, using a manual test set. The EWAQ experiments show that accuracy increases when textual entailment is used to re-rank the relevant passages retrieved by search engines and to decide the correct answer. The results show that entailment-based similarity can help significantly with the why answer extraction module for the Arabic language.
Tasks	Natural Language Inference
Published	2019-07-04
URL	https://arxiv.org/abs/1907.04149v1
PDF	https://arxiv.org/pdf/1907.04149v1.pdf
PWC	https://paperswithcode.com/paper/answer-extraction-for-why-arabic-questions
Repo
Framework

Time2Graph: Revisiting Time Series Modeling with Dynamic Shapelets


Title	Time2Graph: Revisiting Time Series Modeling with Dynamic Shapelets
Authors	Ziqiang Cheng, Yang Yang, Wei Wang, Wenjie Hu, Yueting Zhuang, Guojie Song
Abstract	Time series modeling has attracted extensive research efforts; however, achieving both reliable efficiency and interpretability from a unified model remains a challenging problem. In the literature, shapelets offer interpretable and explanatory insights in classification tasks, while most existing works ignore the differing representative power at different time slices, as well as (more importantly) the evolution pattern of shapelets. In this paper, we propose to extract time-aware shapelets by designing a two-level timing factor. Moreover, we define and construct the shapelet evolution graph, which captures how shapelets evolve over time and can be incorporated into the time series embeddings by graph embedding algorithms. To validate whether the representations obtained in this way can be applied effectively in various scenarios, we conduct experiments based on three public time series datasets, and two real-world datasets from different domains. Experimental results clearly show the improvements achieved by our approach compared with 17 state-of-the-art baselines.
Tasks	Graph Embedding, Time Series
Published	2019-11-11
URL	https://arxiv.org/abs/1911.04143v1
PDF	https://arxiv.org/pdf/1911.04143v1.pdf
PWC	https://paperswithcode.com/paper/time2graph-revisiting-time-series-modeling
Repo
Framework

Does Speech enhancement of publicly available data help build robust Speech Recognition Systems?


Title	Does Speech enhancement of publicly available data help build robust Speech Recognition Systems?
Authors	Bhavya Ghai, Buvana Ramanan, Klaus Mueller
Abstract	Automatic speech recognition (ASR) systems play a key role in many commercial products including voice assistants. Typically, they require large amounts of clean speech data for training, which gives an undue advantage to large organizations that have tons of private data. In this paper, we have first curated a fairly big dataset using publicly available data sources. Thereafter, we tried to investigate if we can use publicly available noisy data to train robust ASR systems. We have used speech enhancement to clean the noisy data first and then used it together with its cleaned version to train ASR systems. We have found that using speech enhancement gives 9.5% better word error rate than training on just noisy data and 9% better than training on just clean data. Its performance is also comparable to the ideal case scenario when trained on noisy data and its clean version.
Tasks	Robust Speech Recognition, Speech Enhancement, Speech Recognition
Published	2019-10-29
URL	https://arxiv.org/abs/1910.13488v2
PDF	https://arxiv.org/pdf/1910.13488v2.pdf
PWC	https://paperswithcode.com/paper/does-speech-enhancement-of-publicly-available
Repo
Framework