April 2, 2020

Paper Group ANR 187

Locally-Adaptive Nonparametric Online Learning. UKARA 1.0 Challenge Track 1: Automatic Short-Answer Scoring in Bahasa Indonesia. Object-Based Image Coding: A Learning-Driven Revisit. Affinity and Diversity: Quantifying Mechanisms of Data Augmentation. Optimizing JPEG Quantization for Classification Networks. Unsupervised Temporal Feature Aggregatio …

Locally-Adaptive Nonparametric Online Learning

Title Locally-Adaptive Nonparametric Online Learning
Authors Ilja Kuzborskij, Nicolò Cesa-Bianchi
Abstract One of the main strengths of online algorithms is their ability to adapt to arbitrary data sequences. This is especially important in nonparametric settings, where regret is measured against rich classes of comparator functions that are able to fit complex environments. Although such hard comparators and complex environments may exhibit local regularities, efficient algorithms whose performance can provably take advantage of these local patterns are hardly known. We fill this gap by introducing efficient online algorithms (based on a single versatile master algorithm) that adapt to: (1) local Lipschitzness of the competitor function, (2) local metric dimension of the instance sequence, (3) local performance of the predictor across different regions of the instance space. Extending previous approaches, we design algorithms that dynamically grow hierarchical packings of the instance space, and whose prunings correspond to different “locality profiles” for the problem at hand. Using a technique based on tree experts, we simultaneously and efficiently compete against all such prunings, and prove regret bounds scaling with quantities associated with all three types of local regularities. When competing against “simple” locality profiles, our technique delivers regret bounds that are significantly better than those proven using the previous approach. On the other hand, the time dependence of our bounds is not worse than that obtained by ignoring any local regularities.
Tasks
Published 2020-02-05
URL https://arxiv.org/abs/2002.01882v1
PDF https://arxiv.org/pdf/2002.01882v1.pdf
PWC https://paperswithcode.com/paper/locally-adaptive-nonparametric-online
Repo
Framework

UKARA 1.0 Challenge Track 1: Automatic Short-Answer Scoring in Bahasa Indonesia

Title UKARA 1.0 Challenge Track 1: Automatic Short-Answer Scoring in Bahasa Indonesia
Authors Ali Akbar Septiandri, Yosef Ardhito Winatmoko
Abstract We describe our third-place solution to the UKARA 1.0 challenge on automated essay scoring. The task consists of a binary classification problem on two datasets of answers to two different questions. We ended up using two different models for the two datasets. For task A, we applied a random forest algorithm to features extracted using unigrams with latent semantic analysis (LSA). For task B, on the other hand, we used only logistic regression on TF-IDF features. Our model achieves an F1 score of 0.812.
Tasks
Published 2020-02-28
URL https://arxiv.org/abs/2002.12540v1
PDF https://arxiv.org/pdf/2002.12540v1.pdf
PWC https://paperswithcode.com/paper/ukara-10-challenge-track-1-automatic-short
Repo
Framework
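
The TF-IDF features mentioned for task B can be illustrated with a minimal sketch. It uses only the Python standard library and assumes whitespace tokenization, raw term counts, and a smoothed inverse document frequency; none of these choices is stated in the abstract, and the toy answers are hypothetical rather than taken from the UKARA data. The resulting vectors are what a classifier such as logistic regression would be trained on.

```python
import math
from collections import Counter

def tfidf(docs):
    """Return one {term: weight} dict per document (raw tf times smoothed idf)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + df[term])) + 1.0)
            for term, count in tf.items()
        })
    return vectors

# Hypothetical toy answers, not taken from the challenge datasets.
answers = ["air bersih penting", "air kotor berbahaya", "udara bersih penting"]
vectors = tfidf(answers)
```

Terms that occur in fewer answers (here "kotor") receive a larger weight than terms shared across answers (here "air").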

Object-Based Image Coding: A Learning-Driven Revisit

Title Object-Based Image Coding: A Learning-Driven Revisit
Authors Qi Xia, Haojie Liu, Zhan Ma
Abstract Object-Based Image Coding (OBIC), which was extensively studied about two decades ago, promised broad applications in both ultra-low bitrate communication and high-level semantic content understanding, but it has rarely been used because objects with arbitrary shapes could not be represented compactly in an efficient way. The fundamental issue is how to efficiently process arbitrary-shaped objects at a fine granularity (e.g., feature element or pixel wise). To attack this, we propose to apply element-wise masking and compression by devising an object segmentation network for image layer decomposition, and parallel convolution-based neural image compression networks to process masked foreground objects and the background scene separately. All components are optimized in an end-to-end learning framework to intelligently weigh their (e.g., object and background) contributions for visually pleasant reconstruction. We have conducted comprehensive experiments to evaluate the performance on the PASCAL VOC dataset in a very low bitrate scenario (e.g., $\lesssim$0.1 bits per pixel - bpp), which demonstrate noticeable subjective quality improvement compared with JPEG2K, HEVC-based BPG and another learned image compression method. All relevant materials are made publicly accessible at https://njuvision.github.io/Neural-Object-Coding/.
Tasks Image Compression, Semantic Segmentation
Published 2020-03-18
URL https://arxiv.org/abs/2003.08033v1
PDF https://arxiv.org/pdf/2003.08033v1.pdf
PWC https://paperswithcode.com/paper/object-based-image-coding-a-learning-driven
Repo
Framework

Affinity and Diversity: Quantifying Mechanisms of Data Augmentation

Title Affinity and Diversity: Quantifying Mechanisms of Data Augmentation
Authors Raphael Gontijo-Lopes, Sylvia J. Smullin, Ekin D. Cubuk, Ethan Dyer
Abstract Though data augmentation has become a standard component of deep neural network training, the underlying mechanism behind the effectiveness of these techniques remains poorly understood. In practice, augmentation policies are often chosen using heuristics of either distribution shift or augmentation diversity. Inspired by these, we seek to quantify how data augmentation improves model generalization. To this end, we introduce interpretable and easy-to-compute measures: Affinity and Diversity. We find that augmentation performance is predicted not by either of these alone but by jointly optimizing the two.
Tasks Data Augmentation
Published 2020-02-20
URL https://arxiv.org/abs/2002.08973v1
PDF https://arxiv.org/pdf/2002.08973v1.pdf
PWC https://paperswithcode.com/paper/affinity-and-diversity-quantifying-mechanisms
Repo
Framework

Optimizing JPEG Quantization for Classification Networks

Title Optimizing JPEG Quantization for Classification Networks
Authors Zhijing Li, Christopher De Sa, Adrian Sampson
Abstract Deep learning for computer vision depends on lossy image compression: it reduces the storage required for training and test data and lowers transfer costs in deployment. Mainstream datasets and imaging pipelines all rely on standard JPEG compression. In JPEG, the degree of quantization of frequency coefficients controls the lossiness: an 8 by 8 quantization table (Q-table) decides both the quality of the encoded image and the compression ratio. While a long history of work has sought better Q-tables, existing work either seeks to minimize image distortion or to optimize for models of the human visual system. This work asks whether JPEG Q-tables exist that are “better” for specific vision networks and can offer better quality–size trade-offs than ones designed for human perception or minimal distortion. We reconstruct an ImageNet test set with higher resolution to explore the effect of JPEG compression under novel Q-tables. We attempt several approaches to tune a Q-table for a vision task. We find that a simple sorted random sampling method can exceed the performance of the standard JPEG Q-table. We also use hyper-parameter tuning techniques including bounded random search, Bayesian optimization, and composite heuristic optimization methods. The new Q-tables we obtained can improve the compression rate by 10% to 200% when the accuracy is fixed, or improve accuracy up to 2% at the same compression rate.
Tasks Image Compression, Quantization
Published 2020-03-05
URL https://arxiv.org/abs/2003.02874v1
PDF https://arxiv.org/pdf/2003.02874v1.pdf
PWC https://paperswithcode.com/paper/optimizing-jpeg-quantization-for
Repo
Framework
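
To make the role of the Q-table concrete, the sketch below quantizes an 8 by 8 block of DCT coefficients with a candidate table and builds that table by one plausible reading of "sorted random sampling": draw 64 random step sizes, sort them, and place them along the JPEG zigzag order so that low frequencies get the smallest steps. The abstract does not spell out the sampling procedure, so the function names, the 1 to 255 range, and the zigzag placement are assumptions made for illustration.

```python
import random

def zigzag_order(n=8):
    """Return (row, col) positions of an n x n block in JPEG zigzag order."""
    return sorted(
        ((r, c) for r in range(n) for c in range(n)),
        # Walk anti-diagonals; odd diagonals run top-to-bottom, even ones bottom-to-top.
        key=lambda rc: (rc[0] + rc[1], rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )

def sorted_random_qtable(rng, low=1, high=255, n=8):
    """Sample n*n step sizes, sort them, and lay them out along the zigzag."""
    steps = sorted(rng.randint(low, high) for _ in range(n * n))
    table = [[0] * n for _ in range(n)]
    for (r, c), step in zip(zigzag_order(n), steps):
        table[r][c] = step
    return table

def quantize(block, table):
    """Divide each DCT coefficient by its step size and round to an integer level."""
    n = len(table)
    return [[round(block[r][c] / table[r][c]) for c in range(n)] for r in range(n)]

def dequantize(levels, table):
    """Multiply each level by its step size to get the reconstructed coefficient."""
    n = len(table)
    return [[levels[r][c] * table[r][c] for c in range(n)] for r in range(n)]

rng = random.Random(0)
table = sorted_random_qtable(rng)
# Hypothetical DCT coefficients, used only to exercise the functions.
block = [[rng.randint(-500, 500) for _ in range(8)] for _ in range(8)]
reconstructed = dequantize(quantize(block, table), table)
```

Larger step sizes zero out more coefficients, which shrinks the file but increases the reconstruction error; a search over tables would score each candidate by file size and by the accuracy of the target network.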

Unsupervised Temporal Feature Aggregation for Event Detection in Unstructured Sports Videos

Title Unsupervised Temporal Feature Aggregation for Event Detection in Unstructured Sports Videos
Authors Subhajit Chaudhury, Daiki Kimura, Phongtharin Vinayavekhin, Asim Munawar, Ryuki Tachibana, Koji Ito, Yuki Inaba, Minoru Matsumoto, Shuji Kidokoro, Hiroki Ozaki
Abstract Image-based sports analytics enable automatic retrieval of key events in a game to speed up the analytics process for human experts. However, most existing methods focus on structured television broadcast video datasets with a straight and fixed camera having minimum variability in the capturing pose. In this paper, we study the case of event detection in sports videos for unstructured environments with arbitrary camera angles. The transition from structured to unstructured video analysis produces multiple challenges that we address in our paper. Specifically, we identify and solve two major problems: unsupervised identification of players in an unstructured setting and generalization of the trained models to pose variations due to arbitrary shooting angles. For the first problem, we propose a temporal feature aggregation algorithm using person re-identification features to obtain high player retrieval precision by boosting a weak heuristic scoring method. Additionally, we propose a data augmentation technique, based on multi-modal image translation model, to reduce bias in the appearance of training samples. Experimental evaluations show that our proposed method improves precision for player retrieval from 0.78 to 0.86 for obliquely angled videos. Additionally, we obtain an improvement in F1 score for rally detection in table tennis videos from 0.79 in case of global frame-level features to 0.89 using our proposed player-level features. Please see the supplementary video submission at https://ibm.biz/BdzeZA.
Tasks Data Augmentation, Person Re-Identification
Published 2020-02-19
URL https://arxiv.org/abs/2002.08097v1
PDF https://arxiv.org/pdf/2002.08097v1.pdf
PWC https://paperswithcode.com/paper/unsupervised-temporal-feature-aggregation-for
Repo
Framework

Generalized Octave Convolutions for Learned Multi-Frequency Image Compression

Title Generalized Octave Convolutions for Learned Multi-Frequency Image Compression
Authors Mohammad Akbari, Jie Liang, Jingning Han, Chengjie Tu
Abstract Learned image compression has recently shown the potential to outperform all standard codecs. The state-of-the-art rate-distortion performance has been achieved by context-adaptive entropy approaches in which hyperprior and autoregressive models are jointly utilized to effectively capture the spatial dependencies in the latent representations. However, the latents contain a mixture of high and low frequency information, which has inefficiently been represented by feature maps of the same spatial resolution in previous works. In this paper, we propose the first learned multi-frequency image compression approach that uses the recently developed octave convolutions to factorize the latents into high and low frequencies. Since the low frequency is represented by a lower resolution, its spatial redundancy is reduced, which improves the compression rate. Moreover, octave convolutions impose effective high and low frequency communication, which can improve the reconstruction quality. We also develop novel generalized octave convolution and octave transposed-convolution architectures with internal activation layers to preserve the spatial structure of the information. Our experiments show that the proposed scheme outperforms all standard codecs and learning-based methods in both PSNR and MS-SSIM metrics, and establishes the new state of the art for learned image compression.
Tasks Image Compression
Published 2020-02-24
URL https://arxiv.org/abs/2002.10032v1
PDF https://arxiv.org/pdf/2002.10032v1.pdf
PWC https://paperswithcode.com/paper/generalized-octave-convolutions-for-learned
Repo
Framework

Investigating an approach for low resource language dataset creation, curation and classification: Setswana and Sepedi

Title Investigating an approach for low resource language dataset creation, curation and classification: Setswana and Sepedi
Authors Vukosi Marivate, Tshephisho Sefara, Vongani Chabalala, Keamogetswe Makhaya, Tumisho Mokgonyane, Rethabile Mokoena, Abiodun Modupe
Abstract The recent advances in Natural Language Processing have been a boon for well-represented languages in terms of available curated data and research resources. One of the challenges for low-resourced languages is the lack of clear guidelines on the collection, curation and preparation of datasets for different use-cases. In this work, we take on the task of creating two datasets focused on news headlines (i.e., short text) for Setswana and Sepedi, and of creating a news topic classification task. We document our work and also present baselines for classification. We investigate an approach to data augmentation, better suited to low-resource languages, to improve the performance of the classifiers.
Tasks Data Augmentation
Published 2020-02-18
URL https://arxiv.org/abs/2003.04986v1
PDF https://arxiv.org/pdf/2003.04986v1.pdf
PWC https://paperswithcode.com/paper/investigating-an-approach-for-low-resource
Repo
Framework

A Comparative Study on Parameter Estimation in Software Reliability Modeling using Swarm Intelligence

Title A Comparative Study on Parameter Estimation in Software Reliability Modeling using Swarm Intelligence
Authors Najla Akram AL-Saati, Marrwa Abd-AlKareem Alabajee
Abstract This work compares the performance of two well-known swarm algorithms, Cuckoo Search (CS) and Firefly Algorithm (FA), in estimating the parameters of Software Reliability Growth Models (SRGMs). The study is further reinforced using Particle Swarm Optimization (PSO) and Ant Colony Optimization (ACO). All algorithms are evaluated on real software failure data, and the obtained results are compared to show the performance of each algorithm. Furthermore, CS and FA are compared with each other on the basis of execution time and iteration number. Experimental results show that CS is more efficient in estimating the parameters of SRGMs, and that it outperformed FA as well as PSO and ACO for the selected datasets and employed models.
Tasks
Published 2020-03-08
URL https://arxiv.org/abs/2003.04770v1
PDF https://arxiv.org/pdf/2003.04770v1.pdf
PWC https://paperswithcode.com/paper/a-comparative-study-on-parameter-estimation
Repo
Framework
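
Parameter estimation for a reliability growth model amounts to minimizing a fitting error over the model parameters, which is the objective a swarm algorithm would search. The sketch below assumes the Goel-Okumoto model and a sum-of-squared-errors objective, since the abstract names neither the models nor the fitness function. A plain random search stands in for the optimizer; it is not Cuckoo Search or the Firefly Algorithm, and the data are synthetic rather than real failure data.

```python
import math
import random

def goel_okumoto(t, a, b):
    """Expected cumulative failures by time t: a * (1 - exp(-b * t))."""
    return a * (1.0 - math.exp(-b * t))

def sse(params, times, failures):
    """Sum of squared errors between the model curve and observed failure counts."""
    a, b = params
    return sum((goel_okumoto(t, a, b) - y) ** 2 for t, y in zip(times, failures))

def random_search(times, failures, bounds, iterations=2000, seed=0):
    """Stand-in optimizer: keep the best of many uniformly sampled candidates."""
    rng = random.Random(seed)
    best, best_cost = None, float("inf")
    for _ in range(iterations):
        candidate = tuple(rng.uniform(lo, hi) for lo, hi in bounds)
        cost = sse(candidate, times, failures)
        if cost < best_cost:
            best, best_cost = candidate, cost
    return best, best_cost

# Synthetic observations generated from known parameters a=100, b=0.1.
times = list(range(1, 21))
failures = [goel_okumoto(t, 100.0, 0.1) for t in times]
bounds = [(50.0, 150.0), (0.01, 0.5)]
best, best_cost = random_search(times, failures, bounds)
```

A swarm method would replace `random_search` while keeping the same `sse` objective, so execution time and iteration counts can be compared on equal terms.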

Systematic Review of Approaches to Improve Peer Assessment at Scale

Title Systematic Review of Approaches to Improve Peer Assessment at Scale
Authors Manikandan Ravikiran
Abstract Peer Assessment, the task of analysing and commenting on a student’s writing by peers, is at the core of all educational components, both on campus and in MOOCs. However, given the sheer scale of MOOCs and their inherently personalised, open-ended learning, automatic grading and tools that assist grading at scale are highly important. Previously, we presented a survey on the tasks of post classification and knowledge tracing, and ended with a brief review of Peer Assessment (PA) and some initial problems. In this review, we continue the review of PA from the perspective of improving the review process itself. As such, the rest of this review focuses on three facets of PA: auto-grading and Peer Assessment tools (we look only at how peer reviews and auto-grading are carried out), strategies to handle rogue reviews, and peer review improvement using Natural Language Processing. The consolidated set of papers and resources used is released at https://github.com/manikandan-ravikiran/cs6460-Survey-2.
Tasks Knowledge Tracing
Published 2020-01-27
URL https://arxiv.org/abs/2001.10617v1
PDF https://arxiv.org/pdf/2001.10617v1.pdf
PWC https://paperswithcode.com/paper/systematic-review-of-approaches-to-improve
Repo
Framework

Nonconvex sparse regularization for deep neural networks and its optimality

Title Nonconvex sparse regularization for deep neural networks and its optimality
Authors Ilsang Ohn, Yongdai Kim
Abstract Recent theoretical studies proved that deep neural network (DNN) estimators obtained by minimizing empirical risk with a certain sparsity constraint can attain optimal convergence rates for regression and classification problems. However, the sparsity constraint requires knowledge of certain properties of the true model, which are not available in practice. Moreover, computation is difficult due to the discrete nature of the sparsity constraint. In this paper, we propose a novel penalized estimation method for sparse DNNs, which resolves the aforementioned problems of the sparsity constraint. We establish an oracle inequality for the excess risk of the proposed sparse-penalized DNN estimator and derive convergence rates for several learning tasks. In particular, we prove that the sparse-penalized estimator can adaptively attain minimax convergence rates for various nonparametric regression problems. For computation, we develop an efficient gradient-based optimization algorithm that guarantees the monotonic reduction of the objective function.
Tasks
Published 2020-03-26
URL https://arxiv.org/abs/2003.11769v1
PDF https://arxiv.org/pdf/2003.11769v1.pdf
PWC https://paperswithcode.com/paper/nonconvex-sparse-regularization-for-deep
Repo
Framework

Automatic Melody Harmonization with Triad Chords: A Comparative Study

Title Automatic Melody Harmonization with Triad Chords: A Comparative Study
Authors Yin-Cheng Yeh, Wen-Yi Hsiao, Satoru Fukayama, Tetsuro Kitahara, Benjamin Genchel, Hao-Min Liu, Hao-Wen Dong, Yian Chen, Terence Leong, Yi-Hsuan Yang
Abstract Several prior works have proposed various methods for the task of automatic melody harmonization, in which a model aims to generate a sequence of chords to serve as the harmonic accompaniment of a given multiple-bar melody sequence. In this paper, we present a comparative study evaluating and comparing the performance of a set of canonical approaches to this task, including a template matching based model, a hidden Markov based model, a genetic algorithm based model, and two deep learning based models. The evaluation is conducted on a dataset of 9,226 melody/chord pairs we newly collect for this study, considering up to 48 triad chords, using a standardized training/test split. We report the result of an objective evaluation using six different metrics and a subjective study with 202 participants.
Tasks
Published 2020-01-08
URL https://arxiv.org/abs/2001.02360v1
PDF https://arxiv.org/pdf/2001.02360v1.pdf
PWC https://paperswithcode.com/paper/automatic-melody-harmonization-with-triad
Repo
Framework
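
A template-matching harmonizer, the simplest of the approaches compared, can be sketched in a few lines: each triad is a set of pitch classes, and each bar is assigned the triad whose pitch classes cover the most melody notes. The abstract only says "up to 48 triad chords"; the sketch assumes these are 12 roots times four qualities (major, minor, diminished, augmented). The scoring rule and chord labels are also illustrative choices, not the paper's model.

```python
from collections import Counter

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
# Intervals in semitones above the root for four assumed triad qualities.
QUALITIES = {"maj": (0, 4, 7), "min": (0, 3, 7), "dim": (0, 3, 6), "aug": (0, 4, 8)}

def triad_templates():
    """Map each chord label to its set of pitch classes (12 roots x 4 qualities)."""
    return {
        f"{NOTE_NAMES[root]}:{quality}": frozenset((root + i) % 12 for i in intervals)
        for root in range(12)
        for quality, intervals in QUALITIES.items()
    }

def harmonize_bar(midi_pitches, templates):
    """Pick the chord whose pitch classes cover the most melody notes in the bar."""
    histogram = Counter(pitch % 12 for pitch in midi_pitches)
    # Ties are broken by template order, since max returns the first maximum.
    return max(templates, key=lambda name: sum(histogram[pc] for pc in templates[name]))

templates = triad_templates()
# Hypothetical one-bar melodies given as MIDI note numbers.
chord_a = harmonize_bar([60, 64, 67, 72], templates)  # C, E, G, C
chord_b = harmonize_bar([57, 60, 64], templates)      # A, C, E
```

Because it treats each bar independently, this baseline ignores chord progressions, which is what sequence models such as hidden Markov models add.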

I Feel I Feel You: A Theory of Mind Experiment in Games

Title I Feel I Feel You: A Theory of Mind Experiment in Games
Authors David Melhart, Georgios N. Yannakakis, Antonios Liapis
Abstract In this study into the player’s emotional theory of mind of gameplaying agents, we investigate how an agent’s behaviour and the player’s own performance and emotions shape the recognition of a frustrated behaviour. We focus on the perception of frustration as it is a prevalent affective experience in human-computer interaction. We present a testbed game tailored towards this end, in which a player competes against an agent with a frustration model based on theory. We collect gameplay data, an annotated ground truth about the player’s appraisal of the agent’s frustration, and apply face recognition to estimate the player’s emotional state. We examine the collected data through correlation analysis and predictive machine learning models, and find that the player’s observable emotions are not correlated highly with the perceived frustration of the agent. This suggests that our subject’s theory of mind is a cognitive process based on the gameplay context. Our predictive models—using ranking support vector machines—corroborate these results, yielding moderately accurate predictors of players’ theory of mind.
Tasks Face Recognition
Published 2020-01-23
URL https://arxiv.org/abs/2001.08656v1
PDF https://arxiv.org/pdf/2001.08656v1.pdf
PWC https://paperswithcode.com/paper/i-feel-i-feel-you-a-theory-of-mind-experiment
Repo
Framework

WaveTTS: Tacotron-based TTS with Joint Time-Frequency Domain Loss

Title WaveTTS: Tacotron-based TTS with Joint Time-Frequency Domain Loss
Authors Rui Liu, Berrak Sisman, Feilong Bao, Guanglai Gao, Haizhou Li
Abstract Tacotron-based text-to-speech (TTS) systems directly synthesize speech from text input. Such frameworks typically consist of a feature prediction network that maps character sequences to frequency-domain acoustic features, followed by a waveform reconstruction algorithm or a neural vocoder that generates the time-domain waveform from acoustic features. As the loss function is usually calculated only for frequency-domain acoustic features, it does not directly control the quality of the generated time-domain waveform. To address this problem, we propose a new training scheme for Tacotron-based TTS, referred to as WaveTTS, that has two loss functions: 1) a time-domain loss, denoted as the waveform loss, that measures the distortion between the natural and generated waveforms; and 2) a frequency-domain loss that measures the Mel-scale acoustic feature loss between the natural and generated acoustic features. WaveTTS ensures both the quality of the acoustic features and the resulting speech waveform. To the best of our knowledge, this is the first implementation of Tacotron with joint time-frequency domain loss. Experimental results show that the proposed framework outperforms the baselines and achieves high-quality synthesized speech.
Tasks
Published 2020-02-02
URL https://arxiv.org/abs/2002.00417v1
PDF https://arxiv.org/pdf/2002.00417v1.pdf
PWC https://paperswithcode.com/paper/wavetts-tacotron-based-tts-with-joint-time
Repo
Framework
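
The joint objective described above can be sketched as the sum of a frequency-domain term and a time-domain term. The abstract does not state which distance each term uses or how the two are weighted, so mean squared error and the `wave_weight` parameter below are assumptions chosen for illustration, and the inputs are toy values rather than real audio.

```python
def mse(xs, ys):
    """Mean squared error between two equal-length sequences of floats."""
    if len(xs) != len(ys):
        raise ValueError("sequences must have the same length")
    return sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def flatten(frames):
    """Flatten a list of feature frames into one list of values."""
    return [value for frame in frames for value in frame]

def joint_loss(wave_true, wave_pred, mel_true, mel_pred, wave_weight=1.0):
    """Frequency-domain loss on Mel features plus a weighted time-domain loss."""
    freq_loss = mse(flatten(mel_true), flatten(mel_pred))
    time_loss = mse(wave_true, wave_pred)
    return freq_loss + wave_weight * time_loss

# Toy values: a two-sample waveform and a single two-bin Mel frame.
loss = joint_loss([0.0, 1.0], [0.0, 0.0], [[1.0, 2.0]], [[1.0, 4.0]])
```

In a training setup, the time-domain term requires a differentiable path from predicted features to a waveform, so that its gradient can reach the feature prediction network.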

CEB Improves Model Robustness

Title CEB Improves Model Robustness
Authors Ian Fischer, Alexander A. Alemi
Abstract We demonstrate that the Conditional Entropy Bottleneck (CEB) can improve model robustness. CEB is an easy strategy to implement and works in tandem with data augmentation procedures. We report results of a large scale adversarial robustness study on CIFAR-10, as well as the ImageNet-C Common Corruptions Benchmark, ImageNet-A, and PGD attacks.
Tasks Data Augmentation
Published 2020-02-13
URL https://arxiv.org/abs/2002.05380v1
PDF https://arxiv.org/pdf/2002.05380v1.pdf
PWC https://paperswithcode.com/paper/ceb-improves-model-robustness-1
Repo
Framework