January 29, 2020

Paper Group ANR 707

Relational Action Forecasting

Title Relational Action Forecasting
Authors Chen Sun, Abhinav Shrivastava, Carl Vondrick, Rahul Sukthankar, Kevin Murphy, Cordelia Schmid
Abstract This paper focuses on multi-person action forecasting in videos. More precisely, given a history of H previous frames, the goal is to detect actors and to predict their future actions for the next T frames. Our approach jointly models temporal and spatial interactions among different actors by constructing a recurrent graph, using actor proposals obtained with Faster R-CNN as nodes. Our method learns to select a subset of discriminative relations without requiring explicit supervision, thus enabling us to tackle challenging visual data. We refer to our model as Discriminative Relational Recurrent Network (DRRN). Evaluation of action prediction on AVA demonstrates the effectiveness of our proposed method compared to simpler baselines. Furthermore, we significantly improve performance on the task of early action classification on J-HMDB, from the previous SOTA of 48% to 60%.
Tasks Action Classification, Action Recognition In Videos
Published 2019-04-08
URL http://arxiv.org/abs/1904.04231v1
PDF http://arxiv.org/pdf/1904.04231v1.pdf
PWC https://paperswithcode.com/paper/relational-action-forecasting
Repo
Framework

Cascaded Gaussian Processes for Data-efficient Robot Dynamics Learning

Title Cascaded Gaussian Processes for Data-efficient Robot Dynamics Learning
Authors Sahand Rezaei-Shoshtari, David Meger, Inna Sharf
Abstract Motivated by the recursive Newton-Euler formulation, we propose a novel cascaded Gaussian process learning framework for the inverse dynamics of robot manipulators. This approach leads to a significant dimensionality reduction which in turn results in better learning and data efficiency. We explore two formulations for the cascading: the inward and outward, both along the manipulator chain topology. The learned modeling is tested in conjunction with the classical inverse dynamics model (semi-parametric) and on its own (non-parametric) in the context of feed-forward control of the arm. Experimental results are obtained with Jaco 2 six-DOF and SARCOS seven-DOF manipulators for randomly defined sinusoidal motions of the joints in order to evaluate the performance of cascading against the standard GP learning. In addition, experiments are conducted using Jaco 2 on a task emulating a pouring maneuver. Results indicate a consistent improvement in learning speed with the inward cascaded GP model and an overall improvement in data efficiency and generalization.
Tasks Dimensionality Reduction, Gaussian Processes
Published 2019-10-05
URL https://arxiv.org/abs/1910.02291v1
PDF https://arxiv.org/pdf/1910.02291v1.pdf
PWC https://paperswithcode.com/paper/cascaded-gaussian-processes-for-data
Repo
Framework

Coherent and Controllable Outfit Generation

Title Coherent and Controllable Outfit Generation
Authors Kedan Li, Chen Liu, David Forsyth
Abstract When thinking about dressing oneself, people often have a theme in mind whether they’re going to a tropical getaway or wish to appear attractive at a cocktail party. A useful outfit generation system should come up with clothing items that are compatible while matching a theme specified by the user. Existing methods use item-wise compatibility between products but lack an effective way to enforce a global constraint (e.g., style, occasion). We introduce a method that generates outfits whose items match a theme described by a text query. Our method uses text and image embeddings to represent fashion items. We learn a multimodal embedding where the image representation for an item is close to its text representation, and use this embedding to measure item-query coherence. We then use a discriminator to compute compatibility between fashion items. This strategy yields a compatibility prediction method that meets or exceeds the state of the art. Our method combines item-item compatibility and item-query coherence to construct an outfit whose items are (a) close to the query and (b) compatible with one another. Quantitative evaluation shows that the items in our outfits are tightly clustered compared to standard outfits. Furthermore, outfits produced by similar queries are close to one another, and outfits produced by very different queries are far apart. Qualitative evaluation shows that our method responds well to queries. A user study suggests that people understand the match between the queries and the outfits produced by our method.
Tasks
Published 2019-06-17
URL https://arxiv.org/abs/1906.07273v2
PDF https://arxiv.org/pdf/1906.07273v2.pdf
PWC https://paperswithcode.com/paper/using-discriminative-methods-to-learn-fashion
Repo
Framework

Near-Convex Archetypal Analysis

Title Near-Convex Archetypal Analysis
Authors Pierre De Handschutter, Nicolas Gillis, Arnaud Vandaele, Xavier Siebert
Abstract Nonnegative matrix factorization (NMF) is a widely used linear dimensionality reduction technique for nonnegative data. NMF requires that each data point is approximated by a convex combination of basis elements. Archetypal analysis (AA), also referred to as convex NMF, is a well-known NMF variant imposing that the basis elements are themselves convex combinations of the data points. AA has the advantage of being more interpretable than NMF because the basis elements are directly constructed from the data points. However, it usually suffers from a high data fitting error because the basis elements are constrained to be contained in the convex cone of the data points. In this letter, we introduce near-convex archetypal analysis (NCAA), which combines the advantages of both AA and NMF. As for AA, the basis vectors are required to be linear combinations of the data points and hence are easily interpretable. As for NMF, the additional flexibility in choosing the basis elements allows NCAA to have a low data fitting error. We show that NCAA compares favorably with a state-of-the-art minimum-volume NMF method on synthetic datasets and on a real-world hyperspectral image.
Tasks Dimensionality Reduction
Published 2019-10-02
URL https://arxiv.org/abs/1910.00821v1
PDF https://arxiv.org/pdf/1910.00821v1.pdf
PWC https://paperswithcode.com/paper/near-convex-archetypal-analysis
Repo
Framework
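
The constraints described in the abstract can be written down in a few lines. The sketch below is only an illustration of the AA structure (basis elements as convex combinations of data points, data points as convex combinations of basis elements); it uses random weights rather than an optimized factorization, and it does not implement NCAA. The variable names, the toy data, and the rank `r` are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy nonnegative data: 5 features (rows) x 40 data points (columns).
X = rng.random((5, 40))
r = 3  # number of basis elements (hypothetical choice)


def random_simplex_columns(rows, cols, rng):
    """Return a nonnegative matrix whose columns each sum to 1."""
    M = rng.random((rows, cols))
    return M / M.sum(axis=0, keepdims=True)


# AA constraint: each basis element is a convex combination of data points.
B = random_simplex_columns(X.shape[1], r, rng)  # 40 x 3 weights
W = X @ B                                       # 5 x 3 basis elements

# Each data point is approximated by a convex combination of basis elements.
H = random_simplex_columns(r, X.shape[1], rng)  # 3 x 40 weights
X_hat = W @ H

# With random (unfitted) weights the error is large; a solver would minimize it.
relative_error = np.linalg.norm(X - X_hat) / np.linalg.norm(X)
```

Because every column of `W` is a convex combination of columns of `X`, each basis element stays inside the range spanned by the data, which is the source of both the interpretability and the higher fitting error mentioned in the abstract.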

Universal Pooling – A New Pooling Method for Convolutional Neural Networks

Title Universal Pooling – A New Pooling Method for Convolutional Neural Networks
Authors Junhyuk Hyun, Hongje Seong, Euntai Kim
Abstract Pooling is one of the main elements in convolutional neural networks. The pooling reduces the size of the feature map, enabling training and testing with a limited amount of computation. This paper proposes a new pooling method named universal pooling. Unlike the existing pooling methods such as average pooling, max pooling, and stride pooling with fixed pooling function, universal pooling generates any pooling function, depending on a given problem and dataset. Universal pooling was inspired by attention methods and can be considered as a channel-wise form of local spatial attention. Universal pooling is trained jointly with the main network and it is shown that it includes the existing pooling methods. Finally, when applied to two benchmark problems, the proposed method outperformed the existing pooling methods and performed with the expected diversity, adapting to the given problem.
Tasks
Published 2019-07-26
URL https://arxiv.org/abs/1907.11440v1
PDF https://arxiv.org/pdf/1907.11440v1.pdf
PWC https://paperswithcode.com/paper/universal-pooling-a-new-pooling-method-for
Repo
Framework
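
To make the "local spatial attention" reading concrete, here is a minimal sketch, assuming pooling is expressed as a softmax-weighted sum over each non-overlapping window. It is not the authors' architecture: in the paper the weights come from a trained module, whereas here the `logits` are passed in by hand. The function names `to_windows` and `attention_pool` are hypothetical.

```python
import numpy as np


def to_windows(x, k):
    """Split a (C, H, W) array into non-overlapping k x k windows.

    Returns an array of shape (C, H // k, W // k, k * k).
    """
    c, h, w = x.shape
    return (x.reshape(c, h // k, k, w // k, k)
             .transpose(0, 1, 3, 2, 4)
             .reshape(c, h // k, w // k, k * k))


def attention_pool(x, logits, k=2):
    """Pool each window as a softmax-weighted sum of its entries."""
    xw = to_windows(x, k)
    lw = to_windows(logits, k)
    lw = lw - lw.max(axis=-1, keepdims=True)  # numerically stable softmax
    weights = np.exp(lw)
    weights /= weights.sum(axis=-1, keepdims=True)
    return (weights * xw).sum(axis=-1)


x = np.arange(2 * 4 * 4, dtype=float).reshape(2, 4, 4)

# Uniform weights reproduce average pooling.
avg_like = attention_pool(x, np.zeros_like(x))

# Weights concentrated on the largest entry approximate max pooling.
max_like = attention_pool(x, 50.0 * x)
```

This shows, under the stated assumption, how a single weighted-sum form can recover fixed pooling functions as special cases of the weights.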

MASS-UMAP: Fast and accurate analog ensemble search in weather radar archive

Title MASS-UMAP: Fast and accurate analog ensemble search in weather radar archive
Authors Gabriele Franch, Giuseppe Jurman, Luca Coviello, Marta Pendesini, Cesare Furlanello
Abstract The use of analogs - similar weather patterns - for weather forecasting and analysis is an established method in meteorology. The most challenging aspect of using this approach in the context of operational radar applications is to be able to perform a fast and accurate search for similar spatiotemporal precipitation patterns in a large archive of historical records. In this context, sequential pairwise search is too slow and computationally expensive. Here we propose an architecture to significantly speed up spatiotemporal analog retrieval by combining nonlinear geometric dimensionality reduction (UMAP) with the fastest known Euclidean search algorithm for time series (MASS) to find radar analogs in constant time, independently of the desired temporal length to match and the number of extracted analogs. We compare UMAP with principal component analysis (PCA) and show that UMAP outperforms PCA for spatial MSE analog search with proper settings. Moreover, we show that MASS is 20 times faster than brute force search on the UMAP embedding space. We test the architecture on a real dataset and show that it enables precise and fast operational analog ensemble search through more than 2 years of radar archive in less than 5 seconds on a single workstation.
Tasks Dimensionality Reduction, Time Series, Weather Forecasting
Published 2019-10-01
URL https://arxiv.org/abs/1910.01211v1
PDF https://arxiv.org/pdf/1910.01211v1.pdf
PWC https://paperswithcode.com/paper/mass-umap-fast-and-accurate-analog-ensemble
Repo
Framework
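
The speed-up over sequential pairwise search rests on computing all sliding dot products at once with an FFT. The sketch below illustrates that idea on a one-dimensional series with plain (not z-normalized) Euclidean distance; it is a simplified stand-in, not the MASS reference implementation and not the paper's pipeline, which searches multi-dimensional UMAP embeddings. The function names and toy data are assumptions.

```python
import numpy as np


def distance_profile(query, series):
    """Euclidean distance from `query` to every window of `series`, via FFT."""
    query = np.asarray(query, dtype=float)
    series = np.asarray(series, dtype=float)
    m, n = len(query), len(series)
    size = n + m - 1  # long enough to avoid circular wrap-around

    # Linear convolution with the reversed query yields sliding dot products.
    conv = np.fft.irfft(
        np.fft.rfft(series, size) * np.fft.rfft(query[::-1], size), size
    )
    dots = conv[m - 1:n]

    # Sum of squares of every length-m window, from a cumulative sum.
    csum = np.concatenate(([0.0], np.cumsum(series ** 2)))
    window_sq = csum[m:] - csum[:-m]

    d2 = window_sq - 2.0 * dots + np.sum(query ** 2)
    return np.sqrt(np.maximum(d2, 0.0))  # clip tiny negative rounding errors


def brute_force_profile(query, series):
    """Reference implementation: compare the query with each window in turn."""
    m = len(query)
    return np.array([
        np.linalg.norm(series[i:i + m] - query)
        for i in range(len(series) - m + 1)
    ])


rng = np.random.default_rng(0)
series = rng.standard_normal(200)
query = series[50:60].copy()  # plant an exact match at offset 50

profile = distance_profile(query, series)
best_offset = int(np.argmin(profile))
```

The FFT version costs O(n log n) regardless of the query length, while the brute-force loop costs O(n m).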

A comparative study of general fuzzy min-max neural networks for pattern classification problems

Title A comparative study of general fuzzy min-max neural networks for pattern classification problems
Authors Thanh Tung Khuat, Bogdan Gabrys
Abstract The general fuzzy min-max (GFMM) neural network is a generalization of fuzzy neural networks formed by hyperbox fuzzy sets for classification and clustering problems. Two principal algorithms are deployed to train this type of neural network, i.e., incremental learning and agglomerative learning. This paper presents a comprehensive empirical study of performance influencing factors, advantages, and drawbacks of the general fuzzy min-max neural network on pattern classification problems. The subjects of this study include (1) the impact of maximum hyperbox size, (2) the influence of the similarity threshold and measures on the agglomerative learning algorithm, (3) the effect of data presentation order, (4) comparative performance evaluation of the GFMM with other types of fuzzy min-max neural networks and prevalent machine learning algorithms. The experimental results on benchmark datasets widely used in machine learning showed overall strong and weak points of the GFMM classifier. These outcomes also informed potential research directions for this class of machine learning algorithms in the future.
Tasks
Published 2019-07-31
URL https://arxiv.org/abs/1907.13308v2
PDF https://arxiv.org/pdf/1907.13308v2.pdf
PWC https://paperswithcode.com/paper/a-comparative-study-of-general-fuzzy-min-max
Repo
Framework
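
For readers unfamiliar with hyperbox fuzzy sets, the following sketch shows how membership of a point in a single hyperbox can be computed. The abstract does not state the formula; the ramp-based form used here is an assumption based on a commonly cited GFMM membership function, restricted to point inputs. The names `ramp`, `hyperbox_membership`, and the sensitivity parameter `gamma` are illustrative.

```python
import numpy as np


def ramp(r, gamma):
    """Ramp threshold: 0 for r*gamma < 0, r*gamma in [0, 1], 1 above."""
    return np.clip(r * gamma, 0.0, 1.0)


def hyperbox_membership(x, v, w, gamma=1.0):
    """Membership of point x in the hyperbox with min corner v, max corner w.

    Assumed form: the worst per-dimension violation decides the membership.
    """
    x, v, w = (np.asarray(a, dtype=float) for a in (x, v, w))
    above = 1.0 - ramp(x - w, gamma)  # penalty for exceeding the max corner
    below = 1.0 - ramp(v - x, gamma)  # penalty for falling under the min corner
    return float(np.min(np.minimum(above, below)))


v, w = [0.2, 0.2], [0.4, 0.4]
inside = hyperbox_membership([0.3, 0.3], v, w)   # point inside the box
outside = hyperbox_membership([0.9, 0.3], v, w)  # 0.5 beyond the max corner
```

Under this form, points inside the box get full membership and membership decays linearly with distance outside it, at a rate set by `gamma`.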

Conclusion-Supplement Answer Generation for Non-Factoid Questions

Title Conclusion-Supplement Answer Generation for Non-Factoid Questions
Authors Makoto Nakatsuji, Sohei Okui
Abstract This paper tackles the goal of conclusion-supplement answer generation for non-factoid questions, which is a critical issue in the field of Natural Language Processing (NLP) and Artificial Intelligence (AI), as users often require supplementary information before accepting a conclusion. The current encoder-decoder framework, however, has difficulty generating such answers, since it may become confused when it tries to learn several different long answers to the same non-factoid question. Our solution, called an ensemble network, goes beyond single short sentences and fuses logically connected conclusion statements and supplementary statements. It extracts the context from the conclusion decoder’s output sequence and uses it to create supplementary decoder states on the basis of an attention mechanism. It also assesses the closeness of the question encoder’s output sequence and the separate outputs of the conclusion and supplement decoders as well as their combination. As a result, it generates answers that match the questions and have natural-sounding supplementary sequences in line with the context expressed by the conclusion sequence. Evaluations conducted on datasets including “Love Advice” and “Arts & Humanities” categories indicate that our model outputs much more accurate results than the tested baseline models do.
Tasks
Published 2019-11-25
URL https://arxiv.org/abs/1912.00864v1
PDF https://arxiv.org/pdf/1912.00864v1.pdf
PWC https://paperswithcode.com/paper/conclusion-supplement-answer-generation-for
Repo
Framework

Task-Based Learning

Title Task-Based Learning
Authors Di Chen, Yada Zhu, Xiaodong Cui, Carla P. Gomes
Abstract This paper talks about task-based learning.
Tasks Decision Making
Published 2019-10-17
URL https://arxiv.org/abs/1910.09357v3
PDF https://arxiv.org/pdf/1910.09357v3.pdf
PWC https://paperswithcode.com/paper/task-based-learning-via-task-oriented
Repo
Framework

Learning Disentangled Representations of Timbre and Pitch for Musical Instrument Sounds Using Gaussian Mixture Variational Autoencoders

Title Learning Disentangled Representations of Timbre and Pitch for Musical Instrument Sounds Using Gaussian Mixture Variational Autoencoders
Authors Yin-Jyun Luo, Kat Agres, Dorien Herremans
Abstract In this paper, we learn disentangled representations of timbre and pitch for musical instrument sounds. We adapt a framework based on variational autoencoders with Gaussian mixture latent distributions. Specifically, we use two separate encoders to learn distinct latent spaces for timbre and pitch, which form Gaussian mixture components representing instrument identity and pitch, respectively. For reconstruction, latent variables of timbre and pitch are sampled from corresponding mixture components, and are concatenated as the input to a decoder. We show the model's efficacy by latent space visualization, and a quantitative analysis indicates the discriminability of these spaces, even with a limited number of instrument labels for training. The model allows for controllable synthesis of selected instrument sounds by sampling from the latent spaces. To evaluate this, we trained instrument and pitch classifiers using original labeled data. These classifiers achieve high accuracy when tested on our synthesized sounds, which verifies the model's performance in controllable, realistic timbre and pitch synthesis. Our model also enables timbre transfer between multiple instruments, with a single autoencoder architecture, which is evaluated by measuring the shift in posterior of instrument classification. Our in-depth evaluation confirms the model's ability to successfully disentangle timbre and pitch.
Tasks
Published 2019-06-19
URL https://arxiv.org/abs/1906.08152v2
PDF https://arxiv.org/pdf/1906.08152v2.pdf
PWC https://paperswithcode.com/paper/learning-disentangled-representations-of-3
Repo
Framework

Testing Markov Chains without Hitting

Title Testing Markov Chains without Hitting
Authors Yeshwanth Cherapanamjeri, Peter L. Bartlett
Abstract We study the problem of identity testing of Markov chains. In this setting, we are given access to a single trajectory from a Markov chain with unknown transition matrix $Q$ and the goal is to determine whether $Q = P$ for some known matrix $P$ or $\text{Dist}(P, Q) \geq \epsilon$ where $\text{Dist}$ is suitably defined. In recent work by Daskalakis, Dikkala and Gravin, 2018, it was shown that it is possible to distinguish between the two cases provided the length of the observed trajectory is at least super-linear in the hitting time of $P$, which may be arbitrarily large. In this paper, we propose an algorithm that avoids this dependence on hitting time, thus enabling efficient testing of Markov chains even in cases where it is infeasible to observe every state in the chain. Our algorithm is based on combining classical ideas from approximation algorithms with techniques for the spectral analysis of Markov chains.
Tasks
Published 2019-02-06
URL http://arxiv.org/abs/1902.01999v1
PDF http://arxiv.org/pdf/1902.01999v1.pdf
PWC https://paperswithcode.com/paper/testing-markov-chains-without-hitting
Repo
Framework
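
The problem setting (one observed trajectory, a known reference matrix $P$) can be illustrated with a naive plug-in estimate. This sketch is not the paper's algorithm and makes no claim about its sample complexity: it simply counts observed transitions, which only says something about states the trajectory actually visits. The two-state chain, the trajectory length, and the function names are assumptions.

```python
import numpy as np


def sample_trajectory(P, length, rng, start=0):
    """Sample a single trajectory of the given length from transition matrix P."""
    states = np.empty(length, dtype=int)
    states[0] = start
    for t in range(1, length):
        states[t] = rng.choice(P.shape[0], p=P[states[t - 1]])
    return states


def empirical_transition_matrix(states, n_states):
    """Row-normalized transition counts; rows of unvisited states stay zero."""
    counts = np.zeros((n_states, n_states))
    np.add.at(counts, (states[:-1], states[1:]), 1)
    row_sums = counts.sum(axis=1, keepdims=True)
    return np.divide(counts, row_sums,
                     out=np.zeros_like(counts), where=row_sums > 0)


rng = np.random.default_rng(0)
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])  # known reference chain (toy example)

trajectory = sample_trajectory(P, 5000, rng)
Q_hat = empirical_transition_matrix(trajectory, P.shape[0])
max_deviation = np.abs(Q_hat - P).max()
```

A counting estimate like this needs every state to appear often enough in the trajectory, which is the kind of requirement the abstract describes as infeasible for some chains.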

Deep Radar Waveform Design for Efficient Automotive Radar Sensing

Title Deep Radar Waveform Design for Efficient Automotive Radar Sensing
Authors Shahin Khobahi, Arindam Bose, Mojtaba Soltanalian
Abstract In radar systems, unimodular (or constant-modulus) waveform design plays an important role in achieving better clutter/interference rejection, as well as a more accurate estimation of the target parameters. The design of such sequences has been studied widely in the last few decades, with most design algorithms requiring sophisticated a priori knowledge of environmental parameters which may be difficult to obtain in real-time scenarios. In this paper, we propose a novel hybrid model-driven and data-driven architecture that adapts to the ever-changing environment and allows for adaptive unimodular waveform design. In particular, the approach lays the groundwork for developing extremely low-cost waveform design and processing frameworks for radar systems deployed in autonomous vehicles. The proposed model-based deep architecture imitates a well-known unimodular signal design algorithm in its structure, and can quickly infer statistical information from the environment using the observed data. Our numerical experiments portray the advantages of using the proposed method for efficient radar waveform design in time-varying environments.
Tasks Autonomous Vehicles
Published 2019-12-17
URL https://arxiv.org/abs/1912.08180v2
PDF https://arxiv.org/pdf/1912.08180v2.pdf
PWC https://paperswithcode.com/paper/deep-radar-waveform-design-for-efficient
Repo
Framework

Assessing the Applicability of Authorship Verification Methods

Title Assessing the Applicability of Authorship Verification Methods
Authors Oren Halvani, Christian Winter, Lukas Graner
Abstract Authorship verification (AV) is a research subject in the field of digital text forensics that concerns itself with the question of whether two documents have been written by the same person. During the past two decades, an increasing number of AV approaches have been proposed. However, a closer look at the respective studies reveals that the underlying characteristics of these methods are rarely addressed, which raises doubts regarding their applicability in real forensic settings. The objective of this paper is to fill this gap by proposing clear criteria and properties that aim to improve the characterization of existing and future AV approaches. Based on these properties, we conduct three experiments using 12 existing AV approaches, including the current state of the art. The examined methods were trained, optimized and evaluated on three self-compiled corpora, where each corpus focuses on a different aspect of applicability. Our results indicate that some of the methods are able to cope with very challenging verification cases such as 250-character informal chat conversations (72.7% accuracy) or cases in which two scientific documents were written at different times with an average difference of 15.6 years (> 75% accuracy). However, we also identified that all involved methods are prone to cross-topic verification cases.
Tasks
Published 2019-06-24
URL https://arxiv.org/abs/1906.10551v1
PDF https://arxiv.org/pdf/1906.10551v1.pdf
PWC https://paperswithcode.com/paper/assessing-the-applicability-of-authorship
Repo
Framework

A Dataset for measuring reading levels in India at scale

Title A Dataset for measuring reading levels in India at scale
Authors Dolly Agarwal, Jayant Gupchup, Nishant Baghel
Abstract One out of four children in India is leaving grade eight without basic reading skills. Measuring the reading levels in a vast country like India poses significant hurdles. Recent advances in machine learning open up the possibility of automating this task. However, datasets of children’s speech are not only rare but are also primarily in English. To solve this assessment problem and advance deep learning research in regional Indian languages, we present the ASER dataset of children in the age group of 6-14. The dataset consists of 5,301 subjects generating 81,330 labeled audio clips in Hindi, Marathi and English. These labels represent expert opinions on the child’s ability to read at a specified level. Using this dataset, we built a simple ASR-based classifier. Early results indicate that we can achieve a prediction accuracy of 86% for the English language. Considering the ASER survey spans half a million subjects, this dataset can grow to those scales.
Tasks
Published 2019-11-27
URL https://arxiv.org/abs/1912.04381v2
PDF https://arxiv.org/pdf/1912.04381v2.pdf
PWC https://paperswithcode.com/paper/a-dataset-for-measuring-reading-levels-in
Repo
Framework

Transform the Set: Memory Attentive Generation of Guided and Unguided Image Collages

Title Transform the Set: Memory Attentive Generation of Guided and Unguided Image Collages
Authors Nikolay Jetchev, Urs Bergmann, Gökhan Yildirim
Abstract Cutting and pasting image segments feels intuitive: the choice of source templates gives artists flexibility in recombining existing source material. Formally, this process takes an image set as input and outputs a collage of the set elements. Such selection from sets of source templates does not fit easily in classical convolutional neural models requiring inputs of fixed size. Inspired by advances in attention and set-input machine learning, we present a novel architecture that can generate in one forward pass image collages of source templates using set-structured representations. This paper has the following contributions: (i) a novel framework for image generation called Memory Attentive Generation of Image Collages (MAGIC) which gives artists new ways to create digital collages; (ii) from the machine-learning perspective, we show a novel Generative Adversarial Networks (GAN) architecture that uses Set-Transformer layers and set-pooling to blend sets of random image samples - a hybrid non-parametric approach.
Tasks Image Generation
Published 2019-10-16
URL https://arxiv.org/abs/1910.07236v2
PDF https://arxiv.org/pdf/1910.07236v2.pdf
PWC https://paperswithcode.com/paper/transform-the-set-memory-attentive-generation
Repo
Framework