April 2, 2020

Paper Group ANR 130

A Benchmark for LiDAR-based Panoptic Segmentation based on KITTI. Deep Learning for MIR Tutorial. Cost-aware Bayesian Optimization. A Bounded Measure for Estimating the Benefit of Visualization. Dialectal Layers in West Iranian: a Hierarchical Dirichlet Process Approach to Linguistic Relationships. Learning Shape Representations for Clothing Variat …

A Benchmark for LiDAR-based Panoptic Segmentation based on KITTI


Title	A Benchmark for LiDAR-based Panoptic Segmentation based on KITTI
Authors	Jens Behley, Andres Milioto, Cyrill Stachniss
Abstract	Panoptic segmentation is the recently introduced task that tackles semantic segmentation and instance segmentation jointly. In this paper, we present an extension of SemanticKITTI, which is a large-scale dataset providing dense point-wise semantic labels for all sequences of the KITTI Odometry Benchmark, for training and evaluation of laser-based panoptic segmentation. We provide the data and discuss the processing steps needed to enrich a given semantic annotation with temporally consistent instance information, i.e., instance information that supplements the semantic labels and identifies the same instance over sequences of LiDAR point clouds. Additionally, we present two strong baselines that combine state-of-the-art LiDAR-based semantic segmentation approaches with a state-of-the-art detector enriching the segmentation with instance information and that allow other researchers to compare their approaches against. We hope that our extension of SemanticKITTI with strong baselines enables the creation of novel algorithms for LiDAR-based panoptic segmentation as much as it has for the original semantic segmentation and semantic scene completion tasks. Data, code, and an online evaluation using a hidden test set will be published on http://semantic-kitti.org.
Tasks	Instance Segmentation, Panoptic Segmentation, Semantic Segmentation
Published	2020-03-04
URL	https://arxiv.org/abs/2003.02371v1
PDF	https://arxiv.org/pdf/2003.02371v1.pdf
PWC	https://paperswithcode.com/paper/a-benchmark-for-lidar-based-panoptic
Repo
Framework
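
The abstract above describes instance information that supplements per-point semantic labels. A minimal sketch of how such combined labels could be unpacked, assuming each point's label is one 32-bit integer with the semantic class in the lower 16 bits and the instance id in the upper 16 bits. This packing, the class ids, and the function names are assumptions for illustration, not taken from the abstract.

```python
from collections import defaultdict


def split_label(packed):
    """Split one packed 32-bit label into (semantic_class, instance_id)."""
    return packed & 0xFFFF, packed >> 16


def group_instances(packed_labels):
    """Map (semantic_class, instance_id) -> list of point indices.

    Instance id 0 is treated as "no instance" (stuff classes); this is an
    assumption of the sketch.
    """
    groups = defaultdict(list)
    for idx, packed in enumerate(packed_labels):
        sem, inst = split_label(packed)
        if inst != 0:
            groups[(sem, inst)].append(idx)
    return dict(groups)


# Hypothetical scan: two points of car #1, one point of car #2, one road point.
CAR, ROAD = 10, 40  # illustrative class ids
scan = [(1 << 16) | CAR, (1 << 16) | CAR, (2 << 16) | CAR, ROAD]
instances = group_instances(scan)
```

If the same instance id is kept for an object across consecutive scans, grouping by `(semantic_class, instance_id)` in each scan gives the temporally consistent view the abstract refers to.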

Deep Learning for MIR Tutorial


Title	Deep Learning for MIR Tutorial
Authors	Alexander Schindler, Thomas Lidy, Sebastian Böck
Abstract	Deep Learning has become state of the art in visual computing and continuously emerges into the Music Information Retrieval (MIR) and audio retrieval domain. In order to bring attention to this topic we propose an introductory tutorial on deep learning for MIR. Besides a general introduction to neural networks, the proposed tutorial covers a wide range of MIR relevant deep learning approaches. Convolutional Neural Networks are currently a de-facto standard for deep learning based audio retrieval. Recurrent Neural Networks have proven to be effective in onset detection tasks such as beat or audio-event detection. Siamese Networks have been shown effective in learning audio representations and distance functions specific for music similarity retrieval. We will incorporate both academic and industrial points of view into the tutorial. Accompanying the tutorial, we will create a GitHub repository for the content presented at the tutorial as well as references to state of the art work and literature for further reading. This repository will remain public after the conference.
Tasks	Information Retrieval, Music Information Retrieval
Published	2020-01-15
URL	https://arxiv.org/abs/2001.05266v1
PDF	https://arxiv.org/pdf/2001.05266v1.pdf
PWC	https://paperswithcode.com/paper/deep-learning-for-mir-tutorial
Repo
Framework

Cost-aware Bayesian Optimization


Title	Cost-aware Bayesian Optimization
Authors	Eric Hans Lee, Valerio Perrone, Cedric Archambeau, Matthias Seeger
Abstract	Bayesian optimization (BO) is a class of global optimization algorithms, suitable for minimizing an expensive objective function in as few function evaluations as possible. While BO budgets are typically given in iterations, this implicitly measures convergence in terms of iteration count and assumes each evaluation has identical cost. In practice, evaluation costs may vary in different regions of the search space. For example, the cost of neural network training increases quadratically with layer size, which is a typical hyperparameter. Cost-aware BO measures convergence with alternative cost metrics such as time, energy, or money, for which vanilla BO methods are unsuited. We introduce Cost Apportioned BO (CArBO), which attempts to minimize an objective function in as little cost as possible. CArBO combines a cost-effective initial design with a cost-cooled optimization phase which depreciates a learned cost model as iterations proceed. On a set of 20 black-box function optimization problems we show that, given the same cost budget, CArBO finds significantly better hyperparameter configurations than competing methods.
Tasks
Published	2020-03-22
URL	https://arxiv.org/abs/2003.10870v1
PDF	https://arxiv.org/pdf/2003.10870v1.pdf
PWC	https://paperswithcode.com/paper/cost-aware-bayesian-optimization
Repo
Framework
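
The cost-cooled phase described in the abstract above can be sketched as an acquisition function that divides expected improvement by the predicted cost raised to an exponent that shrinks as the budget is spent. This is only an illustration under stated assumptions: the linear cooling schedule, the function names, and the example numbers are hypothetical and are not taken from the abstract.

```python
import math


def expected_improvement(mu, sigma, best):
    """Expected improvement for minimization under a Gaussian posterior N(mu, sigma^2)."""
    if sigma <= 0.0:
        return max(best - mu, 0.0)
    z = (best - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (best - mu) * cdf + sigma * pdf


def cost_cooled_ei(mu, sigma, best, cost, spent, budget):
    """EI divided by cost**alpha, where alpha decays from 1 to 0 as budget is used.

    The linear schedule for alpha is an assumption of this sketch.
    """
    alpha = max(0.0, (budget - spent) / budget)
    return expected_improvement(mu, sigma, best) / (cost ** alpha)


# Hypothetical candidates: a cheap one and a more promising but expensive one.
best_so_far = 0.6
cheap = dict(mu=0.5, sigma=0.2, cost=1.0)
pricey = dict(mu=0.4, sigma=0.2, cost=10.0)

early = {name: cost_cooled_ei(best=best_so_far, spent=0.0, budget=100.0, **c)
         for name, c in (("cheap", cheap), ("pricey", pricey))}
late = {name: cost_cooled_ei(best=best_so_far, spent=100.0, budget=100.0, **c)
        for name, c in (("cheap", cheap), ("pricey", pricey))}
```

With these numbers the cheap candidate scores higher at the start of the budget, and the expensive one scores higher once the cost term has fully cooled and the score reduces to plain expected improvement.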

A Bounded Measure for Estimating the Benefit of Visualization


Title	A Bounded Measure for Estimating the Benefit of Visualization
Authors	Min Chen, Mateu Sbert, Alfie Abdul-Rahman, Deborah Silver
Abstract	Information theory can be used to analyze the cost-benefit of visualization processes. However, the current measure of benefit contains an unbounded term that is neither easy to estimate nor intuitive to interpret. In this work, we propose to revise the existing cost-benefit measure by replacing the unbounded term with a bounded one. We examine a number of bounded measures that include the Jensen-Shannon divergence and a new divergence measure formulated as part of this work. We use visual analysis to support the multi-criteria comparison, enabling the selection of the most logical and intuitive option. We applied the revised cost-benefit measure to two case studies, demonstrating its uses in practical scenarios, while the collected real world data further informs the selection of a bounded measure.
Tasks
Published	2020-02-12
URL	https://arxiv.org/abs/2002.05282v1
PDF	https://arxiv.org/pdf/2002.05282v1.pdf
PWC	https://paperswithcode.com/paper/a-bounded-measure-for-estimating-the-benefit
Repo
Framework
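
To illustrate why a bounded divergence is easier to interpret, the sketch below compares the Jensen-Shannon divergence named in the abstract above with the Kullback-Leibler divergence. KL is used here only as a familiar example of an unbounded divergence; the abstract does not name the unbounded term. With base-2 logarithms the Jensen-Shannon divergence always lies in [0, 1].

```python
import math


def kl(p, q):
    """Kullback-Leibler divergence in bits; infinite if q is zero where p is positive."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0.0:
            if qi == 0.0:
                return math.inf
            total += pi * math.log2(pi / qi)
    return total


def js(p, q):
    """Jensen-Shannon divergence in bits: average KL of p and q to their mixture."""
    m = [(pi + qi) / 2.0 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Two distributions with disjoint support: the extreme case.
p = [1.0, 0.0]
q = [0.0, 1.0]
kl_value = kl(p, q)   # unbounded: infinite here
js_value = js(p, q)   # bounded: reaches its maximum of 1 bit
```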

Dialectal Layers in West Iranian: a Hierarchical Dirichlet Process Approach to Linguistic Relationships


Title	Dialectal Layers in West Iranian: a Hierarchical Dirichlet Process Approach to Linguistic Relationships
Authors	Chundra Aroor Cathcart
Abstract	This paper addresses a series of complex and unresolved issues in the historical phonology of West Iranian languages. The West Iranian languages (Persian, Kurdish, Balochi, and other languages) display a high degree of non-Lautgesetzlich behavior. Most of this irregularity is undoubtedly due to language contact; we argue, however, that an oversimplified view of the processes at work has prevailed in the literature on West Iranian dialectology, with specialists assuming that deviations from an expected outcome in a given non-Persian language are due to lexical borrowing from some chronological stage of Persian. It is demonstrated that this qualitative approach yields at times problematic conclusions stemming from the lack of explicit probabilistic inferences regarding the distribution of the data: Persian may not be the sole donor language; additionally, borrowing at the lexical level is not always the mechanism that introduces irregularity. In many cases, the possibility that West Iranian languages show different reflexes in different conditioning environments remains under-explored. We employ a novel Bayesian approach designed to overcome these problems and tease apart the different determinants of irregularity in patterns of West Iranian sound change. Our methodology allows us to provisionally resolve a number of outstanding questions in the literature on West Iranian dialectology concerning the dialectal affiliation of certain sound changes. We outline future directions for work of this sort.
Tasks
Published	2020-01-13
URL	https://arxiv.org/abs/2001.05297v1
PDF	https://arxiv.org/pdf/2001.05297v1.pdf
PWC	https://paperswithcode.com/paper/dialectal-layers-in-west-iranian-a
Repo
Framework

Learning Shape Representations for Clothing Variations in Person Re-Identification


Title	Learning Shape Representations for Clothing Variations in Person Re-Identification
Authors	Yu-Jhe Li, Zhengyi Luo, Xinshuo Weng, Kris M. Kitani
Abstract	Person re-identification (re-ID) aims to recognize instances of the same person contained in multiple images taken across different cameras. Existing methods for re-ID tend to rely heavily on the assumption that both query and gallery images of the same person have the same clothing. Unfortunately, this assumption may not hold for datasets captured over long periods of time (e.g., weeks, months or years). To tackle the re-ID problem in the context of clothing changes, we propose a novel representation learning model which is able to generate a body shape feature representation without being affected by clothing color or patterns. We call our model the Color Agnostic Shape Extraction Network (CASE-Net). CASE-Net learns a representation of identity that depends only on body shape via adversarial learning and feature disentanglement. Due to the lack of large-scale re-ID datasets which contain clothing changes for the same person, we propose two synthetic datasets for evaluation. We create a rendered dataset SMPL-reID with different clothes patterns and a synthesized dataset Div-Market with different clothing color to simulate two types of clothing changes. The quantitative and qualitative results across 5 datasets (SMPL-reID, Div-Market, two benchmark re-ID datasets, a cross-modality re-ID dataset) confirm the robustness and superiority of our approach against several state-of-the-art approaches.
Tasks	Person Re-Identification, Representation Learning
Published	2020-03-16
URL	https://arxiv.org/abs/2003.07340v1
PDF	https://arxiv.org/pdf/2003.07340v1.pdf
PWC	https://paperswithcode.com/paper/learning-shape-representations-for-clothing
Repo
Framework

Certification of Semantic Perturbations via Randomized Smoothing


Title	Certification of Semantic Perturbations via Randomized Smoothing
Authors	Marc Fischer, Maximilian Baader, Martin Vechev
Abstract	We introduce a novel certification method for parametrized perturbations by generalizing randomized smoothing. Using this method, we construct a provable classifier that can establish state-of-the-art robustness against semantic perturbations including geometric transformations (e.g., rotation, translation), for different types of interpolation, and, for the first time, volume changes on audio data. Our experimental results indicate that the method is practically effective: for ResNet-50 on ImageNet, it achieves rotational robustness provable up to $\pm 30^\circ$ for 28% of images.
Tasks
Published	2020-02-27
URL	https://arxiv.org/abs/2002.12463v1
PDF	https://arxiv.org/pdf/2002.12463v1.pdf
PWC	https://paperswithcode.com/paper/certification-of-semantic-perturbations-via
Repo
Framework
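
The idea of smoothing over a perturbation parameter, as in the abstract above, can be sketched for a single scalar parameter such as a rotation angle. This is a simplified illustration, not the paper's procedure: it assumes the transformation composes additively in its parameter, ignores interpolation error, uses a Hoeffding bound for the confidence interval, and reuses the same samples for selecting and estimating the top class. `classify` and `certify_parameter` are hypothetical names.

```python
import math
import random
from statistics import NormalDist


def certify_parameter(classify, theta, sigma, n=1000, alpha=0.001, seed=0):
    """Return (top_class, radius), or (None, 0.0) when abstaining.

    classify(theta) stands for a base classifier applied to the input
    transformed with parameter theta (for example, a rotation angle).
    """
    rng = random.Random(seed)
    counts = {}
    for _ in range(n):
        label = classify(theta + rng.gauss(0.0, sigma))
        counts[label] = counts.get(label, 0) + 1
    top, hits = max(counts.items(), key=lambda kv: kv[1])
    # Hoeffding lower confidence bound on the top-class probability.
    # A rigorous procedure would use separate samples for selection and estimation.
    p_lower = hits / n - math.sqrt(math.log(1.0 / alpha) / (2.0 * n))
    if p_lower <= 0.5:
        return None, 0.0
    # Gaussian smoothing radius in parameter space (same units as sigma).
    return top, sigma * NormalDist().inv_cdf(p_lower)


# Toy base classifier: stable for angles within 45 degrees of upright.
def toy_classifier(angle):
    return "upright" if abs(angle) < 45.0 else "rotated"


label, radius = certify_parameter(toy_classifier, theta=0.0, sigma=10.0)
```

The returned radius is expressed in the units of the parameter, so for this toy classifier it is a range of rotation angles around `theta` within which the smoothed prediction is argued to stay the same, under the assumptions listed above.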

Symmetry & critical points for a model shallow neural network


Title	Symmetry & critical points for a model shallow neural network
Authors	Yossi Arjevani, Michael Field
Abstract	A detailed analysis is given of a family of critical points determining spurious minima for a model student-teacher 2-layer neural network, with ReLU activation function, and a natural $\Gamma = S_k \times S_k$-symmetry. For a $k$-neuron shallow network of this type, analytic equations are given which, for example, determine the critical points of the spurious minima described by Safran and Shamir (2018) for $6 \le k \le 20$. These critical points have isotropy (conjugate to) the diagonal subgroup $\Delta S_{k-1}\subset \Delta S_k$ of $\Gamma$. It is shown that critical points of this family can be expressed as an infinite series in $1/\sqrt{k}$ (for large enough $k$) and, as an application, the critical values decay like $a k^{-1}$, where $a \approx 0.3$. Other non-trivial families of critical points are also described with isotropy conjugate to $\Delta S_{k-1}, \Delta S_k$ and $\Delta (S_2\times S_{k-2})$ (the latter giving spurious minima for $k\ge 9$). The methods used depend on symmetry breaking, bifurcation, and algebraic geometry, notably Artin’s implicit function theorem, and are applicable to other families of critical points that occur in this network.
Tasks
Published	2020-03-23
URL	https://arxiv.org/abs/2003.10576v1
PDF	https://arxiv.org/pdf/2003.10576v1.pdf
PWC	https://paperswithcode.com/paper/symmetry-critical-points-for-a-model-shallow
Repo
Framework

Reconfigurable Intelligent Surface Assisted Multiuser MISO Systems Exploiting Deep Reinforcement Learning


Title	Reconfigurable Intelligent Surface Assisted Multiuser MISO Systems Exploiting Deep Reinforcement Learning
Authors	Chongwen Huang, Ronghong Mo, Chau Yuen
Abstract	Recently, the reconfigurable intelligent surface (RIS), benefiting from the breakthrough in the fabrication of programmable meta-material, has been speculated as one of the key enabling technologies for future sixth-generation (6G) wireless communication systems, scaled up beyond massive multiple input multiple output (Massive-MIMO) technology to achieve smart radio environments. Employed as reflecting arrays, RIS is able to assist MIMO transmissions without the need for radio frequency chains, resulting in a considerable reduction in power consumption. In this paper, we investigate the joint design of the transmit beamforming matrix at the base station and the phase shift matrix at the RIS, by leveraging recent advances in deep reinforcement learning (DRL). We first develop a DRL based algorithm, in which the joint design is obtained through trial-and-error interactions with the environment by observing predefined rewards, in the context of continuous state and action. Unlike most reported works, which use alternating optimization techniques to alternately obtain the transmit beamforming and phase shifts, the proposed DRL based algorithm obtains the joint design simultaneously as the output of the DRL neural network. Simulation results show that the proposed algorithm is not only able to learn from the environment and gradually improve its behavior, but also achieves performance comparable to two state-of-the-art benchmarks. It is also observed that appropriate neural network parameter settings significantly improve the performance and convergence rate of the proposed algorithm.
Tasks
Published	2020-02-24
URL	https://arxiv.org/abs/2002.10072v1
PDF	https://arxiv.org/pdf/2002.10072v1.pdf
PWC	https://paperswithcode.com/paper/reconfigurable-intelligent-surface-assisted
Repo
Framework

Annotation of Emotion Carriers in Personal Narratives


Title	Annotation of Emotion Carriers in Personal Narratives
Authors	Aniruddha Tammewar, Alessandra Cervone, Eva-Maria Messner, Giuseppe Riccardi
Abstract	We are interested in the problem of understanding personal narratives (PN) - spoken or written - recollections of facts, events, and thoughts. In PN, emotion carriers are the speech or text segments that best explain the emotional state of the user. Such segments may include entities, verb or noun phrases. Advanced automatic understanding of PNs requires not only the prediction of the user's emotional state but also the identification of which events (e.g. “the loss of relative” or “the visit of grandpa”) or people (e.g. “the old group of high school mates”) carry the emotion manifested during the personal recollection. This work proposes and evaluates an annotation model for identifying emotion carriers in spoken personal narratives. Compared to other text genres such as news and microblogs, spoken PNs are particularly challenging because a narrative is usually unstructured, involving multiple sub-events and characters as well as thoughts and associated emotions perceived by the narrator. In this work, we experiment with annotating emotion carriers from speech transcriptions in the Ulm State-of-Mind in Speech (USoMS) corpus, a dataset of German PNs. We believe this resource could be used for experiments in the automatic extraction of emotion carriers from PN, a task that could provide further advancements in narrative understanding.
Tasks
Published	2020-02-27
URL	https://arxiv.org/abs/2002.12196v2
PDF	https://arxiv.org/pdf/2002.12196v2.pdf
PWC	https://paperswithcode.com/paper/annotation-of-emotion-carriers-in-personal
Repo
Framework

A Set-Theoretic Study of the Relationships of Image Models and Priors for Restoration Problems


Title	A Set-Theoretic Study of the Relationships of Image Models and Priors for Restoration Problems
Authors	Bihan Wen, Yanjun Li, Yuqi Li, Yoram Bresler
Abstract	Image prior modeling is the key issue in image recovery, computational imaging, compressed sensing, and other inverse problems. Recent algorithms combining multiple effective priors, such as the sparse or low-rank models, have demonstrated superior performance in various applications. However, the relationships among the popular image models are unclear, and no theory in general is available to demonstrate their connections. In this paper, we present a theoretical analysis of the image models, including sparsity, group-wise sparsity, joint sparsity, and low-rankness, to bridge the gap between applications and image prior understanding. We systematically study how effective each image model is for image restoration. Furthermore, we relate the denoising performance improvement obtained by combining multiple models to the image model relationships. Extensive experiments are conducted to compare the denoising results, which are consistent with our analysis. On top of the model-based methods, we quantitatively demonstrate the image properties that are implicitly exploited by deep learning methods, which can further boost the denoising performance when combined with complementary image models.
Tasks	Denoising, Image Restoration
Published	2020-03-29
URL	https://arxiv.org/abs/2003.12985v1
PDF	https://arxiv.org/pdf/2003.12985v1.pdf
PWC	https://paperswithcode.com/paper/a-set-theoretic-study-of-the-relationships-of
Repo
Framework

Top Comment or Flop Comment? Predicting and Explaining User Engagement in Online News Discussions


Title	Top Comment or Flop Comment? Predicting and Explaining User Engagement in Online News Discussions
Authors	Julian Risch, Ralf Krestel
Abstract	Comment sections below online news articles enjoy growing popularity among readers. However, the overwhelming number of comments makes it infeasible for the average news consumer to read all of them and hinders engaging discussions. Most platforms display comments in chronological order, which neglects that some of them are more relevant to users and are better conversation starters. In this paper, we systematically analyze user engagement in the form of the upvotes and replies that a comment receives. Based on comment texts, we train a model to distinguish comments that have either a high or low chance of receiving many upvotes and replies. Our evaluation on user comments from TheGuardian.com compares recurrent and convolutional neural network models, and a traditional feature-based classifier. Further, we investigate what makes some comments more engaging than others. To this end, we identify engagement triggers and arrange them in a taxonomy. Explanation methods for neural networks reveal which input words have the strongest influence on our model’s predictions. In addition, we evaluate on a dataset of product reviews, which exhibit properties similar to those of user comments, such as featuring upvotes for helpfulness.
Tasks
Published	2020-03-26
URL	https://arxiv.org/abs/2003.11949v1
PDF	https://arxiv.org/pdf/2003.11949v1.pdf
PWC	https://paperswithcode.com/paper/top-comment-or-flop-comment-predicting-and
Repo
Framework

Modeling Product Search Relevance in e-Commerce


Title	Modeling Product Search Relevance in e-Commerce
Authors	Rahul Radhakrishnan Iyer, Rohan Kohli, Shrimai Prabhumoye
Abstract	With the rapid growth of e-Commerce, online product search has emerged as a popular and effective paradigm for customers to find desired products and engage in online shopping. However, there is still a big gap between the products that customers really desire to purchase and relevance of products that are suggested in response to a query from the customer. In this paper, we propose a robust way of predicting relevance scores given a search query and a product, using techniques involving machine learning, natural language processing and information retrieval. We compare conventional information retrieval models such as BM25 and Indri with deep learning models such as word2vec, sentence2vec and paragraph2vec. We share some of our insights and findings from our experiments.
Tasks	Information Retrieval
Published	2020-01-14
URL	https://arxiv.org/abs/2001.04980v1
PDF	https://arxiv.org/pdf/2001.04980v1.pdf
PWC	https://paperswithcode.com/paper/modeling-product-search-relevance-in-e
Repo
Framework
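
BM25, one of the conventional retrieval baselines named in the abstract above, scores a product text against a query from term frequencies and document lengths. A minimal sketch, assuming whitespace tokenization, a smoothed non-negative inverse document frequency, and the commonly used defaults `k1=1.5` and `b=0.75`. The product titles and the function name are hypothetical, and nothing here reproduces the paper's setup.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score every document in docs against query with Okapi BM25."""
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(d) for d in tokenized) / n_docs
    # Number of documents that contain each term at least once.
    doc_freq = Counter(term for d in tokenized for term in set(d))
    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for term in query.lower().split():
            if tf[term] == 0:
                continue
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1.0 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1.0) / norm
        scores.append(score)
    return scores


# Hypothetical product titles.
docs = ["red running shoes", "blue running jacket", "leather office chair"]
scores = bm25_scores("red shoes", docs)
```

Only the first title shares terms with the query, so it is the only one with a non-zero score; embedding-based models such as those compared in the abstract can also give credit to related but non-identical words.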

Foundations of Explainable Knowledge-Enabled Systems


Title	Foundations of Explainable Knowledge-Enabled Systems
Authors	Shruthi Chari, Daniel M. Gruen, Oshani Seneviratne, Deborah L. McGuinness
Abstract	Explainability has been an important goal since the early days of Artificial Intelligence. Several approaches for producing explanations have been developed. However, many of these approaches were tightly coupled with the capabilities of the artificial intelligence systems at the time. With the proliferation of AI-enabled systems in sometimes critical settings, there is a need for them to be explainable to end-users and decision-makers. We present a historical overview of explainable artificial intelligence systems, with a focus on knowledge-enabled systems, spanning the expert systems, cognitive assistants, semantic applications, and machine learning domains. Additionally, borrowing from the strengths of past approaches and identifying gaps needed to make explanations user- and context-focused, we propose new definitions for explanations and explainable knowledge-enabled systems.
Tasks
Published	2020-03-17
URL	https://arxiv.org/abs/2003.07520v1
PDF	https://arxiv.org/pdf/2003.07520v1.pdf
PWC	https://paperswithcode.com/paper/foundations-of-explainable-knowledge-enabled
Repo
Framework

Lattice-based Improvements for Voice Triggering Using Graph Neural Networks


Title	Lattice-based Improvements for Voice Triggering Using Graph Neural Networks
Authors	Pranay Dighe, Saurabh Adya, Nuoyu Li, Srikanth Vishnubhotla, Devang Naik, Adithya Sagar, Ying Ma, Stephen Pulman, Jason Williams
Abstract	Voice-triggered smart assistants often rely on detection of a trigger-phrase before they start listening for the user request. Mitigation of false triggers is an important aspect of building a privacy-centric non-intrusive smart assistant. In this paper, we address the task of false trigger mitigation (FTM) using a novel approach based on analyzing automatic speech recognition (ASR) lattices using graph neural networks (GNN). The proposed approach uses the fact that decoding lattice of a falsely triggered audio exhibits uncertainties in terms of many alternative paths and unexpected words on the lattice arcs as compared to the lattice of a correctly triggered audio. A pure trigger-phrase detector model doesn’t fully utilize the intent of the user speech whereas by using the complete decoding lattice of user audio, we can effectively mitigate speech not intended for the smart assistant. We deploy two variants of GNNs in this paper based on 1) graph convolution layers and 2) self-attention mechanism respectively. Our experiments demonstrate that GNNs are highly accurate in FTM task by mitigating ~87% of false triggers at 99% true positive rate (TPR). Furthermore, the proposed models are fast to train and efficient in parameter requirements.
Tasks	Speech Recognition
Published	2020-01-25
URL	https://arxiv.org/abs/2001.10822v1
PDF	https://arxiv.org/pdf/2001.10822v1.pdf
PWC	https://paperswithcode.com/paper/lattice-based-improvements-for-voice
Repo
Framework