April 1, 2020

# Paper Group ANR 491

A Free-Energy Principle for Representation Learning. Video Monitoring Queries. On the Inductive Bias of a CNN for Orthogonal Patterns Distributions. Cross-dataset Training for Class Increasing Object Detection. Explicit Regularization of Stochastic Gradient Methods through Duality. Model-Based Machine Learning for Joint Digital Backpropagation and …

#### A Free-Energy Principle for Representation Learning

Title A Free-Energy Principle for Representation Learning
Authors Yansong Gao, Pratik Chaudhari
Abstract This paper employs a formal connection of machine learning with thermodynamics to characterize the quality of learnt representations for transfer learning. We discuss how information-theoretic functionals such as rate, distortion and classification loss of a model lie on a convex, so-called equilibrium surface. We prescribe dynamical processes to traverse this surface under constraints, e.g., an iso-classification process that trades off rate and distortion to keep the classification loss unchanged. We demonstrate how this process can be used for transferring representations from a source dataset to a target dataset while keeping the classification loss constant. Experimental validation of the theoretical results is provided on standard image-classification datasets.
Tasks Image Classification, Representation Learning, Transfer Learning
Published 2020-02-27
URL https://arxiv.org/abs/2002.12406v1
PDF https://arxiv.org/pdf/2002.12406v1.pdf
PWC https://paperswithcode.com/paper/a-free-energy-principle-for-representation
Repo
Framework

#### Video Monitoring Queries

Title Video Monitoring Queries
Authors Nick Koudas, Raymond Li, Ioannis Xarchakos
Abstract Recent advances in video processing utilizing deep learning primitives have achieved breakthroughs in fundamental problems in video analysis, such as frame classification and object detection, enabling an array of new applications. In this paper we study the problem of interactive declarative query processing on video streams. In particular, we introduce a set of approximate filters to speed up queries that involve objects of a specific type (e.g., cars, trucks, etc.) on video frames with associated spatial relationships among them (e.g., car left of truck). The resulting filters can quickly assess whether the query predicates are true, so that the frame is either passed on for further analysis or discarded, avoiding costly object detection operations. We propose two classes of filters, $IC$ and $OD$, that adapt principles from deep image classification and object detection. The filters utilize extensible deep neural architectures and are easy to deploy and utilize. In addition, we propose statistical query processing techniques to process aggregate queries involving objects with spatial constraints on video streams and demonstrate experimentally the increased accuracy of the resulting aggregate estimation. Combined, these techniques constitute a robust set of video monitoring query processing techniques. We demonstrate that the application of the proposed techniques in conjunction with declarative queries on video streams can dramatically increase the frame processing rate and speed up query processing by at least two orders of magnitude. We present the results of a thorough experimental study utilizing benchmark video data sets at scale, demonstrating the performance benefits and the practical relevance of our proposals.
Published 2020-02-24
URL https://arxiv.org/abs/2002.10537v1
PDF https://arxiv.org/pdf/2002.10537v1.pdf
PWC https://paperswithcode.com/paper/video-monitoring-queries
Repo
Framework

#### On the Inductive Bias of a CNN for Orthogonal Patterns Distributions

Title On the Inductive Bias of a CNN for Orthogonal Patterns Distributions
Authors Alon Brutzkus, Amir Globerson
Abstract Training overparameterized convolutional neural networks with gradient based methods is the most successful learning method for image classification. However, its theoretical properties are far from understood even for very simple learning tasks. In this work, we consider a simplified image classification task where images contain orthogonal patches and are learned with a 3-layer overparameterized convolutional network and stochastic gradient descent. We empirically identify a novel phenomenon where the dot-products between the learned pattern detectors and their detected patterns are governed by the pattern statistics in the training set. We call this phenomenon Pattern Statistics Inductive Bias (PSI) and prove that PSI holds for a simple setup with two points in the training set. Furthermore, we prove that if PSI holds, stochastic gradient descent has sample complexity $O(d^2\log(d))$ where $d$ is the filter dimension. In contrast, we show a VC dimension lower bound in our setting which is exponential in $d$. Taken together, our results provide strong evidence that PSI is a unique inductive bias of stochastic gradient descent that guarantees good generalization properties.
Published 2020-02-22
URL https://arxiv.org/abs/2002.09781v1
PDF https://arxiv.org/pdf/2002.09781v1.pdf
PWC https://paperswithcode.com/paper/on-the-inductive-bias-of-a-cnn-for-orthogonal
Repo
Framework

#### Cross-dataset Training for Class Increasing Object Detection

Title Cross-dataset Training for Class Increasing Object Detection
Authors Yongqiang Yao, Yan Wang, Yu Guo, Jiaojiao Lin, Hongwei Qin, Junjie Yan
Abstract We present a conceptually simple, flexible and general framework for cross-dataset training in object detection. Given two or more already labeled datasets that target different object classes, cross-dataset training aims to detect the union of the different classes, so that we do not have to label all the classes for all the datasets. By cross-dataset training, existing datasets can be utilized to detect the merged object classes with a single model. Furthermore, in industrial applications, the object classes usually increase on demand, so when adding new classes, it is quite time-consuming to label the new classes on all the existing datasets. With cross-dataset training, we only need to label the new classes on the new dataset. We experiment on PASCAL VOC, COCO, WIDER FACE and WIDER Pedestrian with both solo and cross-dataset settings. Results show that our cross-dataset pipeline can achieve similarly impressive performance simultaneously on these datasets compared with training independently.
Published 2020-01-14
URL https://arxiv.org/abs/2001.04621v1
PDF https://arxiv.org/pdf/2001.04621v1.pdf
PWC https://paperswithcode.com/paper/cross-dataset-training-for-class-increasing
Repo
Framework

#### Explicit Regularization of Stochastic Gradient Methods through Duality

Title Explicit Regularization of Stochastic Gradient Methods through Duality
Authors Anant Raj, Francis Bach
Abstract We consider stochastic gradient methods under the interpolation regime where a perfect fit can be obtained (minimum loss at each observation). While previous work highlighted the implicit regularization of such algorithms, we consider an explicit regularization framework as a minimum Bregman divergence convex feasibility problem. Using convex duality, we propose randomized Dykstra-style algorithms based on randomized dual coordinate ascent. For non-accelerated coordinate descent, we obtain an algorithm which bears strong similarities with (non-averaged) stochastic mirror descent on specific functions, as it is equivalent for quadratic objectives, and equivalent in the early iterations for more general objectives. It comes with the benefit of an explicit convergence theorem to a minimum norm solution. For accelerated coordinate descent, we obtain a new algorithm that has better convergence properties than existing stochastic gradient methods in the interpolating regime. This leads to accelerated versions of the perceptron for generic $\ell_p$-norm regularizers, which we illustrate in experiments.
Published 2020-03-30
URL https://arxiv.org/abs/2003.13807v1
PDF https://arxiv.org/pdf/2003.13807v1.pdf
PWC https://paperswithcode.com/paper/explicit-regularization-of-stochastic
Repo
Framework

#### Model-Based Machine Learning for Joint Digital Backpropagation and PMD Compensation

Title Model-Based Machine Learning for Joint Digital Backpropagation and PMD Compensation
Authors Christian Häger, Henry D. Pfister, Rick M. Bütler, Gabriele Liga, Alex Alvarado
Abstract We propose a model-based machine-learning approach for polarization-multiplexed systems by parameterizing the split-step method for the Manakov-PMD equation. This approach performs hardware-friendly DBP and distributed PMD compensation with performance close to the PMD-free case.
Published 2020-01-25
URL https://arxiv.org/abs/2001.09277v1
PDF https://arxiv.org/pdf/2001.09277v1.pdf
PWC https://paperswithcode.com/paper/model-based-machine-learning-for-joint
Repo
Framework

#### Causal Feature Discovery through Strategic Modification

Title Causal Feature Discovery through Strategic Modification
Authors Yahav Bechavod, Katrina Ligett, Zhiwei Steven Wu, Juba Ziani
Abstract We consider an online regression setting in which individuals adapt to the regression model: arriving individuals may access the model throughout the process, and invest strategically in modifying their own features so as to improve their assigned score. We find that this strategic manipulation may help a learner recover the causal variables, in settings where an agent can invest in improving impactful features that also improve his true label. We show that even simple behavior on the learner’s part (i.e., periodically updating her model based on the observed data so far, via least-square regression) allows her to simultaneously i) accurately recover which features have an impact on an agent’s true label, provided they have been invested in significantly, and ii) incentivize agents to invest in these impactful features, rather than in features that have no effect on their true label.
Published 2020-02-17
URL https://arxiv.org/abs/2002.07024v1
PDF https://arxiv.org/pdf/2002.07024v1.pdf
PWC https://paperswithcode.com/paper/causal-feature-discovery-through-strategic
Repo
Framework

#### Regularizing Semi-supervised Graph Convolutional Networks with a Manifold Smoothness Loss

Title Regularizing Semi-supervised Graph Convolutional Networks with a Manifold Smoothness Loss
Authors Qilin Li, Wanquan Liu, Ling Li
Abstract Existing graph convolutional networks focus on the neighborhood aggregation scheme. When applied to semi-supervised learning, they often suffer from the overfitting problem as the networks are trained with the cross-entropy loss on a small portion of labeled data. In this paper, we propose an unsupervised manifold smoothness loss defined with respect to the graph structure, which can be added to the loss function as a regularization. We draw connections between the proposed loss and an iterative diffusion process, and show that minimizing the loss is equivalent to aggregating neighbor predictions with infinite layers. We conduct experiments on multi-layer perceptron and existing graph networks, and demonstrate that adding the proposed loss can improve the performance consistently.
Published 2020-02-11
URL https://arxiv.org/abs/2002.07031v1
PDF https://arxiv.org/pdf/2002.07031v1.pdf
PWC https://paperswithcode.com/paper/regularizing-semi-supervised-graph
Repo
Framework
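
The abstract above does not spell out the manifold smoothness loss. As a rough illustration only, the sketch below assumes a generic graph-smoothness penalty: the weighted sum of squared differences between the predictions of connected nodes. The function name `smoothness_loss`, the edge format and the toy graph are hypothetical and may differ from the authors' definition.

```python
def smoothness_loss(predictions, edges):
    """Weighted sum of squared differences between predictions of linked nodes.

    predictions: list of per-node prediction vectors (lists of floats).
    edges: list of (i, j, weight) tuples, one per undirected edge.
    NOTE: this is an assumed, generic smoothness penalty, not the paper's loss.
    """
    total = 0.0
    for i, j, weight in edges:
        # Squared Euclidean distance between the two nodes' predictions.
        diff = sum((a - b) ** 2 for a, b in zip(predictions[i], predictions[j]))
        total += weight * diff
    return total


# Hypothetical toy graph: three nodes on a path with unit edge weights.
preds = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
edges = [(0, 1, 1.0), (1, 2, 1.0)]
loss = smoothness_loss(preds, edges)
print(loss)
```

Under this assumption, the penalty is zero when all connected nodes agree and grows as neighboring predictions diverge, which is why it can act as a regularizer on unlabeled nodes.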

#### Modeling Musical Structure with Artificial Neural Networks

Title Modeling Musical Structure with Artificial Neural Networks
Authors Stefan Lattner
Abstract In recent years, artificial neural networks (ANNs) have become a universal tool for tackling real-world problems. ANNs have also shown great success in music-related tasks including music summarization and classification, similarity estimation, computer-aided or autonomous composition, and automatic music analysis. As structure is a fundamental characteristic of Western music, it plays a role in all these tasks. Some structural aspects are particularly challenging to learn with current ANN architectures. This is especially true for mid- and high-level self-similarity, tonal and rhythmic relationships. In this thesis, I explore the application of ANNs to different aspects of musical structure modeling, identify some challenges involved and propose strategies to address them. First, using probability estimations of a Restricted Boltzmann Machine (RBM), a probabilistic bottom-up approach to melody segmentation is studied. Then, a top-down method for imposing a high-level structural template in music generation is presented, which combines Gibbs sampling using a convolutional RBM with gradient-descent optimization on the intermediate solutions. Furthermore, I motivate the relevance of musical transformations in structure modeling and show how a connectionist model, the Gated Autoencoder (GAE), can be employed to learn transformations between musical fragments. For learning transformations in sequences, I propose a special predictive training of the GAE, which yields a representation of polyphonic music as a sequence of intervals. Furthermore, the applicability of these interval representations to a top-down discovery of repeated musical sections is shown. Finally, a recurrent variant of the GAE is proposed, and its efficacy in music prediction and modeling of low-level repetition structure is demonstrated.
Published 2020-01-06
URL https://arxiv.org/abs/2001.01720v1
PDF https://arxiv.org/pdf/2001.01720v1.pdf
PWC https://paperswithcode.com/paper/modeling-musical-structure-with-artificial
Repo
Framework

#### Option Discovery in the Absence of Rewards with Manifold Analysis

Title Option Discovery in the Absence of Rewards with Manifold Analysis
Authors Amitay Bar, Ronen Talmon, Ron Meir
Abstract Options have been shown to be an effective tool in reinforcement learning, facilitating improved exploration and learning. In this paper, we present an approach based on spectral graph theory and derive an algorithm that systematically discovers options without access to a specific reward or task assignment. As opposed to the common practice used in previous methods, our algorithm makes full use of the spectrum of the graph Laplacian. Incorporating modes associated with higher graph frequencies unravels domain subtleties, which are shown to be useful for option discovery. Using geometric and manifold-based analysis, we present a theoretical justification for the algorithm. In addition, we showcase its performance in several domains, demonstrating clear improvements compared to competing methods.
Published 2020-03-12
URL https://arxiv.org/abs/2003.05878v1
PDF https://arxiv.org/pdf/2003.05878v1.pdf
PWC https://paperswithcode.com/paper/option-discovery-in-the-absence-of-rewards
Repo
Framework

#### Robust Marine Buoy Placement for Ship Detection Using Dropout K-Means

Title Robust Marine Buoy Placement for Ship Detection Using Dropout K-Means
Authors Yuting Ng, João M. Pereira, Denis Garagic, Vahid Tarokh
Abstract Marine buoys aid in the battle against Illegal, Unreported and Unregulated (IUU) fishing by detecting fishing vessels in their vicinity. Marine buoys, however, may be disrupted by natural causes and buoy vandalism. In this paper, we formulate marine buoy placement as a clustering problem, and propose dropout k-means and dropout k-median to improve placement robustness to buoy disruption. We simulated the passage of ships in the Gabonese waters near West Africa using historical Automatic Identification System (AIS) data, then compared the ship detection probability of dropout k-means to classic k-means and dropout k-median to classic k-median. With 5 buoys, the buoy arrangements computed by classic k-means, dropout k-means, classic k-median and dropout k-median have ship detection probabilities of 38%, 45%, 48% and 52%, respectively.
Published 2020-01-02
URL https://arxiv.org/abs/2001.00564v2
PDF https://arxiv.org/pdf/2001.00564v2.pdf
PWC https://paperswithcode.com/paper/robust-marine-buoy-placement-for-ship
Repo
Framework

#### Predicting Bank Loan Default with Extreme Gradient Boosting

Title Predicting Bank Loan Default with Extreme Gradient Boosting
Authors Rising Odegua
Abstract Loan default prediction is one of the most important and critical problems faced by banks and other financial institutions as it has a huge effect on profit. Although many traditional methods exist for mining information about a loan application, most of these methods seem to be under-performing as there have been reported increases in the number of bad loans. In this paper, we use an Extreme Gradient Boosting algorithm called XGBoost for loan default prediction. The prediction is based on loan data from a leading bank, taking into consideration data sets from both the loan application and the demographics of the applicant. We also present important evaluation metrics such as Accuracy, Recall, Precision, F1-Score and ROC area of the analysis. This paper provides an effective basis for loan credit approval in order to identify risky customers from a large number of loan applications using predictive modeling.
Published 2020-01-18
URL https://arxiv.org/abs/2002.02011v1
PDF https://arxiv.org/pdf/2002.02011v1.pdf
PWC https://paperswithcode.com/paper/predicting-bank-loan-default-with-extreme
Repo
Framework
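
For readers unfamiliar with the evaluation metrics named in the abstract above, the following minimal sketch computes accuracy, precision, recall and F1-score for binary labels from their standard definitions. The function name `classification_metrics` and the label vectors are hypothetical illustrations, not data or code from the paper; ROC area is omitted because it requires predicted scores rather than hard labels.

```python
def classification_metrics(y_true, y_pred):
    """Return accuracy, precision, recall and F1 for binary labels (1 = default)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    # Guard against empty denominators when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels: 1 marks a defaulted loan, 0 a repaid one.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

On imbalanced loan data, recall on the default class is often the figure to watch, since a high accuracy can hide a model that rarely flags risky applicants.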

#### Audio Summarization with Audio Features and Probability Distribution Divergence

Title Audio Summarization with Audio Features and Probability Distribution Divergence
Authors Carlos-Emiliano González-Gallardo, Romain Deveaud, Eric SanJuan, Juan-Manuel Torres
Abstract The automatic summarization of multimedia sources is an important task that facilitates the understanding of an individual by condensing the source while maintaining relevant information. In this paper we focus on audio summarization based on audio features and probability distribution divergence. Our method, based on an extractive summarization approach, aims to select the most relevant segments until a time threshold is reached. It takes into account the segment’s length, position and informativeness value. Informativeness of each segment is obtained by mapping a set of audio features issued from its Mel-frequency Cepstral Coefficients and their corresponding Jensen-Shannon divergence score. Results over a multi-evaluator scheme show that our approach provides understandable and informative summaries.
Published 2020-01-20
URL https://arxiv.org/abs/2001.07098v1
PDF https://arxiv.org/pdf/2001.07098v1.pdf
PWC https://paperswithcode.com/paper/audio-summarization-with-audio-features-and
Repo
Framework
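
The abstract above relies on a Jensen-Shannon divergence score but does not describe how it is computed from the audio features. The sketch below shows only the standard Jensen-Shannon divergence between two discrete probability distributions, in bits. The example distributions are hypothetical, and the mapping from Mel-frequency Cepstral Coefficients to such distributions is not covered here.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions, in bits."""
    # The mixture m is positive wherever p or q is, so the KL terms are finite.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical distributions, e.g. normalized histograms of a segment and of the full audio.
segment = [0.5, 0.3, 0.2]
full_audio = [0.4, 0.4, 0.2]
score = js_divergence(segment, full_audio)
print(score)
```

With base-2 logarithms the divergence is symmetric and bounded between 0 (identical distributions) and 1 (disjoint supports), which makes scores comparable across segments.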

#### Towards Learning a Universal Non-Semantic Representation of Speech

Title Towards Learning a Universal Non-Semantic Representation of Speech
Authors Joel Shor, Aren Jansen, Ronnie Maor, Oran Lang, Omry Tuval, Felix de Chaumont Quitry, Marco Tagliasacchi, Ira Shavitt, Dotan Emanuel, Yinnon Haviv
Abstract The ultimate goal of transfer learning is to reduce labeled data requirements by exploiting a pre-existing embedding model trained for different datasets or tasks. While significant progress has been made in the visual and language domains, the speech community has yet to identify a strategy with wide-reaching applicability across tasks. This paper describes a representation of speech based on an unsupervised triplet-loss objective, which exceeds state-of-the-art performance on a number of transfer learning tasks drawn from the non-semantic speech domain. The embedding is trained on a publicly available dataset, and it is tested on a variety of low-resource downstream tasks, including personalization tasks and the medical domain. The model will be publicly released.
Published 2020-02-25
URL https://arxiv.org/abs/2002.12764v2
PDF https://arxiv.org/pdf/2002.12764v2.pdf
PWC https://paperswithcode.com/paper/towards-learning-a-universal-non-semantic
Repo
Framework
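
The abstract above mentions an unsupervised triplet-loss objective without giving its exact form. As a hedged illustration, the sketch below implements the generic hinge-style triplet loss on squared Euclidean distances. The margin value, the function names and the toy embeddings are assumptions for illustration and may not match the variant used in the paper.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two embedding vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Generic triplet loss: zero once the positive is closer than the negative by `margin`.

    NOTE: the margin of 1.0 is an assumed default, not a value from the paper.
    """
    return max(0.0,
               squared_distance(anchor, positive)
               - squared_distance(anchor, negative)
               + margin)


# Hypothetical 2-D embeddings of audio segments.
anchor = [0.0, 0.0]
easy = triplet_loss(anchor, positive=[0.0, 1.0], negative=[3.0, 0.0])
hard = triplet_loss(anchor, positive=[2.0, 0.0], negative=[1.0, 0.0])
print(easy, hard)
```

In an unsupervised setting, positives are typically chosen by a proxy such as temporal proximity within the same recording; how the triplets are sampled is a design choice this sketch leaves open.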

#### Convex Recovery of Marked Spatio-Temporal Point Processes

Title Convex Recovery of Marked Spatio-Temporal Point Processes
Authors Anatoli Juditsky, Arkadi Nemirovski, Liyan Xie, Yao Xie
Abstract We present a multi-dimensional Bernoulli process model for spatial-temporal discrete event data with categorical marks, where the probability of an event of a specific category in a location may be influenced by past events at this and other locations. The focus is to introduce general forms of influence function which can capture an arbitrary shape of influence from historical events, between locations, and between different categories of events. The general form of influence function differs from the commonly adopted exponential decaying function over time, and more importantly, in our model, we can learn the delayed influence of prior events, which is an aspect seemingly largely ignored in prior literature. Prior knowledge or assumptions on the influence function are incorporated into our framework by allowing general convex constraints on the parameters specifying the influence function. We develop two approaches for recovering these parameters, using the constrained least-square (LS) and maximum likelihood (ML) estimations. We demonstrate the performance of our approach on synthetic examples and illustrate its promise using real data (crime data and novel coronavirus data), in extracting knowledge about the general influences and making predictions.