October 19, 2019

3360 words 16 mins read

Paper Group ANR 272

The Complexity of Learning Acyclic Conditional Preference Networks. Estimating Heterogeneous Causal Effects in the Presence of Irregular Assignment Mechanisms. Improving Deep Learning through Automatic Programming. Modality-based Factorization for Multimodal Fusion. Nonparametric Testing under Random Projection. First Impressions: A Survey on Visio …

The Complexity of Learning Acyclic Conditional Preference Networks

Title The Complexity of Learning Acyclic Conditional Preference Networks
Authors Eisa Alanazi, Malek Mouhoub, Sandra Zilles
Abstract Learning of user preferences, as represented by, for example, Conditional Preference Networks (CP-nets), has become a core issue in AI research. Recent studies investigate learning of CP-nets from randomly chosen examples or from membership and equivalence queries. To assess the optimality of learning algorithms as well as to better understand the combinatorial structure of classes of CP-nets, it is helpful to calculate certain learning-theoretic information complexity parameters. This article focuses on the frequently studied case of learning from so-called swap examples, which express preferences among objects that differ in only one attribute. It presents bounds on or exact values of some well-studied information complexity parameters, namely the VC dimension, the teaching dimension, and the recursive teaching dimension, for classes of acyclic CP-nets. We further provide algorithms that learn tree-structured and general acyclic CP-nets from membership queries. Using our results on complexity parameters, we assess the optimality of our algorithms as well as that of another query learning algorithm for acyclic CP-nets presented in the literature. Our algorithms are near-optimal, and can, under certain assumptions, be adapted to the case when the membership oracle is faulty.
Tasks
Published 2018-01-11
URL http://arxiv.org/abs/1801.03968v3
PDF http://arxiv.org/pdf/1801.03968v3.pdf
PWC https://paperswithcode.com/paper/the-complexity-of-learning-acyclic
Repo
Framework
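As a rough illustration of the swap examples the paper studies (a toy encoding of our own, not the authors' notation), here is a minimal sketch: two outcomes that differ in exactly one variable are compared by looking up that variable's conditional preference table under the shared parent assignment.

```python
# Toy acyclic CP-net over binary variables: each variable has a tuple of
# parents and a CPT mapping each parent assignment to the preferred value.
# The encoding (dicts, binary domains) is ours, not the paper's notation.
CPNET = {
    "A": {"parents": (), "cpt": {(): 1}},                 # prefer A=1 always
    "B": {"parents": ("A",), "cpt": {(0,): 0, (1,): 1}},  # prefer B to match A
}

def prefers(cpnet, o1, o2):
    """Label a swap example (o1, o2): two outcomes differing in exactly one
    variable. Returns True iff o1 is preferred, by looking up the CPT of the
    swapped variable under the (shared) parent assignment."""
    diff = [v for v in cpnet if o1[v] != o2[v]]
    assert len(diff) == 1, "a swap example differs in exactly one variable"
    var = diff[0]
    parent_vals = tuple(o1[p] for p in cpnet[var]["parents"])
    return o1[var] == cpnet[var]["cpt"][parent_vals]

print(prefers(CPNET, {"A": 1, "B": 1}, {"A": 1, "B": 0}))  # True: given A=1, B=1 is preferred
```

A membership query in this setting asks the oracle for exactly this label on a chosen swap pair, which is why counting how many such queries are needed connects to the complexity parameters the paper bounds.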

Estimating Heterogeneous Causal Effects in the Presence of Irregular Assignment Mechanisms

Title Estimating Heterogeneous Causal Effects in the Presence of Irregular Assignment Mechanisms
Authors Falco J. Bargagli-Stoffi, Giorgio Gnecco
Abstract This paper provides a link between causal inference and machine learning techniques - specifically, Classification and Regression Trees (CART) - in observational studies where the receipt of the treatment is not randomized, but the assignment to the treatment can be assumed to be randomized (irregular assignment mechanism). The paper contributes to the growing applied machine learning literature on causal inference by proposing a modified version of the Causal Tree (CT) algorithm to draw causal inference from an irregular assignment mechanism. The proposed method is developed by merging the CT approach with the instrumental variable framework for causal inference, hence the name Causal Tree with Instrumental Variable (CT-IV). Compared to CT, the main strength of CT-IV is that it can deal more efficiently with the heterogeneity of causal effects, as demonstrated by a series of numerical results obtained on synthetic data. The proposed algorithm is then used to evaluate a public policy implemented by the Tuscan Regional Administration (Italy), aimed at easing access to credit for small firms. In this context, CT-IV breaks fresh ground for target-based policies, identifying interesting heterogeneous causal effects.
Tasks Causal Inference
Published 2018-08-13
URL https://arxiv.org/abs/1808.04281v2
PDF https://arxiv.org/pdf/1808.04281v2.pdf
PWC https://paperswithcode.com/paper/estimating-heterogeneous-causal-effects-in
Repo
Framework
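To make the leaf-level idea concrete, here is a hedged sketch of the Wald (instrumental-variable) estimate that CT-IV-style methods compute within a subgroup; the tree-splitting machinery itself is not reproduced, and the data-generating choices below are our own assumptions.

```python
import numpy as np

def wald_late(y, w, z):
    """Leaf-level instrumental-variable (Wald) estimate: the effect of the
    instrument on the outcome divided by its effect on treatment take-up.
    CT-IV embeds estimates of this kind in a tree; the splitting rule
    itself is not reproduced here."""
    z = np.asarray(z, dtype=bool)
    itt_y = y[z].mean() - y[~z].mean()     # intention-to-treat effect
    itt_w = w[z].mean() - w[~z].mean()     # first stage (compliance)
    return itt_y / itt_w

rng = np.random.default_rng(0)
z = rng.integers(0, 2, 10000)              # randomized assignment
w = (rng.random(10000) < 0.2 + 0.6 * z)    # irregular (non-random) take-up
y = 2.0 * w + rng.normal(0.0, 1.0, 10000)  # true local effect = 2
print(abs(wald_late(y, w, z) - 2.0) < 0.2)
```

Because take-up `w` is not randomized while assignment `z` is, the naive treated-vs-untreated difference would be biased; the ratio above recovers the local effect for compliers.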

Improving Deep Learning through Automatic Programming

Title Improving Deep Learning through Automatic Programming
Authors The-Hien Dang-Ha
Abstract Deep learning and deep architectures are emerging as the best machine learning methods so far in many practical applications, such as reducing the dimensionality of data, image classification, speech recognition and object segmentation. In fact, many leading technology companies such as Google, Microsoft and IBM are researching and using deep architectures in their systems to replace other traditional models. Therefore, improving the performance of these models could make a strong impact in the area of machine learning. However, deep learning is a very fast-growing research domain, with many core methodologies and paradigms discovered only over the last few years. This thesis first serves as a short summary of deep learning, which tries to include all of the most important ideas in this research area. Based on this knowledge, we suggest, and conduct, experiments to investigate the possibility of improving deep learning using automatic programming (ADATE). Although our experiments did produce good results, there are still many more possibilities that we could not try due to limited time as well as some limitations of the current ADATE version. I hope that this thesis can promote future work on this topic, especially when the next version of ADATE comes out. This thesis also includes a short analysis of the power of the ADATE system, which could be useful for other researchers who want to know what it is capable of.
Tasks Image Classification, Semantic Segmentation, Speech Recognition
Published 2018-07-08
URL http://arxiv.org/abs/1807.02816v1
PDF http://arxiv.org/pdf/1807.02816v1.pdf
PWC https://paperswithcode.com/paper/improving-deep-learning-through-automatic
Repo
Framework

Modality-based Factorization for Multimodal Fusion

Title Modality-based Factorization for Multimodal Fusion
Authors Elham J. Barezi, Pascale Fung
Abstract We propose a novel method, Modality-based Redundancy Reduction Fusion (MRRF), for understanding and modulating the relative contribution of each modality in multimodal inference tasks. This is achieved by obtaining an $(M+1)$-way tensor to consider the high-order relationships between $M$ modalities and the output layer of a neural network model. Applying a modality-based tensor factorization method, which adopts different factors for different modalities, removes information in a modality that can be compensated for by other modalities with respect to the model outputs. This helps to understand the relative utility of information in each modality. In addition, it leads to a less complicated model with fewer parameters, and can therefore act as a regularizer that avoids overfitting. We have applied this method to three different multimodal datasets in sentiment analysis, personality trait recognition, and emotion recognition. We are able to recognize the relationships and relative importance of different modalities in these tasks, and achieve a 1% to 4% improvement on several evaluation measures compared to the state of the art for all three tasks.
Tasks Emotion Recognition, Personality Trait Recognition, Sentiment Analysis
Published 2018-11-30
URL https://arxiv.org/abs/1811.12624v2
PDF https://arxiv.org/pdf/1811.12624v2.pdf
PWC https://paperswithcode.com/paper/modality-based-factorization-for-multimodal
Repo
Framework
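A minimal sketch of the core idea, modality-specific factors for a fusion tensor, shown for M = 2 modalities; the shapes, ranks and Tucker-style decomposition below are illustrative assumptions, not the paper's exact parameterization.

```python
import numpy as np

rng = np.random.default_rng(0)
d1, d2, dout = 8, 6, 4      # modality and output dims (toy sizes)
r1, r2, ro = 3, 2, 4        # per-mode ranks: a different factor per modality

# Tucker-style factorization of the 3-way fusion tensor W[d1, d2, dout]:
core = rng.normal(size=(r1, r2, ro))
U1 = rng.normal(size=(d1, r1))    # factor for modality 1
U2 = rng.normal(size=(d2, r2))    # factor for modality 2
Uo = rng.normal(size=(dout, ro))  # factor for the output mode

x1 = rng.normal(size=d1)          # modality-1 features
x2 = rng.normal(size=d2)          # modality-2 features

# Fusion without ever materializing the full tensor:
y = np.einsum("pqr,p,q,or->o", core, U1.T @ x1, U2.T @ x2, Uo)

# Equivalent to contracting the full (d1, d2, dout) tensor:
W = np.einsum("pqr,ap,bq,or->abo", core, U1, U2, Uo)
y_full = np.einsum("abo,a,b->o", W, x1, x2)
print(np.allclose(y, y_full))
```

Choosing a small rank for one modality caps how much unique information it can inject, which is the factorization-as-regularizer intuition in the abstract.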

Nonparametric Testing under Random Projection

Title Nonparametric Testing under Random Projection
Authors Meimei Liu, Zuofeng Shang, Guang Cheng
Abstract A common challenge in nonparametric inference is its high computational complexity when data volume is large. In this paper, we develop computationally efficient nonparametric testing by employing a random projection strategy. In the specific kernel ridge regression setup, a simple distance-based test statistic is proposed. Notably, we derive the minimum number of random projections that is sufficient for achieving testing optimality in terms of the minimax rate. An adaptive testing procedure is further established without prior knowledge of regularity. One technical contribution is to establish upper bounds for a range of tail sums of empirical kernel eigenvalues. Simulations and real data analysis are conducted to support our theory.
Tasks
Published 2018-02-17
URL http://arxiv.org/abs/1802.06308v1
PDF http://arxiv.org/pdf/1802.06308v1.pdf
PWC https://paperswithcode.com/paper/nonparametric-testing-under-random-projection
Repo
Framework
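A hedged sketch of the idea, a kernel ridge regression test statistic computed after a Gaussian random projection; the paper's exact statistic, its scaling and its calibration differ, and all parameter choices below are our own.

```python
import numpy as np

def projected_krr_stat(x, y, s=20, lam=1e-2, gamma=1.0, seed=0):
    """Distance-type statistic from kernel ridge regression fitted after a
    Gaussian random projection of the kernel matrix. Only a sketch of the
    idea; the paper's statistic, scaling and calibration differ."""
    n = len(x)
    K = np.exp(-gamma * (x[:, None] - x[None, :]) ** 2)   # RBF kernel matrix
    S = np.random.default_rng(seed).normal(size=(n, s)) / np.sqrt(s)
    KS = K @ S
    # Sketched ridge problem: min ||y - K S b||^2 + n*lam * b' (S'KS) b
    b = np.linalg.solve(KS.T @ KS + n * lam * (S.T @ KS), KS.T @ y)
    f_hat = KS @ b
    return float(np.mean(f_hat ** 2))      # large when the regression f != 0

rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, 300)
t_null = projected_krr_stat(x, rng.normal(0, 1, 300))            # H0: pure noise
t_alt = projected_krr_stat(x, np.sin(3 * x) + rng.normal(0, 1, 300))
print(t_alt > t_null)
```

The computational point of the paper is that `s` projections suffice for minimax-optimal testing while the linear algebra runs on s-by-s systems instead of n-by-n ones.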

First Impressions: A Survey on Vision-Based Apparent Personality Trait Analysis

Title First Impressions: A Survey on Vision-Based Apparent Personality Trait Analysis
Authors Julio C. S. Jacques Junior, Yağmur Güçlütürk, Marc Pérez, Umut Güçlü, Carlos Andujar, Xavier Baró, Hugo Jair Escalante, Isabelle Guyon, Marcel A. J. van Gerven, Rob van Lier, Sergio Escalera
Abstract Personality analysis has been widely studied in psychology, neuropsychology, and signal processing fields, among others. Over the past few years, it has also become an attractive research area in visual computing. From a computational point of view, speech and text have by far been the most studied cues for analyzing personality. However, recently there has been an increasing interest from the computer vision community in analyzing personality from visual data. Recent computer vision approaches are able to accurately analyze human faces, body postures and behaviors, and use this information to infer apparent personality traits. Because of the overwhelming research interest in this topic, and of the potential impact that such methods could have on society, we present in this paper an up-to-date review of existing vision-based approaches for apparent personality trait recognition. We describe seminal and cutting-edge works on the subject, discussing and comparing their distinctive features and limitations. Future avenues of research in the field are identified and discussed. Furthermore, aspects of subjectivity in data labeling/evaluation, as well as current datasets and challenges organized to push research in the field, are reviewed.
Tasks Personality Trait Recognition
Published 2018-04-21
URL https://arxiv.org/abs/1804.08046v3
PDF https://arxiv.org/pdf/1804.08046v3.pdf
PWC https://paperswithcode.com/paper/first-impressions-a-survey-on-vision-based
Repo
Framework

Directed Policy Gradient for Safe Reinforcement Learning with Human Advice

Title Directed Policy Gradient for Safe Reinforcement Learning with Human Advice
Authors Hélène Plisnier, Denis Steckelmacher, Tim Brys, Diederik M. Roijers, Ann Nowé
Abstract Many currently deployed Reinforcement Learning agents work in an environment shared with humans, be they co-workers, users or clients. It is desirable that these agents adjust to people's preferences, learn faster thanks to their help, and act safely around them. We argue that most current approaches that learn from human feedback are unsafe: rewarding or punishing the agent a posteriori cannot immediately prevent it from wrongdoing. In this paper, we extend Policy Gradient to make it robust to external directives that would otherwise break the fundamentally on-policy nature of Policy Gradient. Our technique, Directed Policy Gradient (DPG), allows a teacher or backup policy to override the agent before it acts undesirably, while allowing the agent to leverage human advice or directives to learn faster. Our experiments demonstrate that DPG makes the agent learn much faster than reward-based approaches, while requiring an order of magnitude less advice.
Tasks
Published 2018-08-13
URL http://arxiv.org/abs/1808.04096v1
PDF http://arxiv.org/pdf/1808.04096v1.pdf
PWC https://paperswithcode.com/paper/directed-policy-gradient-for-safe
Repo
Framework
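A simplified, stateless sketch of the mechanism the abstract describes: advice reshapes the sampling distribution so the agent never executes the forbidden action, while the gradient still flows through the agent's own policy. Names, shapes and the toy bandit reward are our assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def advised_step(theta, advice, reward_fn, lr=0.5):
    """One REINFORCE-style update in which an external directive reshapes
    the sampling distribution (policy times advice, renormalized) while the
    gradient is taken through the agent's own policy. A simplified,
    stateless sketch of the DPG idea, not the authors' implementation."""
    pi = softmax(theta)
    mix = pi * advice
    mix = mix / mix.sum()                  # directed sampling distribution
    a = rng.choice(len(pi), p=mix)
    r = reward_fn(a)
    grad_logpi = -pi
    grad_logpi[a] += 1.0                   # d log pi(a) / d theta for softmax
    return theta + lr * r * grad_logpi, a, r

theta = np.zeros(3)
advice = np.array([1.0, 1.0, 0.0])         # teacher forbids unsafe action 2
reward_fn = lambda a: 1.0 if a == 0 else 0.0
for _ in range(200):
    theta, a, r = advised_step(theta, advice, reward_fn)
print(int(np.argmax(softmax(theta))))      # prints 0: policy concentrates on action 0
```

Note that action 2 has probability exactly zero under the mixed distribution, so the override is preventive rather than an after-the-fact punishment.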

NETT: Solving Inverse Problems with Deep Neural Networks

Title NETT: Solving Inverse Problems with Deep Neural Networks
Authors Housen Li, Johannes Schwab, Stephan Antholzer, Markus Haltmeier
Abstract Recovering a function or high-dimensional parameter vector from indirect measurements is a central task in various scientific areas. Several methods for solving such inverse problems are well developed and well understood. Recently, novel algorithms using deep learning and neural networks for inverse problems appeared. While still in their infancy, these techniques show astonishing performance for applications like low-dose CT or various sparse data problems. However, there are few theoretical results for deep learning in inverse problems. In this paper, we establish a complete convergence analysis for the proposed NETT (Network Tikhonov) approach to inverse problems. NETT considers data consistent solutions having small value of a regularizer defined by a trained neural network. We derive well-posedness results and quantitative error estimates, and propose a possible strategy for training the regularizer. Our theoretical results and framework are different from any previous work using neural networks for solving inverse problems. A possible data driven regularizer is proposed. Numerical results are presented for a tomographic sparse data problem, which demonstrate good performance of NETT even for unknowns of different type from the training data. To derive the convergence and convergence rates results we introduce a new framework based on the absolute Bregman distance generalizing the standard Bregman distance from the convex to the non-convex case.
Tasks
Published 2018-02-28
URL https://arxiv.org/abs/1803.00092v3
PDF https://arxiv.org/pdf/1803.00092v3.pdf
PWC https://paperswithcode.com/paper/nett-solving-inverse-problems-with-deep
Repo
Framework
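To make the Tikhonov-style iteration concrete: a toy sketch that minimizes a data-misfit plus alpha times a regularizer by gradient descent. NETT uses a trained neural network as the regularizer; here a fixed smooth sparsity surrogate stands in for it, and the toy underdetermined linear problem and all parameters are our assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 10, 20                        # underdetermined toy inverse problem
A = rng.normal(size=(m, n))
x_true = np.zeros(n)
x_true[[2, 7]] = 1.0                 # sparse ground truth
y = A @ x_true

# Stand-in for a *trained* network regularizer: NETT penalizes solutions
# with a large value of R; here R is a fixed smooth sparsity surrogate,
# purely to make the Tikhonov-style iteration concrete.
eps, alpha, lr = 1e-3, 0.1, 0.01
R = lambda x: np.sum(np.sqrt(x ** 2 + eps))
gradR = lambda x: x / np.sqrt(x ** 2 + eps)

x = np.zeros(n)
for _ in range(2000):
    # gradient of ||A x - y||^2 / 2 + alpha * R(x)
    x -= lr * (A.T @ (A @ x - y) + alpha * gradR(x))

print(np.linalg.norm(A @ x - y) < 0.5)
```

The convergence analysis in the paper concerns exactly this kind of iterate: data-consistent solutions with a small regularizer value, with the Bregman-distance machinery handling the non-convexity a learned regularizer introduces.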

Application of Machine Learning in Fiber Nonlinearity Modeling and Monitoring for Elastic Optical Networks

Title Application of Machine Learning in Fiber Nonlinearity Modeling and Monitoring for Elastic Optical Networks
Authors Qunbi Zhuge, Xiaobo Zeng, Huazhi Lun, Meng Cai, Xiaomin Liu, Weisheng Hu
Abstract Fiber nonlinear interference (NLI) modeling and monitoring are the key building blocks to support elastic optical networks (EONs). In the past, they were normally developed and investigated separately. Moreover, the accuracy of the previously proposed methods still needs to be improved for heterogeneous dynamic optical networks. In this paper, we present the application of machine learning (ML) in NLI modeling and monitoring. In particular, we first propose to use ML approaches to calibrate the errors of current fiber nonlinearity models. The Gaussian-noise (GN) model is used as an illustrative example, and significant improvement is demonstrated with the aid of an artificial neural network (ANN). Further, we propose to use ML to combine the modeling and monitoring schemes for a better estimation of NLI variance. The following errors, mentioned in the comments, are the reasons for withdrawal. (1) The work, as its title indicates, should address elastic optical networks (EONs); however, the simulation setup and results sections focus on conventional wavelength division multiplexing (WDM) networks. This may mislead researchers working on elastic optical networks. (2) There are errors in the results section; for example, Fig. 9(b) and (c) carry the wrong captions, which may lead to misleading conclusions. (3) The split-step Fourier method (SSFM) is accurate only when sufficiently small steps are adopted in the calculation, but this paper does not attempt to optimize the SSFM step length, which casts doubt on the accuracy of the simulation results. Therefore, we decided to withdraw this paper from arXiv. The correct and complete paper with the same title was published in the Journal of Lightwave Technology with doi: 10.1109/JLT.2019.2910143.
Tasks
Published 2018-11-23
URL https://arxiv.org/abs/1811.11095v2
PDF https://arxiv.org/pdf/1811.11095v2.pdf
PWC https://paperswithcode.com/paper/application-of-machine-learning-in-fiber
Repo
Framework

Attention Mechanisms for Object Recognition with Event-Based Cameras

Title Attention Mechanisms for Object Recognition with Event-Based Cameras
Authors Marco Cannici, Marco Ciccone, Andrea Romanoni, Matteo Matteucci
Abstract Event-based cameras are neuromorphic sensors capable of efficiently encoding visual information in the form of sparse sequences of events. Being biologically inspired, they are commonly used to exploit some of the computational and power-consumption benefits of biological vision. In this paper we focus on a specific feature of vision: visual attention. We propose two attentive models for event-based vision: an algorithm that tracks event activity within the field of view to locate regions of interest, and a fully differentiable attention procedure based on the DRAW neural model. We highlight the strengths and weaknesses of the proposed methods on four datasets - the Shifted N-MNIST, Shifted MNIST-DVS, CIFAR10-DVS and N-Caltech101 collections - using the Phased LSTM recognition network as a baseline reference model, obtaining improvements in terms of both translation and scale invariance.
Tasks Event-based vision, Object Recognition
Published 2018-07-25
URL http://arxiv.org/abs/1807.09480v2
PDF http://arxiv.org/pdf/1807.09480v2.pdf
PWC https://paperswithcode.com/paper/attention-mechanisms-for-object-recognition
Repo
Framework

A Convergence Analysis of Gradient Descent for Deep Linear Neural Networks

Title A Convergence Analysis of Gradient Descent for Deep Linear Neural Networks
Authors Sanjeev Arora, Nadav Cohen, Noah Golowich, Wei Hu
Abstract We analyze speed of convergence to global optimum for gradient descent training a deep linear neural network (parameterized as $x \mapsto W_N W_{N-1} \cdots W_1 x$) by minimizing the $\ell_2$ loss over whitened data. Convergence at a linear rate is guaranteed when the following hold: (i) dimensions of hidden layers are at least the minimum of the input and output dimensions; (ii) weight matrices at initialization are approximately balanced; and (iii) the initial loss is smaller than the loss of any rank-deficient solution. The assumptions on initialization (conditions (ii) and (iii)) are necessary, in the sense that violating any one of them may lead to convergence failure. Moreover, in the important case of output dimension 1, i.e. scalar regression, they are met, and thus convergence to global optimum holds, with constant probability under a random initialization scheme. Our results significantly extend previous analyses, e.g., of deep linear residual networks (Bartlett et al., 2018).
Tasks
Published 2018-10-04
URL https://arxiv.org/abs/1810.02281v3
PDF https://arxiv.org/pdf/1810.02281v3.pdf
PWC https://paperswithcode.com/paper/a-convergence-analysis-of-gradient-descent
Repo
Framework
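The scalar-regression case (output dimension 1) can be simulated in a few lines when every dimension is 1, so that each layer is a single weight and the end-to-end map is their product; the all-equal start is then exactly balanced (condition (ii)). The target value and step size below are our own choices.

```python
import numpy as np

# Depth-N linear "network" with all dimensions 1: each layer is one weight
# and the end-to-end map is their product. All-equal weights are exactly
# balanced; target and learning rate are illustrative choices, not the
# paper's experimental setup.
target, N, lr = 2.0, 4, 0.01
w = np.full(N, 1.0)                      # balanced initialization

losses = []
for _ in range(500):
    e2e = np.prod(w)                     # end-to-end map W_N ... W_1
    losses.append(0.5 * (e2e - target) ** 2)
    # dL/dw_j = (e2e - target) * prod_{i != j} w_i  (valid while w_j != 0)
    w -= lr * (e2e - target) * e2e / w

# Linear-rate convergence: the loss keeps shrinking by a constant factor.
print(losses[-1] < 1e-6, losses[100] / losses[0] < 0.5)
```

Conditions (i) and (iii) hold trivially here (hidden dimension 1, initial end-to-end value 1 with no rank-deficient solution below the initial loss), which is why the iterates converge at a linear rate as the theorem predicts.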

Deep neural network based sparse measurement matrix for image compressed sensing

Title Deep neural network based sparse measurement matrix for image compressed sensing
Authors Wenxue Cui, Feng Jiang, Xinwei Gao, Wen Tao, Debin Zhao
Abstract The Gaussian random matrix (GRM) has been widely used to generate linear measurements in compressed sensing (CS) of natural images. However, GRM has two practical disadvantages. One is that GRM has a large memory requirement and high computational complexity, which restrict the applications of CS. The other is that the CS measurements randomly obtained by GRM cannot provide sufficient reconstruction performance. In this paper, a Deep neural network based Sparse Measurement Matrix (DSMM) is learned by the proposed convolutional network to reduce the sampling computational complexity and improve the CS reconstruction performance. The proposed network consists of two sub-networks: a sampling sub-network and a reconstruction sub-network. In the sampling sub-network, sparsity and normalization are both imposed to limit storage and computational complexity. To improve the CS reconstruction performance, a reconstruction sub-network is introduced to help enhance the sampling sub-network. Thus, by offline iterative training of the proposed end-to-end network, the DSMM is generated for accurate measurement and excellent reconstruction. Experimental results demonstrate that the proposed DSMM greatly outperforms GRM on representative CS reconstruction methods.
Tasks
Published 2018-06-19
URL http://arxiv.org/abs/1806.07026v1
PDF http://arxiv.org/pdf/1806.07026v1.pdf
PWC https://paperswithcode.com/paper/deep-neural-network-based-sparse-measurement
Repo
Framework

Riemannian Adaptive Optimization Methods

Title Riemannian Adaptive Optimization Methods
Authors Gary Bécigneul, Octavian-Eugen Ganea
Abstract Several first-order stochastic optimization methods commonly used in the Euclidean domain, such as stochastic gradient descent (SGD), accelerated gradient descent and variance-reduced methods, have already been adapted to certain Riemannian settings. However, some of the most popular of these optimization tools - namely Adam, Adagrad and the more recent Amsgrad - remain to be generalized to Riemannian manifolds. We discuss the difficulty of generalizing such adaptive schemes to the most agnostic Riemannian setting, and then provide algorithms and convergence proofs for geodesically convex objectives in the particular case of a product of Riemannian manifolds, in which adaptivity is implemented across manifolds in the Cartesian product. Our generalization is tight in the sense that choosing the Euclidean space as the Riemannian manifold yields the same algorithms and regret bounds as those that were already known for the standard algorithms. Experimentally, we show faster convergence to a lower training loss for Riemannian adaptive methods over their corresponding baselines on the realistic task of embedding the WordNet taxonomy in the Poincaré ball.
Tasks Stochastic Optimization
Published 2018-10-01
URL http://arxiv.org/abs/1810.00760v2
PDF http://arxiv.org/pdf/1810.00760v2.pdf
PWC https://paperswithcode.com/paper/riemannian-adaptive-optimization-methods
Repo
Framework
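A sketch of the basic geometric ingredient: a Riemannian gradient step on the unit sphere (tangent-space projection followed by the exponential map). The paper's actual contribution, adaptivity implemented across the factors of a product manifold, is not reproduced here; the objective and step size are our assumptions.

```python
import numpy as np

def sphere_step(x, egrad, lr):
    """One Riemannian gradient-descent step on the unit sphere: project the
    Euclidean gradient onto the tangent space at x, then move along the
    geodesic via the exponential map. The paper's adaptive schemes add
    per-manifold step sizes on a product of such factors."""
    rgrad = egrad - (egrad @ x) * x        # tangent-space projection
    nrm = np.linalg.norm(rgrad)
    if nrm < 1e-12:
        return x
    t = lr * nrm                           # arc length of the step
    return np.cos(t) * x - np.sin(t) * (rgrad / nrm)

# Minimize f(x) = -a @ x on the sphere; the optimum is a / ||a||.
a = np.array([3.0, 0.0, 4.0])
x = np.array([0.0, 1.0, 0.0])
for _ in range(200):
    x = sphere_step(x, -a, lr=0.1)         # Euclidean gradient of f is -a
print(np.allclose(x, a / np.linalg.norm(a), atol=1e-3))
```

Setting the manifold to Euclidean space makes the projection and exponential map identities, recovering ordinary gradient descent, which is the sense in which the paper calls its generalization tight.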

Fine-grained wound tissue analysis using deep neural network

Title Fine-grained wound tissue analysis using deep neural network
Authors Hossein Nejati, Hamed Alizadeh Ghazijahani, Milad Abdollahzadeh, Tooba Malekzadeh, Ngai-Man Cheung, Kheng Hock Lee, Lian Leng Low
Abstract Tissue assessment for chronic wounds is the basis of wound grading and the selection of treatment approaches. While several image processing approaches have been proposed for automatic wound tissue analysis, these approaches fall short of clinical practice. In particular, all previous approaches appear to have assumed only 3 tissue types in chronic wounds, while these wounds commonly exhibit 7 distinct tissue types, the presence of each of which changes the treatment procedure. In this paper, for the first time, we investigate the classification of 7 wound tissue types. We work with wound professionals to build a new database of 7 types of wound tissue. We propose to use pre-trained deep neural networks for feature extraction and classification at the patch level. We perform experiments to demonstrate that our approach outperforms other state-of-the-art approaches. We will make our database publicly available to facilitate research in wound assessment.
Tasks
Published 2018-02-28
URL http://arxiv.org/abs/1802.10426v1
PDF http://arxiv.org/pdf/1802.10426v1.pdf
PWC https://paperswithcode.com/paper/fine-grained-wound-tissue-analysis-using-deep
Repo
Framework

Optimal Adaptive and Accelerated Stochastic Gradient Descent

Title Optimal Adaptive and Accelerated Stochastic Gradient Descent
Authors Qi Deng, Yi Cheng, Guanghui Lan
Abstract Stochastic gradient descent (\textsc{Sgd}) methods are the most powerful optimization tools in training machine learning and deep learning models. Moreover, acceleration (a.k.a. momentum) methods and diagonal scaling (a.k.a. adaptive gradient) methods are the two main techniques to improve the slow convergence of \textsc{Sgd}. While empirical studies have demonstrated potential advantages of combining these two techniques, it remains unknown whether these methods can achieve the optimal rate of convergence for stochastic optimization. In this paper, we present a new class of adaptive and accelerated stochastic gradient descent methods and show that they exhibit the optimal sampling and iteration complexity for stochastic optimization. More specifically, we show that diagonal scaling, initially designed to improve vanilla stochastic gradient, can be incorporated into accelerated stochastic gradient descent to achieve the optimal rate of convergence for smooth stochastic optimization. We also show that momentum, apart from being known to speed up the convergence rate of deterministic optimization, also provides us new ways of designing non-uniform and aggressive moving average schemes in stochastic optimization. Finally, we present some heuristics that help to implement adaptive accelerated stochastic gradient descent methods and to further improve their practical performance for machine learning and deep learning.
Tasks Stochastic Optimization
Published 2018-10-01
URL http://arxiv.org/abs/1810.00553v1
PDF http://arxiv.org/pdf/1810.00553v1.pdf
PWC https://paperswithcode.com/paper/optimal-adaptive-and-accelerated-stochastic
Repo
Framework
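A generic sketch of combining the two ingredients the paper studies, momentum plus AdaGrad-style diagonal scaling, shown on a badly scaled quadratic; this is an illustration of the combination, not the paper's specific scheme or step-size policy, and all parameters are our assumptions.

```python
import numpy as np

def adagrad_momentum(grad, x0, steps=500, lr=0.5, beta=0.5, eps=1e-8):
    """Heavy-ball momentum applied to a gradient that is diagonally scaled
    by AdaGrad's running sum of squared gradients. A generic illustration
    of combining the two techniques, not the paper's algorithm."""
    x = x0.astype(float)
    v = np.zeros_like(x)
    G = np.zeros_like(x)                        # running sum of squared gradients
    for _ in range(steps):
        g = grad(x)
        G += g * g
        v = beta * v + g / (np.sqrt(G) + eps)   # momentum on the scaled gradient
        x = x - lr * v
    return x

# Badly scaled quadratic: f(x) = 0.5 * (100 x_0^2 + x_1^2)
loss = lambda x: 0.5 * (100 * x[0] ** 2 + x[1] ** 2)
grad = lambda x: np.array([100.0, 1.0]) * x
x0 = np.array([1.0, 1.0])
x = adagrad_momentum(grad, x0)
print(loss(x) < 0.1 * loss(x0))
```

The diagonal scaling equalizes progress across the ill-conditioned coordinates, which is exactly the situation where combining it with momentum is claimed to pay off.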