July 28, 2019

2768 words 13 mins read

Paper Group ANR 395

Does Weather Matter? Causal Analysis of TV Logs

Title Does Weather Matter? Causal Analysis of TV Logs
Authors Shi Zong, Branislav Kveton, Shlomo Berkovsky, Azin Ashkan, Nikos Vlassis, Zheng Wen
Abstract Weather affects our mood and behaviors, and many aspects of our life. When it is sunny, most people become happier; but when it rains, some people get depressed. Despite this evidence and the abundance of data, weather has mostly been overlooked in machine learning and data science research. This work presents a causal analysis of how weather affects TV watching patterns. We show that some weather attributes, such as pressure and precipitation, cause major changes in TV watching patterns. To the best of our knowledge, this is the first large-scale causal study of the impact of weather on TV watching patterns.
Tasks
Published 2017-01-25
URL http://arxiv.org/abs/1701.08716v2
PDF http://arxiv.org/pdf/1701.08716v2.pdf
PWC https://paperswithcode.com/paper/does-weather-matter-causal-analysis-of-tv
Repo
Framework
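The paper's methodology and TV-log data are not reproduced here; as a toy illustration only, the sketch below checks whether hypothetical daily viewing hours differ between rainy and dry days with a permutation test. Note that a small p-value only flags an association; the paper's causal claims require the stronger analysis it describes.

```python
import random

def mean(xs):
    return sum(xs) / len(xs)

def permutation_test(a, b, n_perm=2000, seed=0):
    """Two-sided permutation test for a difference in means."""
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if abs(mean(pooled[:len(a)]) - mean(pooled[len(a):])) >= observed:
            hits += 1
    return hits / n_perm

# Hypothetical daily TV-watching hours on rainy vs. dry days.
rainy = [3.1, 2.8, 3.4, 3.0, 3.3, 2.9, 3.2]
dry = [2.2, 2.5, 2.1, 2.4, 2.3, 2.6, 2.0]
p = permutation_test(rainy, dry)
```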

Scale-invariant temporal history (SITH): optimal slicing of the past in an uncertain world

Title Scale-invariant temporal history (SITH): optimal slicing of the past in an uncertain world
Authors Tyler A. Spears, Brandon G. Jacques, Marc W. Howard, Per B. Sederberg
Abstract In both the human brain and any general artificial intelligence (AI), a representation of the past is necessary to predict the future. However, perfect storage of all experiences is not feasible. One approach utilized in many applications, including reward prediction in reinforcement learning, is to retain recently active features of experience in a buffer. Despite its prior successes, we show that the fixed length buffer renders Deep Q-learning Networks (DQNs) fragile to changes in the scale over which information can be learned. To enable learning when the relevant temporal scales in the environment are not known a priori, recent advances in psychology and neuroscience suggest that the brain maintains a compressed representation of the past. Here we introduce a neurally-plausible, scale-free memory representation we call Scale-Invariant Temporal History (SITH) for use with artificial agents. This representation covers an exponentially large period of time by sacrificing temporal accuracy for events further in the past. We demonstrate the utility of this representation by comparing the performance of agents given SITH, buffer, and exponential decay representations in learning to play video games at different levels of complexity. In these environments, SITH exhibits better learning performance by storing information for longer timescales than a fixed-size buffer, and representing this information more clearly than a set of exponentially decayed features. Finally, we discuss how the application of SITH, along with other human-inspired models of cognition, could improve reinforcement and machine learning algorithms in general.
Tasks Q-Learning
Published 2017-12-19
URL http://arxiv.org/abs/1712.07165v3
PDF http://arxiv.org/pdf/1712.07165v3.pdf
PWC https://paperswithcode.com/paper/scale-invariant-temporal-history-sith-optimal
Repo
Framework
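SITH itself is built on a neurally-inspired compressed timeline; the toy below is not the authors' construction, only an illustration of the core trade-off the abstract describes: a bank of leaky integrators with geometrically spaced time constants covers an exponentially long stretch of the past while representing older events more coarsely.

```python
import math

class LogSpacedMemory:
    """Toy scale-free memory: leaky integrators whose time constants
    are geometrically spaced, a rough stand-in for a compressed history."""
    def __init__(self, n_scales=8, tau_min=1.0, tau_max=128.0):
        ratio = (tau_max / tau_min) ** (1.0 / (n_scales - 1))
        self.taus = [tau_min * ratio ** i for i in range(n_scales)]
        self.traces = [0.0] * n_scales

    def step(self, x):
        """Feed one observation; each trace decays at its own timescale."""
        for i, tau in enumerate(self.taus):
            decay = math.exp(-1.0 / tau)
            self.traces[i] = decay * self.traces[i] + (1.0 - decay) * x
        return list(self.traces)

mem = LogSpacedMemory()
mem.step(1.0)               # a single impulse ...
for _ in range(50):
    state = mem.step(0.0)   # ... followed by 50 empty steps
```

After fifty steps the fast traces have forgotten the impulse while the slow traces still carry it, which is exactly the blurred-but-long coverage a fixed-size buffer lacks.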

A Method for Determining Weights of Criterias and Alternative of Fuzzy Group Decision Making Problem

Title A Method for Determining Weights of Criterias and Alternative of Fuzzy Group Decision Making Problem
Authors Jon JaeGyong, Mun JongHui, Ryang GyongIl
Abstract In this paper, we construct a model for determining the weights of criteria and present a method for selecting the optimal alternative, using the constructed model together with an analysis of the relationships between criteria, in fuzzy group decision-making problems where decision makers provide preference information about the criteria in different forms.
Tasks Decision Making
Published 2017-05-16
URL http://arxiv.org/abs/1705.05515v1
PDF http://arxiv.org/pdf/1705.05515v1.pdf
PWC https://paperswithcode.com/paper/a-method-for-determining-weights-of-criterias
Repo
Framework

A proposal for ethically traceable artificial intelligence

Title A proposal for ethically traceable artificial intelligence
Authors Christopher A. Tucker
Abstract Although the problem of a critique of robotic behavior in near-unanimous agreement with human norms seems intractable, a starting point for such an ambition is a framework in which knowledge a priori and experience a posteriori are collected and categorized as a set of synthetic judgments available to the intelligence, translated into computer code. If such a proposal were successful, an algorithm with ethically traceable behavior and cogent equivalence to human cognition would be established. This paper proposes the application of Kant’s critique of reason to current programming constructs of an autonomous intelligent system.
Tasks
Published 2017-03-06
URL http://arxiv.org/abs/1703.01908v2
PDF http://arxiv.org/pdf/1703.01908v2.pdf
PWC https://paperswithcode.com/paper/a-proposal-for-ethically-traceable-artificial
Repo
Framework

Learning Independent Causal Mechanisms

Title Learning Independent Causal Mechanisms
Authors Giambattista Parascandolo, Niki Kilbertus, Mateo Rojas-Carulla, Bernhard Schölkopf
Abstract Statistical learning relies upon data sampled from a distribution, and we usually do not care what actually generated it in the first place. From the point of view of causal modeling, the structure of each distribution is induced by physical mechanisms that give rise to dependences between observables. Mechanisms, however, can be meaningful autonomous modules of generative models that make sense beyond a particular entailed data distribution, lending themselves to transfer between problems. We develop an algorithm to recover a set of independent (inverse) mechanisms from a set of transformed data points. The approach is unsupervised and based on a set of experts that compete for data generated by the mechanisms, driving specialization. We analyze the proposed method in a series of experiments on image data. Each expert learns to map a subset of the transformed data back to a reference distribution. The learned mechanisms generalize to novel domains. We discuss implications for transfer learning and links to recent trends in generative modeling.
Tasks Transfer Learning
Published 2017-12-04
URL http://arxiv.org/abs/1712.00961v5
PDF http://arxiv.org/pdf/1712.00961v5.pdf
PWC https://paperswithcode.com/paper/learning-independent-causal-mechanisms
Repo
Framework
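A minimal numeric caricature of the competing-experts loop the abstract describes. The paper uses neural experts and image transformations; here, purely for illustration, the mechanisms are additive shifts on scalars, each expert is a single learnable offset, and "looking like the reference distribution" is reduced to being close to its mean.

```python
import random

random.seed(0)
shifts = [5.0, -5.0]           # two unknown additive "mechanisms"
experts = [-1.0, 1.0]          # each expert is a learnable inverse offset
lr = 0.1

for _ in range(2000):
    # A reference sample N(0, 1) pushed through a randomly chosen mechanism.
    x = random.gauss(0.0, 1.0) + random.choice(shifts)
    outputs = [x + e for e in experts]
    # Experts compete: the one whose output looks most like the
    # reference distribution (here: closest to its mean 0) wins.
    winner = min(range(len(experts)), key=lambda i: abs(outputs[i]))
    # Only the winner is trained, driving specialisation.
    experts[winner] -= lr * outputs[winner]
```

After training, one expert has learned (approximately) the inverse of each shift, without any supervision about which mechanism produced which sample.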

Learning linear structural equation models in polynomial time and sample complexity

Title Learning linear structural equation models in polynomial time and sample complexity
Authors Asish Ghoshal, Jean Honorio
Abstract The problem of learning structural equation models (SEMs) from data is a fundamental problem in causal inference. We develop a new algorithm — which is computationally and statistically efficient and works in the high-dimensional regime — for learning linear SEMs from purely observational data with arbitrary noise distribution. We consider three aspects of the problem: identifiability, computational efficiency, and statistical efficiency. We show that when data is generated from a linear SEM over $p$ nodes and maximum degree $d$, our algorithm recovers the directed acyclic graph (DAG) structure of the SEM under an identifiability condition that is more general than those considered in the literature, and without faithfulness assumptions. In the population setting, our algorithm recovers the DAG structure in $\mathcal{O}(p(d^2 + \log p))$ operations. In the finite sample setting, if the estimated precision matrix is sparse, our algorithm has a smoothed complexity of $\widetilde{\mathcal{O}}(p^3 + pd^7)$, while if the estimated precision matrix is dense, our algorithm has a smoothed complexity of $\widetilde{\mathcal{O}}(p^5)$. For sub-Gaussian noise, we show that our algorithm has a sample complexity of $\mathcal{O}(\frac{d^8}{\varepsilon^2} \log (\frac{p}{\sqrt{\delta}}))$ to achieve $\varepsilon$ element-wise additive error with respect to the true autoregression matrix with probability at least $1 - \delta$, while for noise with bounded $(4m)$-th moment, with $m$ being a positive integer, our algorithm has a sample complexity of $\mathcal{O}(\frac{d^8}{\varepsilon^2} (\frac{p^2}{\delta})^{1/m})$.
Tasks Causal Inference
Published 2017-07-15
URL http://arxiv.org/abs/1707.04673v1
PDF http://arxiv.org/pdf/1707.04673v1.pdf
PWC https://paperswithcode.com/paper/learning-linear-structural-equation-models-in
Repo
Framework
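In the equal-noise-variance Gaussian case, a key observation behind such algorithms is that a terminal (childless) vertex can be read off the smallest diagonal entry of the precision matrix. The sketch below illustrates that idea on a toy three-node chain with a large sample standing in for the population setting; it omits the paper's high-dimensional estimation and smoothed-complexity machinery, so treat it as a hedged illustration rather than the paper's algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

# Ground-truth DAG over 3 nodes: x0 -> x1 -> x2, unit-variance noise.
n = 200000
x0 = rng.normal(size=n)
x1 = 0.8 * x0 + rng.normal(size=n)
x2 = -0.6 * x1 + rng.normal(size=n)
X = np.stack([x0, x1, x2], axis=1)

order = []
remaining = [0, 1, 2]
while len(remaining) > 1:
    cov = np.cov(X[:, remaining], rowvar=False)
    prec = np.linalg.inv(cov)
    t = int(np.argmin(np.diag(prec)))   # terminal vertex among remaining
    order.append(remaining[t])
    remaining.pop(t)                     # peel it off and repeat
order.append(remaining[0])
order.reverse()                          # topological order, roots first
```

Peeling terminal vertices one at a time recovers a valid causal ordering, from which parent sets can then be estimated by regression.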

Determinants of Mobile Money Adoption in Pakistan

Title Determinants of Mobile Money Adoption in Pakistan
Authors Muhammad Raza Khan, Joshua Blumenstock
Abstract In this work, we analyze the problem of mobile money adoption in Pakistan, using the call detail records of a major telecom company as our input. Our results highlight that different sections of society show different patterns of adoption of digital financial services, but that user-mobility features are the most important ones when it comes to adopting and using mobile money services.
Tasks
Published 2017-11-13
URL http://arxiv.org/abs/1712.01081v1
PDF http://arxiv.org/pdf/1712.01081v1.pdf
PWC https://paperswithcode.com/paper/determinants-of-mobile-money-adoption-in
Repo
Framework

Learning Hard Alignments with Variational Inference

Title Learning Hard Alignments with Variational Inference
Authors Dieterich Lawson, Chung-Cheng Chiu, George Tucker, Colin Raffel, Kevin Swersky, Navdeep Jaitly
Abstract There has recently been significant interest in hard attention models for tasks such as object recognition, visual captioning and speech recognition. Hard attention can offer benefits over soft attention such as decreased computational cost, but training hard attention models can be difficult because of the discrete latent variables they introduce. Previous work used REINFORCE and Q-learning to approach these issues, but those methods can provide high-variance gradient estimates and be slow to train. In this paper, we tackle the problem of learning hard attention for a sequential task using variational inference methods, specifically the recently introduced VIMCO and NVIL. Furthermore, we propose a novel baseline that adapts VIMCO to this setting. We demonstrate our method on a phoneme recognition task in clean and noisy environments and show that our method outperforms REINFORCE, with the difference being greater for a more complicated task.
Tasks Image Captioning, Object Recognition, Q-Learning, Speech Recognition
Published 2017-05-16
URL http://arxiv.org/abs/1705.05524v2
PDF http://arxiv.org/pdf/1705.05524v2.pdf
PWC https://paperswithcode.com/paper/learning-hard-alignments-with-variational
Repo
Framework
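The paper contrasts variational estimators (VIMCO, NVIL) with REINFORCE; as a hedged illustration of the baseline approach being improved upon, the toy below trains a single hard (Bernoulli) attention choice with the score-function estimator and a moving-average reward baseline. The two-slot task and its parameterisation are invented for this sketch: slot 1 always carries the label, slot 0 carries noise, so the model should learn to attend to slot 1.

```python
import math
import random

random.seed(0)
logit = 0.0            # preference for attending to slot 1 over slot 0
baseline = 0.0         # moving-average reward baseline to cut variance
lr = 0.5

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

for _ in range(500):
    label = random.choice([0.0, 1.0])
    slots = [random.random(), label]           # slot 1 carries the label
    p1 = sigmoid(logit)
    attend = 1 if random.random() < p1 else 0  # sample the hard choice
    reward = 1.0 if round(slots[attend]) == label else 0.0
    # Score-function (REINFORCE) gradient of log p(attend) w.r.t. the logit.
    grad_logp = (1.0 - p1) if attend == 1 else -p1
    logit += lr * (reward - baseline) * grad_logp
    baseline += 0.05 * (reward - baseline)
```

The discrete sample blocks backpropagation, which is why a high-variance score-function estimator (tamed here by the baseline) is needed; the paper's point is that multi-sample variational objectives do this job better.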

The information bottleneck and geometric clustering

Title The information bottleneck and geometric clustering
Authors D J Strouse, David J Schwab
Abstract The information bottleneck (IB) approach to clustering takes a joint distribution $P\!\left(X,Y\right)$ and maps the data $X$ to cluster labels $T$ which retain maximal information about $Y$ (Tishby et al., 1999). This objective results in an algorithm that clusters data points based upon the similarity of their conditional distributions $P\!\left(Y\mid X\right)$. This is in contrast to classic “geometric clustering” algorithms such as $k$-means and Gaussian mixture models (GMMs) which take a set of observed data points $\left\{ \mathbf{x}_{i}\right\}_{i=1:N}$ and cluster them based upon their geometric (typically Euclidean) distance from one another. Here, we show how to use the deterministic information bottleneck (DIB) (Strouse and Schwab, 2017), a variant of IB, to perform geometric clustering, by choosing cluster labels that preserve information about data point location on a smoothed dataset. We also introduce a novel intuitive method to choose the number of clusters, via kinks in the information curve. We apply this approach to a variety of simple clustering problems, showing that DIB with our model selection procedure recovers the generative cluster labels. We also show that, for one simple case, DIB interpolates between the cluster boundaries of GMMs and $k$-means in the large data limit. Thus, our IB approach to clustering also provides an information-theoretic perspective on these classic algorithms.
Tasks Model Selection
Published 2017-12-27
URL http://arxiv.org/abs/1712.09657v1
PDF http://arxiv.org/pdf/1712.09657v1.pdf
PWC https://paperswithcode.com/paper/the-information-bottleneck-and-geometric
Repo
Framework
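The paper selects the number of clusters from kinks in the (DIB) information curve. As a rough stand-in — distortion in place of information, $k$-means in place of DIB — the sketch below finds the kink in a $k$-means distortion curve via second differences of its logarithm, on 1-D data with three well-separated groups. The quantile initialisation is our own simplification to keep the toy deterministic.

```python
import math
import random

random.seed(0)
data = ([random.gauss(0.0, 0.3) for _ in range(50)]
        + [random.gauss(5.0, 0.3) for _ in range(50)]
        + [random.gauss(10.0, 0.3) for _ in range(50)])

def kmeans_distortion(xs, k, iters=30):
    """1-D Lloyd's algorithm with deterministic quantile initialisation."""
    xs = sorted(xs)
    centers = [xs[int((i + 0.5) * len(xs) / k)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in xs:
            groups[min(range(k), key=lambda j: (x - centers[j]) ** 2)].append(x)
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return sum(min((x - c) ** 2 for c in centers) for x in xs)

ks = list(range(1, 7))
distortion = [kmeans_distortion(data, k) for k in ks]
# The kink: the k with the largest second difference of log-distortion.
logd = [math.log(d) for d in distortion]
kinks = [logd[i - 1] - 2.0 * logd[i] + logd[i + 1] for i in range(1, len(ks) - 1)]
best_k = ks[kinks.index(max(kinks)) + 1]
```

The distortion curve drops steeply until the true number of clusters and flattens afterwards, so the sharpest bend sits at $k=3$ — the same qualitative signal the paper reads off its information curve.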

A Recursive Bayesian Approach To Describe Retinal Vasculature Geometry

Title A Recursive Bayesian Approach To Describe Retinal Vasculature Geometry
Authors Fatmatulzehra Uslu, Anil Anthony Bharath
Abstract Demographic studies suggest that changes in the retinal vasculature geometry, especially in vessel width, are associated with the incidence or progression of eye-related or systemic diseases. To date, the main information source for width estimation from fundus images has been the intensity profile between vessel edges. However, there are many factors affecting the intensity profile: pathologies, the central light reflex and local illumination levels, to name a few. In this study, we introduce three information sources for width estimation. These are the probability profiles of vessel interior, centreline and edge locations generated by a deep network. The probability profiles provide direct access to vessel geometry and are used in the likelihood calculation for a Bayesian method, particle filtering. We also introduce a geometric model which can handle non-ideal conditions of the probability profiles. Our experiments conducted on the REVIEW dataset yielded consistent estimates of vessel width, even in cases when one of the vessel edges is difficult to identify. Moreover, our results suggest that the method is better than human observers at locating edges of low contrast vessels.
Tasks
Published 2017-11-28
URL http://arxiv.org/abs/1711.10521v1
PDF http://arxiv.org/pdf/1711.10521v1.pdf
PWC https://paperswithcode.com/paper/a-recursive-bayesian-approach-to-describe
Repo
Framework
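A minimal bootstrap particle filter, to show the recursive Bayesian machinery the paper builds on. In the paper the likelihood comes from deep-network probability profiles of vessel interior, centreline and edge locations; in this hedged toy the hidden state is an invented slowly drifting vessel width observed with Gaussian noise.

```python
import math
import random

random.seed(0)
n_particles = 500
true_width, obs_noise, proc_noise = 8.0, 1.0, 0.2

def gaussian_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

particles = [random.uniform(2.0, 15.0) for _ in range(n_particles)]
estimates = []
for _ in range(40):
    obs = true_width + random.gauss(0.0, obs_noise)       # noisy measurement
    # Predict: diffuse each particle with the process model.
    particles = [p + random.gauss(0.0, proc_noise) for p in particles]
    # Update: weight particles by the observation likelihood.
    weights = [gaussian_pdf(obs, p, obs_noise) for p in particles]
    total = sum(weights)
    weights = [w / total for w in weights]
    estimates.append(sum(w * p for w, p in zip(weights, particles)))
    # Resample so particles concentrate on plausible widths.
    particles = random.choices(particles, weights=weights, k=n_particles)
```

Swapping the Gaussian likelihood for a learned probability profile, as the paper does, changes only the `weights` line; the predict–update–resample loop is unchanged.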

Mining a Sub-Matrix of Maximal Sum

Title Mining a Sub-Matrix of Maximal Sum
Authors Vincent Branders, Pierre Schaus, Pierre Dupont
Abstract Biclustering techniques have been widely used to identify homogeneous subgroups within large data matrices, such as subsets of genes similarly expressed across subsets of patients. Mining a max-sum sub-matrix is a related but distinct problem for which one looks for a (not necessarily contiguous) rectangular sub-matrix with a maximal sum of its entries. Le Van et al. (Ranked Tiling, 2014) already illustrated its applicability to gene expression analysis and addressed it with a constraint programming (CP) approach combined with large neighborhood search (CP-LNS). In this work, we exhibit some key properties of this NP-hard problem and define a bounding function such that larger problems can be solved in reasonable time. Two different algorithms are proposed in order to exploit the highlighted characteristics of the problem: a CP approach with a global constraint (CPGC) and mixed integer linear programming (MILP). Practical experiments conducted both on synthetic and real gene expression data exhibit the characteristics of these approaches and their relative benefits over the original CP-LNS method. Overall, the CPGC approach tends to be the fastest to produce a good solution. Yet, the MILP formulation is arguably the easiest to formulate and can also be competitive.
Tasks
Published 2017-09-25
URL http://arxiv.org/abs/1709.08461v1
PDF http://arxiv.org/pdf/1709.08461v1.pdf
PWC https://paperswithcode.com/paper/mining-a-sub-matrix-of-maximal-sum
Repo
Framework
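The max-sum sub-matrix problem is easy to state even though solving it exactly is NP-hard. The heuristic below is not the paper's CPGC or MILP approach and carries no optimality guarantee; it merely makes the problem concrete by alternating between keeping the columns with positive sums over the selected rows and the rows with positive sums over the selected columns.

```python
# Toy matrix: the best (non-contiguous is allowed in general) sub-matrix
# here is rows {0, 1} x columns {0, 1}, with sum 14.
M = [
    [ 4,  3, -2],
    [ 2,  5, -1],
    [-3, -4,  6],
]

rows = list(range(len(M)))
cols = list(range(len(M[0])))
for _ in range(10):   # alternate until the selection stabilises
    cols = [j for j in range(len(M[0])) if sum(M[i][j] for i in rows) > 0]
    rows = [i for i in range(len(M)) if sum(M[i][j] for j in cols) > 0]

best_sum = sum(M[i][j] for i in rows for j in cols)
```

Each half-step can only keep or improve the objective, so the alternation converges quickly; escaping its local optima is exactly what the paper's exact CP and MILP formulations are for.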

The Effectiveness of Data Augmentation for Detection of Gastrointestinal Diseases from Endoscopical Images

Title The Effectiveness of Data Augmentation for Detection of Gastrointestinal Diseases from Endoscopical Images
Authors Andrea Asperti, Claudio Mastronardo
Abstract The lack, due to privacy concerns, of large public databases of medical pathologies is a well-known and major problem, substantially hindering the application of deep learning techniques in this field. In this article, we investigate the possibility of compensating for the scarcity of data by means of data augmentation techniques, working on the recent Kvasir dataset of endoscopical images of gastrointestinal diseases. The dataset comprises 4,000 color images, labeled and verified by medical endoscopists, covering a few common pathologies at different anatomical landmarks: Z-line, pylorus and cecum. We show how the application of data augmentation techniques allows us to achieve appreciable improvements in classification over previous approaches, in terms of both precision and recall.
Tasks Data Augmentation
Published 2017-12-11
URL http://arxiv.org/abs/1712.03689v1
PDF http://arxiv.org/pdf/1712.03689v1.pdf
PWC https://paperswithcode.com/paper/the-effectiveness-of-data-augmentation-for
Repo
Framework
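The specific augmentations used in the paper are not reproduced here; the sketch below only illustrates the general mechanism of label-preserving geometric transforms multiplying the effective training set, with each "image" a nested list.

```python
def hflip(img):
    """Mirror each row left-to-right."""
    return [row[::-1] for row in img]

def vflip(img):
    """Mirror the image top-to-bottom."""
    return img[::-1]

def rot90(img):
    """Rotate 90 degrees clockwise: reversed rows become columns."""
    return [list(row) for row in zip(*img[::-1])]

def augment(img):
    """Return the original image plus several transformed copies."""
    return [img, hflip(img), vflip(img), rot90(img), rot90(rot90(img))]

img = [[1, 2],
       [3, 4]]
augmented = augment(img)
```

One annotated image becomes five training samples at zero labeling cost, which is precisely the lever the paper pulls on a small medical dataset.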

Learning a Complete Image Indexing Pipeline

Title Learning a Complete Image Indexing Pipeline
Authors Himalaya Jain, Joaquin Zepeda, Patrick Pérez, Rémi Gribonval
Abstract To work at scale, a complete image indexing system comprises two components: an inverted file index to restrict the actual search to only a subset that should contain most of the items relevant to the query, and an approximate distance computation mechanism to rapidly scan these lists. While supervised deep learning has recently enabled improvements to the latter, the former continues to be based on unsupervised clustering in the literature. In this work, we propose a first system that learns both components within a unifying neural framework of structured binary encoding.
Tasks
Published 2017-12-12
URL http://arxiv.org/abs/1712.04480v1
PDF http://arxiv.org/pdf/1712.04480v1.pdf
PWC https://paperswithcode.com/paper/learning-a-complete-image-indexing-pipeline
Repo
Framework
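A miniature of the two-component pipeline the abstract describes, built here with a crude unlearned coarse quantiser and exact distances on the shortlist; the paper's contribution is precisely to learn both parts, so this is only a structural sketch.

```python
import random

random.seed(0)
dim, n_points, n_cells = 4, 200, 8

def dist2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

points = [[random.random() for _ in range(dim)] for _ in range(n_points)]
centroids = random.sample(points, n_cells)   # crude coarse quantiser

# Build the inverted file: cell id -> list of point ids.
inverted = {c: [] for c in range(n_cells)}
for i, p in enumerate(points):
    cell = min(range(n_cells), key=lambda c: dist2(p, centroids[c]))
    inverted[cell].append(i)

def search(query, n_probe=2):
    """Scan only the n_probe closest cells, then rank candidates exactly."""
    cells = sorted(range(n_cells), key=lambda c: dist2(query, centroids[c]))
    candidates = [i for c in cells[:n_probe] for i in inverted[c]]
    return min(candidates, key=lambda i: dist2(query, points[i]))

nearest = search(points[3])
```

Only a fraction of the database is scanned per query; replacing the random centroids with a learned, supervised quantiser is the step the paper takes.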

On Residual CNN in text-dependent speaker verification task

Title On Residual CNN in text-dependent speaker verification task
Authors Egor Malykh, Sergey Novoselov, Oleg Kudashev
Abstract Deep learning approaches are still not very common in the speaker verification field. We investigate the possibility of using a deep residual convolutional neural network with spectrograms as input features in the text-dependent speaker verification task. Although we were not able to surpass the baseline system in quality, we achieved quite good results for such a new approach, obtaining a 5.23% EER on the RSR2015 evaluation part. A fusion of the baseline and proposed systems outperformed the best individual system by 18% relative.
Tasks Speaker Verification, Text-Dependent Speaker Verification
Published 2017-05-29
URL http://arxiv.org/abs/1705.10134v2
PDF http://arxiv.org/pdf/1705.10134v2.pdf
PWC https://paperswithcode.com/paper/on-residual-cnn-in-text-dependent-speaker
Repo
Framework
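The paper's network details are not reproduced here; the sketch below only illustrates the residual connection such architectures rely on, y = x + F(x), in plain Python with a two-layer F. With zero-initialised weights the block is exactly the identity, which is why deep stacks of such blocks remain trainable.

```python
def relu(v):
    return [max(0.0, x) for x in v]

def linear(v, w, b):
    """Dense layer: w is a list of rows, b a bias vector."""
    return [sum(wi * xi for wi, xi in zip(row, v)) + bi
            for row, bi in zip(w, b)]

def residual_block(x, w1, b1, w2, b2):
    """y = x + W2 . relu(W1 . x + b1) + b2 -- the skip connection."""
    h = relu(linear(x, w1, b1))
    f = linear(h, w2, b2)
    return [xi + fi for xi, fi in zip(x, f)]

x = [1.0, -2.0, 3.0]
zeros_w = [[0.0] * 3 for _ in range(3)]
zeros_b = [0.0] * 3
y = residual_block(x, zeros_w, zeros_b, zeros_w, zeros_b)
```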

Skyline Identification in Multi-Armed Bandits

Title Skyline Identification in Multi-Armed Bandits
Authors Albert Cheu, Ravi Sundaram, Jonathan Ullman
Abstract We introduce a variant of the classical PAC multi-armed bandit problem. There is an ordered set of $n$ arms $A[1],\dots,A[n]$, each with some stochastic reward drawn from some unknown bounded distribution. The goal is to identify the $skyline$ of the set $A$, consisting of all arms $A[i]$ such that $A[i]$ has larger expected reward than all lower-numbered arms $A[1],\dots,A[i-1]$. We define a natural notion of an $\varepsilon$-approximate skyline and prove matching upper and lower bounds for identifying an $\varepsilon$-skyline. Specifically, we show that in order to identify an $\varepsilon$-skyline from among $n$ arms with probability $1-\delta$, $$ \Theta\bigg(\frac{n}{\varepsilon^2} \cdot \min\bigg\{ \log\bigg(\frac{1}{\varepsilon \delta}\bigg), \log\bigg(\frac{n}{\delta}\bigg) \bigg\} \bigg) $$ samples are necessary and sufficient. When $\varepsilon \gg 1/n$, our results improve over the naive algorithm, which draws enough samples to approximate the expected reward of every arm; the algorithm of (Auer et al., AISTATS’16) for Pareto-optimal arm identification is likewise superseded. Our results show that the sample complexity of the skyline problem lies strictly in between that of best arm identification (Even-Dar et al., COLT’02) and that of approximating the expected reward of every arm.
Tasks Multi-Armed Bandits
Published 2017-11-12
URL http://arxiv.org/abs/1711.04213v2
PDF http://arxiv.org/pdf/1711.04213v2.pdf
PWC https://paperswithcode.com/paper/skyline-identification-in-multi-armed-bandits
Repo
Framework
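Independent of the paper's sampling bounds, the skyline of a known mean vector is computed with a single running maximum; the helper below (our illustration, not from the paper) just makes the definition concrete. The paper's contribution is how few noisy samples suffice to identify this set when the means are unknown.

```python
def skyline(means):
    """Indices i whose mean strictly exceeds every lower-numbered mean."""
    out, best = [], float("-inf")
    for i, m in enumerate(means):
        if m > best:
            out.append(i)
            best = m
    return out

# Hypothetical expected rewards of five ordered arms.
arms = [0.3, 0.1, 0.5, 0.4, 0.9]
sky = skyline(arms)
```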