# Accurate Causal Inference on Discrete Data
Additive Noise Models (ANMs) provide a theoretically sound approach for inferring the most likely causal direction between pairs of random variables given only a sample from their joint distribution. The key assumption is that the effect is a function of the cause, with additive noise that is independent of the cause. In many cases ANMs are identifiable. Their performance, however, hinges on the chosen dependence measure, the assumptions made about the true distribution, and the sample size.
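
The ANM assumption described above can be written compactly. The symbols here are illustrative choices, not notation taken from this README: $X$ is the cause, $Y$ the effect, $f$ the function, and $N$ the noise.

```latex
Y = f(X) + N, \qquad N \perp\!\!\!\perp X
```

Roughly speaking, the model is identifiable when no function $g$ and noise $\tilde{N}$ with $\tilde{N} \perp\!\!\!\perp Y$ satisfy $X = g(Y) + \tilde{N}$ in the reverse direction.
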
In this paper we propose to use Shannon entropy to measure the dependence within an ANM, which gives us a general approach in which we neither have to assume a true distribution nor perform explicit significance tests during optimization. Moreover, through the Minimum Description Length principle, we show the direct connection between this ANM formulation and the more general Algorithmic Markov Condition (AMC). While practical instantiations of the AMC have so far not been known to be identifiable, we show that under certain adjustments using ANMs this is possible. Our information-theoretic formulation gives us a general, efficient, identifiable, and, as the experiments show, highly accurate method for causal inference on pairs of discrete variables, achieving (near) 100% accuracy on both synthetic and real-world data.
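
To make the entropy-based idea concrete, here is a minimal Python sketch. It is not the implementation from this repository. It assumes that a direction is scored as the entropy of the cause plus the entropy of the residual noise, and it fits the function with a simple per-value mode, which is a simplification chosen for brevity. All function names (`entropy`, `fit_mode_function`, `direction_score`, `infer_direction`) are hypothetical.

```python
from collections import Counter, defaultdict
from math import log2


def entropy(values):
    """Empirical Shannon entropy (in bits) of a sequence of discrete values."""
    counts = Counter(values)
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in counts.values())


def fit_mode_function(cause, effect):
    """Hypothetical fit: map each cause value to its most frequent effect value."""
    by_cause = defaultdict(Counter)
    for c, e in zip(cause, effect):
        by_cause[c][e] += 1
    return {c: es.most_common(1)[0][0] for c, es in by_cause.items()}


def direction_score(cause, effect):
    """Assumed score: H(cause) + H(effect - f(cause)); lower is better."""
    f = fit_mode_function(cause, effect)
    residuals = [e - f[c] for c, e in zip(cause, effect)]
    return entropy(cause) + entropy(residuals)


def infer_direction(x, y):
    """Return the direction with the smaller score, or 'undecided' on a tie."""
    score_xy = direction_score(x, y)
    score_yx = direction_score(y, x)
    if score_xy < score_yx:
        return "X -> Y"
    if score_yx < score_xy:
        return "Y -> X"
    return "undecided"


# Toy data generated as Y = X + N, with N independent of X.
x = [xv for xv in (0, 1, 2) for _ in range(3)]
noise = [0, 0, 1] * 3
y = [xv + nv for xv, nv in zip(x, noise)]
print(infer_direction(x, y))
```

On this toy sample the forward direction gets the lower score, because the residuals in the reverse direction still depend on the value of `y`. A practical method would search over functions more carefully than the mode-based fit used here.
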