sofiechan home

Kolmogorov Paranoia: Extraordinary Evidence Probably Isn't.

anon_toli said in #3282 3d ago:

I enjoyed this takedown of Scott Alexander's support for the COVID natural origins theory. Basically, Scott did a big "Bayesian" analysis of the evidence for and against the idea that COVID originated in the lab vs naturally. As per his usual pre-written conclusion in support of liberal hegemony, he concluded that it probably wasn't a lab leak. The problem is that his argument hinged on one extraordinarily large (in a technical sense) piece of evidence: a large "Bayes factor" (likelihood ratio) obtained by adding up the log likelihood ratios of a lot of early case location data points as if they were independent evidence. Along comes this guy Michael Weissman to point out that every other piece of evidence went the other way, and to explain why you can't just assume extraordinary evidence is what you think it is:

https://michaelweissman.substack.com/p/open-letter-to-scott-alexander

The basic concept is that when you run into apparently strong evidence with extraordinarily high power, especially when it comes from adding up many supposedly independent facts of the same type from the same source, you can't just assume the framing anymore. The power of the evidence demands that you dredge up any possible alternative explanation from an increasingly large universe of possible explanations. As the apparent power of the evidence grows, any particular explanation for it becomes increasingly questionable, and vanishingly unlikely a priori. As such, you actually have to strongly discount apparently large sample sizes as almost certainly non-independent.

A concrete example: someone comes to you and says that 99/100 experts surveyed agree on some fact (global warming, COVID origins, etc.). On the surface, this is presented as if it's a 10x larger sample size and extraordinarily stronger evidence than 9/10 experts agreeing on the same. If you naively take the sample size at face value, you are compelled to allow that immensely strong evidence to overwhelm all other common sense and convince you of the fact. After all, how probable is it that they are all in on the same conspiracy?

Fairly high, actually. Common sense will tell you that they probably all read the same papers, exist in the same social milieu, have similar cultural biases, got their opinions from their friends at the lab, etc. These are *not* independent samples. Intuitively you should count them as something more like 3-5 independent samples.
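To put a number on how lopsided the naive reading is, here is a minimal sketch. It assumes a toy model that is not in the post: each expert independently agrees with probability q if the fact is true and 1-q if it is false. The function name and q = 0.6 are hypothetical choices for illustration only.

```python
import math

def naive_log_odds(agree: int, total: int, q: float = 0.6) -> float:
    """Log likelihood ratio (in nats) for 'fact is true' vs 'fact is false',
    treating every expert as an independent sample.

    Assumed toy model: an expert agrees with probability q when the fact is
    true and with probability 1 - q when it is false.
    """
    if not 0.5 < q < 1.0:
        raise ValueError("q must be in (0.5, 1)")
    disagree = total - agree
    # Each agreeing expert adds log(q / (1 - q)); each dissenter subtracts it.
    return (agree - disagree) * math.log(q / (1.0 - q))

if __name__ == "__main__":
    for agree, total in ((9, 10), (99, 100)):
        nats = naive_log_odds(agree, total)
        print(f"{agree}/{total}: {nats:.1f} nats, odds ~ {math.exp(nats):.3g} : 1")
```

Under these assumptions the 9/10 survey is worth a few nats while the 99/100 survey is worth roughly forty, i.e. astronomically large odds from experts who are individually only slightly better than a coin flip. That gap is exactly what disappears once you stop treating them as independent.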

I was doing a statistics project the other day and wanted a technical operationalization of this kind of skepticism and a somewhat more rigorous foundation for it, to make the aggregation of many different forms of evidence robust to this kind of hidden non-independence. The heuristic I came up with is that effective sample size is N' = G*log(1+N/G), where G is the "gullibility factor" representing how large a sample size you will take at face value before starting to strongly question independence. For the experts example above, my gullibility factor is about 3 experts. Point being that you need to explicitly reason about and justify any nontrivially large G-factor with extraordinary arguments, and use N' instead of N for any sample size calculations (significance tests hardest hit etc).
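The heuristic N' = G*log(1+N/G) is easy to try out. A minimal sketch follows, assuming "log" means the natural log (the post doesn't say which base); the function name is mine, not from any library.

```python
import math

def effective_sample_size(n: float, g: float) -> float:
    """Discounted sample size N' = G * ln(1 + N/G).

    n: nominal sample size N (number of supposedly independent samples).
    g: gullibility factor G, the sample size taken roughly at face value.
    Assumption: natural log; another base only rescales the result.
    """
    if n < 0 or g <= 0:
        raise ValueError("need n >= 0 and g > 0")
    # log1p keeps precision when n is much smaller than g.
    return g * math.log1p(n / g)

if __name__ == "__main__":
    g = 3.0  # the post's gullibility factor for the experts example
    for n in (1, 3, 10, 100, 1000):
        print(f"N = {n:4d}  ->  N' = {effective_sample_size(n, g):.2f}")
```

With G = 3, ten experts count for about 4.4 effective samples and a hundred for about 10.6, so the 10x larger survey buys a bit more than 2x the evidence.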

The reason for the logarithmic functional form is that it tracks the Kolmogorov complexity (description length and thus prior improbability via Occam's razor) of hypotheses that produce that amount of evidence non-independently. "This much data with this bias" should in the limit be interpreted in proportion to *that* description length rather than the raw apparent entropy of the generated data. This "Kolmogorov paranoia" is a way to explicitly allow for that increasing space of possible non-independent explanations of the data without having to explicitly argue any particular likely alternative hypothesis (as per the size of that space, a priori there aren't any!) or explicitly model the non-independence.
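As a rough check on that functional form, here are the two limits of the heuristic N' = G*log(1+N/G), again assuming the natural log:

```latex
N' = G \ln\!\left(1 + \frac{N}{G}\right) \approx
\begin{cases}
N - \dfrac{N^{2}}{2G}, & N \ll G \\[6pt]
G \ln \dfrac{N}{G}, & N \gg G
\end{cases}
```

So small samples are taken almost at face value, while large ones grow only logarithmically in N. That is the same order as the number of bits it takes to write down the count N itself, which is roughly the extra description length a "produce N biased samples" hypothesis needs.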

Kolmogorov paranoia rigorously nukes the various bad arguments that depend on particular high-powered evidence or arguments but that go against common sense.
