Friday, April 6, 2007

First read: The IPCC WG II Summary for Policymakers

To quote Vinnie Barbarino: "I'm so confused."
Starting with the top of the second page. This Summary refers to a report that hasn't been released yet:
"A full consideration of observed climate change is provided in the IPCC Working Group I Fourth Assessment."

By the middle of page two we're off into language I didn't learn in high school statistics:
"That Assessment concluded that 'there is high confidence 3'"; dutifully following footnote 3 leads to:

"Endbox 2. Likelihood and confidence language

In this Summary for Policymakers, the following terms have been used to indicate: the assessed likelihood of an outcome or a result:

Virtually certain > 99% probability of occurrence, Extremely likely > 95%, Very likely > 90%, Likely > 66%, More likely than not > 50%, Very unlikely < 10%, Extremely unlikely < 5%.

The following terms have been used to express confidence in a statement:

Very high confidence At least a 9 out of 10 chance of being correct, High confidence About an 8 out of 10 chance, Medium confidence About a 5 out of 10 chance, Low confidence About a 2 out of 10 chance, Very low confidence Less than a 1 out of 10 chance."

Where is the 95% confidence interval? It's not a probability statement.

The confidence interval is the range where you expect something to be. By saying "expect" you leave open the possibility of being wrong. The degree of confidence measures the probability that the expectation is true.

The degree of confidence is linked to the width of the confidence interval. It's easy to be very confident that something will fall within a very wide range, and hard to be very confident about a narrow one. The amount of information (typically related to the sample size) also influences both the degree of confidence and the width of the interval. With more information you can be more confident that "the thing" will be within a given interval. Equally, with more information and the same degree of confidence, you can narrow the interval.

An example:
A survey is made in a given city. The question is: "Do you prefer Coke or Pepsi?" 60% answer Coke, and 40% answer Pepsi. So one estimate is that, in this city, 60% prefer Coke. Does it mean that 60% of the population in this city prefers Coke? No, unless the whole population answered the survey. However, you can be somewhat "confident" that the actual proportion of people choosing Coke will be within some interval around the 60% found in the sample. How confident? How wide is the interval?

If the survey is based on a sample of 100 people, you can be 90% confident that the actual proportion preferring Coke is between 52% and 68%. You can also be 99% confident that the actual proportion is between 48% and 72% (same sample size, more confidence, wider interval).
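For anyone who wants to check the arithmetic, here is a minimal sketch. It assumes the simple normal-approximation (Wald) interval for a proportion, which is my guess at how figures like these are usually worked out; the function name `wald_interval` is just something I made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def wald_interval(p_hat, n, confidence):
    """Normal-approximation interval for a sample proportion.

    Assumption: a simple random sample, large enough for the
    normal approximation to be reasonable.
    """
    # Two-sided critical value, e.g. about 1.645 for 90% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    # Standard error of the sample proportion.
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


for level in (0.90, 0.99):
    low, high = wald_interval(0.60, 100, level)
    print(f"{level:.0%} confident: {low:.1%} to {high:.1%}")
```

Under that assumption the 90% interval comes out at roughly 52% to 68%, and the 99% interval lands within about a percentage point of the figures quoted above. An exact binomial method would give slightly different endpoints.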

If the survey had used a sample of 1000 people instead of 100, you could be 90% confident that the actual proportion is between 57.5% and 62.5% (compare with 52% and 68% for the same confidence with a sample of 100: larger sample, narrower interval for the same degree of confidence). And you could be 99.99998% (let's say 100%?) confident that the actual proportion is between 52% and 68% (compare with a degree of confidence of 90% for the same interval with a sample of 100: larger sample, higher degree of confidence for the same interval).
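The question can also be turned around: fix the interval and ask how confident you can be. A rough sketch, again assuming the normal approximation for a sample proportion (the helper name `confidence_for_margin` is hypothetical, purely for illustration):

```python
from math import sqrt
from statistics import NormalDist


def confidence_for_margin(p_hat, n, margin):
    """Degree of confidence that the true proportion lies within
    p_hat +/- margin, under the normal approximation (an assumption)."""
    se = sqrt(p_hat * (1 - p_hat) / n)  # standard error of the proportion
    z = margin / se                     # half-width in standard errors
    return 2 * NormalDist().cdf(z) - 1  # two-sided coverage


# Same interval (60% +/- 8 points), two different sample sizes.
for n in (100, 1000):
    c = confidence_for_margin(0.60, n, 0.08)
    print(f"n = {n}: {c:.7%} confident")
```

With the sample of 100, an interval of plus or minus 8 points is about 1.6 standard errors wide on each side, which works out to roughly 90% confidence. With the sample of 1000 the same interval is more than 5 standard errors on each side, which is where the string of nines comes from.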

This Summary for Policymakers looks more like a political document than a scientific one.