**Introduction to Confidence Interval Estimates**

© 2015 Thomas G. Groleau (revised March 2018)

**Introduction**

Here is one of the most important concepts in statistics: **Interval
estimate = point estimate + margin of error**.

What's a point estimate? A "statistic" is something calculated from
sample data such as a sample mean or sample proportion. If we use that
value to estimate a characteristic of a population (also called a
parameter), then it becomes a **point estimate**.

For example, suppose I want to know the mean annual income of all CPAs in
the state of Ohio. If I gather a representative sample of 75 CPAs and find
their mean annual income is $65,730, then I shouldn't expect the mean
annual income of **all** CPAs to be exactly $65,730. Of all
the numbers in the universe, this is just the **point**
on the number line that I would use as my **estimate**. I
would hope that this estimate is 'close' to the parameter for all CPAs but
I need a **margin of error** to know how close. I also need
something called a confidence level, but we'll worry about that later.

**Estimating Proportions**

Read this article. It makes many data-based claims that could be discussed but we'll focus on the headline and the first two paragraphs.

The headline says "Almost Half of Americans Aren't Saving Nearly Enough". That sounds like a lot of people, but what do they mean by "almost half"? What is meant by "Americans"? We get the first answer in the first paragraph: "A reported 46 percent of the country's consumers store less than 5 percent of their annual incomes into longer-term savings." That's more specific. It's not just "almost half", it's 46%.

But how did they get that number? They didn't ask me. Did they ask you? If they didn't ask all "Americans" then how can they claim to know anything about all of them?

We get some answers in the second paragraph: "The study, which surveyed 1,000 adults living in the continental U.S." This answers a lot of questions. "Americans" in the headline is a poorly defined term. Anyone living in North, Central, or South America could legitimately claim to be an "American". This study restricts the term to "adults" (another undefined term) living in the continental U.S. (sorry Alaskans and Hawaiians).

The headline isn't necessarily wrong; it just isn't very specific. Now that we know a bit more about the study, we can make a much clearer statement: Of 1,000 adults surveyed, 46% claim to store less than 5% of their annual incomes in longer-term savings. Do you think it's reasonable to survey 1,000 U.S. adults and apply the results to ALL U.S. adults?

Read the following articles (Marijuana Use, Race Relations, Gluten-Free) and answer the following questions for each. You might need to dig into a link or two from the original article.

- What is the question? In other words, to get counted in the "yes" percent, what had to be true about the survey respondent?
- If there seem to be competing "yes" groups just select the one you are most interested in.
- How large was the sample (i.e. how many people were surveyed)?
- What population are they trying to apply the results to?
- Does the article say anything about a margin of error? If so, what is it?
- Do you think there are still ill-defined terms? If so, which ones?

Here are the answers for the savings article.

- The respondent had to store less than 5% of their annual income in longer-term savings.
- One thousand people.
- All adults in the United States.
- No mention of margin of error.
- In this case, no definition is provided for "adult" or "longer-term savings".

*See if you can answer these for the other articles before
reading further.*

Before we dig back into the first article or the other examples, let's look at a much simpler example. Suppose I have 150 poker chips. That's all I have. The entire universe (or population) is 150 poker chips. Of these, 69 are blue. That's 46%, almost half, of the entire population.

Now suppose that you don't know how many were blue. You just have the bag of chips and you don't have time to count them all. You're going to randomly select 20 of them. What percent do you think will be blue? (Hint: it will NOT be 46%.)

If you have time, you could get some poker chips and try it a few times. For now, we'll let a computer simulate what might happen. You can download the spreadsheet file here and run a few simulations. The file has VBA macros and will not work until you enable them. Watch this video before reading any further; you might want to see it before trying the simulation yourself.
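If you'd rather skip the macros, the same experiment takes only a few lines of Python. The 1,000 repetitions and the fixed random seed below are my own choices for illustration; the spreadsheet may use different settings:

```python
import random

# A bag of 150 chips, 69 of them blue (46%), matching the example above.
chips = ["blue"] * 69 + ["other"] * 81

random.seed(1)  # fixed seed so the run is repeatable

# Draw 20 chips without replacement, many times, recording the
# sample proportion of blue chips each time.
sample_props = []
for _ in range(1000):
    draw = random.sample(chips, 20)
    sample_props.append(draw.count("blue") / 20)

# The individual draws vary quite a bit, but the average of all the
# sample proportions should land close to the true 46%.
print(min(sample_props), max(sample_props))
print(sum(sample_props) / len(sample_props))
```

Run it a few times without the seed and notice that no single draw is guaranteed to hit 46%, even though the bag never changes.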

Note two things. First, the number of blue chips in the bag never
changed. Therefore the 46% was constant. This is called the population
proportion (notated *p* or *π* depending on the textbook).
Second, our answers varied. These were sample statistics, specifically
sample proportions (notated *p̄*). These *p̄*'s
are part of a sampling distribution. The **sampling distribution of
sample proportions** is the distribution of ALL possible *p̄*'s
when randomly selecting 20 out of 150 of these chips.

There are many places to read about sampling distribution theory but we
want to get to the main point as quickly as possible. The *p̄*'s
follow an approximate normal distribution which means that about 95% of
them will be within 2 standard deviations of the mean. If we know the mean
and standard deviation of that distribution, then we can predict a range
of values (an interval) where 95% of the *p̄*'s will
occur.

In this case, the mean of all *p̄*'s is 0.46 and the
standard deviation is 0.111 (rounded). This isn't magic. Something called
the Central Limit Theorem tells us that, under certain circumstances,
sampling distributions for *p̄*'s will always follow an
approximate normal distribution with

mean = *π*  and  standard deviation = √( *π*(1 − *π*) / *n* )
This standard deviation has a special name. It's called a "standard error". The sample statistic is used to estimate a population parameter. It's rare that this estimate will be exactly correct. In other words, there's usually an "error" in the estimate. This standard deviation measures the variability of that error. Thus the name "standard error".
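For the chip example, the standard error comes straight from the √(π(1 − π)/n) form of the Central Limit Theorem result; here it is in Python:

```python
import math

p = 69 / 150   # population proportion of blue chips (0.46)
n = 20         # chips drawn per sample

# Standard error of the sample proportion: sqrt(p * (1 - p) / n)
se = math.sqrt(p * (1 - p) / n)
print(round(se, 3))  # → 0.111
```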

Using the mean and standard deviation for this example, 95% of the *p̄*'s
*should* be between 0.242 and 0.678. Before we figure out where we
got those numbers, let's look back at the simulation spreadsheet. Look at
yours and see if this is close to what happened. Here's a screen shot from
mine showing that 96.6% of the *p̄*'s were between those
two values. That's pretty close to the predicted 95%.

Now let's see where those numbers came from. The empirical rule tells us that "about" 95% of values in a normal distribution land within 2 standard deviations of the mean. If we want to be more specific, it's really 1.96 standard deviations. Feel free to use 2 for quick calculation but we'll use 1.96 when a computer does the number crunching for us. This means our interval is calculated by:

*π* ± 1.96 × √( *π*(1 − *π*) / *n* )
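In code, the chip-bag interval works out like this:

```python
import math

p = 0.46   # population proportion of blue chips
n = 20     # chips drawn per sample

# Interval: p plus or minus 1.96 standard errors
se = math.sqrt(p * (1 - p) / n)
lower = p - 1.96 * se
upper = p + 1.96 * se
print(round(lower, 3), round(upper, 3))  # → 0.242 0.678
```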

The title of this document mentions "confidence interval estimates". We need to cover three points before we finally compute a confidence interval estimate.

**First Point**: the formulas above use *π*. In the
real world we won't know this value. Think about the articles you read. If
they already KNEW the proportion of a population then they wouldn't do a
survey to estimate it. Therefore, we're going to swap *p̄*
for *π* in the formulas. That seems kind of like cheating
but it works under the right conditions.

**Second Point**: the conditions. 1) Your data needs to be a
random sample. 2) The size of your sample should be less than 10% of the
entire population. 3) There should be at least 5 respondents in your "yes"
group and at least 5 in your "no" group.

**Third Point**: confidence. Since 95% of all *p̄*'s
should land in the interval above, we say that we're "95% confident" that
*π* (which is unknown) is in the interval we computed.

Finally, here it is: a 95% confidence interval estimate for a population
parameter *π* is

*p̄* ± 1.96 × √( *p̄*(1 − *p̄*) / *n* )
A 95% level of confidence is the most common level used. However, if you
want a different level of confidence, you would use different values for
1.96 (which, by the way, is called a **critical value**). A
computer will usually compute these intervals for us but here's a short
table showing common confidence levels and the appropriate critical
values.

| Confidence | Critical Value |
|------------|----------------|
| 90%        | 1.645          |
| 95%        | 1.960          |
| 99%        | 2.576          |
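These critical values don't require a table lookup; if you're curious, Python's standard library can produce them. This sketch uses `statistics.NormalDist`, splitting the leftover probability evenly between the two tails:

```python
from statistics import NormalDist

# The critical value for a confidence level is the z-score that leaves
# half of the remaining probability in each tail of the normal curve.
for confidence in (0.90, 0.95, 0.99):
    tail = (1 - confidence) / 2
    z = NormalDist().inv_cdf(1 - tail)
    print(f"{confidence:.0%} -> {z:.3f}")
```

Running it reproduces the three critical values in the table above.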

Now let's go back to the original article. Of 1,000 people, 46% met the
researchers' "yes" criterion: they put less than 5% of their annual income
into longer-term savings. Forty-six percent is a sample statistic, *p̄*,
not a population parameter. What does this tell us about ALL U.S. adults?

Based on this survey, we don't KNOW anything. However, based on this
survey *we are 95% confident that somewhere between 42.9% and 49.1%
of U.S. adults put less than 5% of their annual income in longer-term
savings*.

Before we look at the other article examples, let's examine a spreadsheet that will do these calculations for us. Here's what it looks like.

Go back to the marijuana, race, and gluten examples. Before reading any further, use the spreadsheet template to compute 95% confidence intervals for each one.

**More on Margin of Error**

Let's look back at the confidence interval formula:

*p̄* ± 1.96 × √( *p̄*(1 − *p̄*) / *n* )

The statistic, *p̄*, is a **point estimate**.
It's the center of our interval. What we add to and subtract from *p̄*,
namely 1.96 × √( *p̄*(1 − *p̄*) / *n* ), is called the **margin of error**.

This fits the generic formula for a confidence interval estimate that we
started with: **point estimate + margin of error**.
Without the margin of error (and confidence level) you have no idea how
good your estimate is.

Let's stick with 95% confidence for an example. Suppose you're trying to
figure out whether or not Candidate Jones will win an election. It
initially sounds pretty good if a poll returns a point estimate of 56%
voting for Jones. However, if the margin of error is 12% then we don't
know much. All we can say is that *we're 95% confident that somewhere
between 44% and 68% will vote for Jones* and that interval is too
wide to tell us much. In contrast, if the margin of error was 3% then *we'd
be 95% confident that Jones will get between 53% and 59% of the vote*.
That's a pretty good estimate.
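The arithmetic here is nothing more than point estimate ± margin of error; a tiny sketch:

```python
def interval(point, margin):
    """Turn a point estimate plus margin of error into an interval."""
    return round(point - margin, 2), round(point + margin, 2)

# Same 56% point estimate, two different margins of error:
print(interval(0.56, 0.12))  # → (0.44, 0.68): too wide to call the race
print(interval(0.56, 0.03))  # → (0.53, 0.59): Jones looks like a winner
```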

Anytime someone gives you a point estimate, you should ask for the margin of error. If you're reading an article about proportions that doesn't provide a margin of error, there's a formula for a quick approximation of the 95% margin of error: one divided by the square root of the sample size.

Let's check it for our savings article example. The properly calculated ME was .0309 and one divided by the square root of 1000 is .0316. The rough approximation is pretty close since both computations round to a 3% margin of error. Now try it for the other three examples.
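That comparison is easy to reproduce; here's the quick approximation next to the properly calculated margin for the savings article:

```python
import math

n = 1000

# Rough 95% margin of error: 1 / sqrt(n)
quick = 1 / math.sqrt(n)

# Properly calculated margin for the savings article (p-bar = 0.46)
proper = 1.96 * math.sqrt(0.46 * 0.54 / n)

print(round(quick, 4), round(proper, 4))  # → 0.0316 0.0309
```

The rough version is always slightly conservative (a touch too wide) because 1/√n is the margin you'd get if *p̄* were exactly 0.5.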

We need to deal with the relationship between Margin of Error, Confidence Level, and Sample Size but first, let's see how things change when the data is numeric instead of categorical (nominal).

**Estimating Means**

Rather than the headlines, let's look at some recent history. We'll look at the Car Allowance Rebate System (a.k.a. "cash for clunkers") from 2009. Since it was a government-funded program, the data on each transaction is considered public data and was made available in fall 2009. The initial release contained over 700,000 transactions. Later a revised set, along with a set of "cancelled" transactions, was released with over 677,000 transactions.

I've taken two random samples of around 500 transactions from Illinois, one sample from July 2009 sales and one from August 2009 sales. You can download the samples here.

Our goal is to estimate the mean MPG of all new cars purchased through
CARS in Illinois each month. Like any interval estimate, we have the form
**point estimate + margin of error**. Since we're
trying to estimate the population mean, *μ*, of ALL new cars, our point
estimate is the sample mean, *x̄*.
If we continue to follow the logic from proportions we would get the following 95% confidence interval formula:

*x̄* ± 1.96 × *σ* / √*n*
If we wanted other confidence levels, we would change the 1.96 critical
value just as with proportions. However, we have a small problem. This
formula uses *σ*, the population standard deviation. We aren't
likely to know this value in real life and we'd use the sample standard
deviation, *s*, as a substitute. With samples this large, we *could*
make the substitution of *s* for *σ* and get reasonable answers, but
statisticians use a standard adjustment instead.

Instead of following a normal distribution, the adjustment requires us to use something called "Student's t-distribution". I can't give you a small table of critical values because they depend on both the confidence level and the sample size. We'll skip the details other than to show you what the formula looks like:

*x̄* ± *t*(df) × *s* / √*n*,  where df = *n* − 1

Yes, that's kind of ugly. Fortunately you won't have to know it very
well. The "t" stands for t-distribution. The "df" stands for "degrees of
freedom" and is always one less than your sample size. The *critical t*
is computed from the confidence level. We use 1-*α* for confidence.
Thus 95% confidence means *α* = 5% and *α*/2 is 2.5%. If
we were looking up these critical values in a table, then we would need to
use all of that information. Instead, the computer will find them for us.

Before we get to the computer, let's partially try it ourselves. From the
spreadsheet find the sample mean and sample standard deviation for new
vehicle MPG for July (this variable is the last column in the Excel file).
For 95% confidence and a sample this large, the *t* value is 1.964
(as I said, it's close to the same values you'd get for a normal
distribution). Now try to compute the 95% confidence interval.
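If you want to check your arithmetic, here's the computation in Python. The mean and standard deviation below are made-up placeholders, not the actual July numbers; substitute the values you find in the spreadsheet:

```python
import math

# Hypothetical sample summary -- NOT the actual CARS data.
x_bar = 25.0    # sample mean MPG (made up for illustration)
s = 5.0         # sample standard deviation (made up)
n = 500         # sample size
t_crit = 1.964  # 95% critical t for a sample of about this size

# Margin of error: t * s / sqrt(n), then build the interval
margin = t_crit * s / math.sqrt(n)
print(f"{x_bar - margin:.2f} to {x_bar + margin:.2f}")  # → 24.56 to 25.44
```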

Now you can go back to the spreadsheet you downloaded for proportions and use it for means too. You may have noticed before that there's a "Menu" sheet with links so that you've got templates for both nominal data (proportions) and ratio data.

Use this spreadsheet template to compute a 95% confidence interval for the mean new vehicle MPG in August and compare it to what we found for July.