Confidence Intervals (Pt. 1)

Homer White, Georgetown College

In Part 1:

Load Packages

Always remember to make sure the necessary packages are loaded:


From Chance to Confidence

Before Flip, There are Chances

Say you are about to flip a fair coin.

  • It could turn out either Heads or Tails
  • \( P(\textbf{heads})=0.5 \)

After Flip, There is Confidence

Say you have already flipped the coin. It is hidden in your hand.

  • It could be Heads.
  • It could be Tails.
  • But there are no “chances” left.

But before the coin was flipped, it had a 50% chance to be Heads, so NOW

you are 50%-confident that it is Heads.

Before Sample: Chances

Suppose you are about take a simple random sample of “big enough” size \( n \) from a population.

68-95 Rule for Probability says:

There is about a 95% chance that \( \bar{x} \) will turn out to be within two \( SD(\bar{x}) \) of \( \mu \).

This means that:

There is about a 95% chance that \( \mu \) will lie within two \( SD(\bar{x}) \) of where \( \bar{x} \) turns out.

After Sample: Confidence

After the sampling, \( \mu \) either lies within two \( SD(\bar{x}) \) or it does not. (No “chances” left.)

But because of the chance beforehand:

You are about 95%-confident that \( \mu \) lies within two \( SD(\bar{x}) \) of the \( \bar{x} \) that you got.

After Sample: Confidence

Go out on a limb (replace \( SD(\bar{x}) \) with \( SE(\bar{x}) \)):

You are about 95%-confident that \( \mu \) lies within two \( SE(\bar{x}) \) of the \( \bar{x} \) that you got.

(This is part of the 68-95 Rule for Estimation.)

Intervals for One Mean

Fastest Speed Ever Driven


Focus on variable fastest.


\( \mu = \) mean fastest speed ever driven, for all GC students

Research Question

On average, how fast do GC students drive, when they drive their fastest?

Note that this question is equivalent to:

What is \( \mu \)?

A confidence interval for \( \mu \) will give a range of “reasonable” value of \( \mu \), based on the data at hand.

68-95 Rule for Estimation

Let's get a rough 95%-confidence interval for \( \mu \), using the 68-95 Rule:

  • We are about 95%-confident that \( \mu \) lies within two SE's of \( \bar{x} \).

Compute the Rough Interval

  mean    sd  n
 105.9 20.88 71

So \( \bar{x} = 105.9014 \).

We also need:

\[ SE(\bar{x})=\frac{s}{\sqrt{n}} \].

Compute the Rough Interval

So we use R as a calculator:

standError <- 20.8773/sqrt(71)
[1] 2.478

Compute the Rough Interval

Lower bound:

105.9014 - 2*standError
[1] 100.9

Upper bound:

105.9014 + 2*standError
[1] 110.9


“We are about 95%-confident that the mean fastest speed for all GC students is somewhere between 100.9 and 110.9 mph.”

Thought Question

  • Would it be reasonable for someone to believe, in the face of this data, that \( \mu=100 \)?
  • No, because:
    • the confidence interval is the range of values for \( \mu \) that are “reasonable”, based on the data, and
    • 100 lies outisde this interval!

A Not-So-Rough Interval

To get an interval that has closer to the desired level of confidence, we use ttestGC().

  • What is “t”?
  • What's this “test”?

We'll get to that later. For now, let's just do it!



The Output: Descriptive

Descriptive Results:

 variable  mean    sd  n
  fastest 105.9 20.88 71

The Output: Inferential

Inferential Results:

Estimate of mu:   105.9 
SE(   2.478 

More Inferential Output

95% Confidence Interval for mu:

lower.bound         upper.bound          
100.959833          110.842984

Not quite the same as the interval from the 68-95 Rule!

Other Confidence Levels

Want a 90%-confidence interval? No problem!


We get:

90% Confidence Interval for mu:

lower.bound         upper.bound          
101.771329          110.031488 

Confidence Level

  • The 90%-confidence interval was not as wide as the 95%-confidence interval
  • But we are less confident that \( \mu \) lies inside it!

What "95%-Confident" Means

If you sample repeatedly from the population and make a 95%-confidence interval each time, then

  • about 95% of the time, \( \mu \) will lie inside your interval
  • about 5% of the time, it won't!

(In other words, 95%-confidence interval are designed to be “wrong”, 5% of the time!)

Don't Believe It?

Try this app:


In Real Life ...

  • You take just one sample!
  • So you make just one 95%-confidence interval.
  • You don't know if \( \mu \) lies in the interval or not.
  • (But you are 95%-confident that \( \mu \) that it does!)

A Four-Step Procedure

  1. Define the parameter of interest
  2. Run the code, and perform a “safety check”
  3. Report the confidence interval and interpret it
  4. Use the interval to answer your Research Question

Safety Checks

  • Why?? (We'll learn later.)
  • They are the same as for the CLT to hold.
  • So for confidence interval for one mean \( \mu \), just need:
    • random sample from the population, AND
    • sample size \( n \geq \) 30.

If \( n < 30 \), check graph of sample for:

  • outliers
  • strong skewness

If you don't see these, you are probably OK.

Difference of Two Means

Research Question

Who drives faster, on average: GC males or GC females?

Step One: Parameters


\( \mu_1 \) = the mean fastest speed of all GC males.

\( \mu_2 \) = the mean fastest speed of all GC females.

We will make a confidence interval for \( \mu_1-\mu_2 \).

Step Two (1): Run Code

First, run the code:


(The argument first makes R treat populations in the same order you did when you defined the parameters.)

Step Two (2): Safety Check

Same as for CLT to hold:

  • Need to have taken two independent simple random samples from your two populations
    • (Also OK to have done a completely randomized experiment)
  • Both samples sizes must be at least 30
    • (If not, make graphs of sample(s), watching out for outliers and strong skewness)

Are We Safe?

Descriptive Results:

  group  mean    sd  n
   male 113.5 22.57 31
 female 100.0 17.61 40
  • both sample sizes are big enough
  • we (say) that these are like SRS's from the populations

So we are safe.

If You Had to ...

… you could graph the samples. Parallel boxplots are nice:

       main="Fastest Speed Driven, by Sex",
       ylab="speed (mph)")

plot of chunk unnamed-chunk-13

Step Three (1): Report Interval

Report it:

95% Confidence Interval for mu1-mu2:

lower.bound         upper.bound          
3.548586            23.254640

Step Three(2): Interpret the Interval

Based on the data at hand, we are 95%-confident that \( \mu_1-\mu_2 \) is somewhere between 3.55 and 23.25 mph.

Step Four: Conclusion

Use the interval to answer the Research Question:

If GC guys and GC gals drive equally fast on average, then

\[ \mu_1-\mu_2 =0 \]

  • But our confidence interval lies entirely above 0!
  • So we can be pretty sure that GC guys drive faster than GC gals, on average.

Mean of Differences

The Labels Experiment


Research Question:

Will people rate peanut butter more highly, on average, if it comes with a Jiff jar than if it comes in a Great Value jar?

Step One: Define Parameter


\( \mu_d = \) mean difference in rating (Jiff minus Great Value) for all Georgetown College students

Let's make a 95%-confidence interval for \( \mu_d \).

Step Two (1): The Code


Note the formula:

\[ \sim firstNumVar - secondNumVar \]

Step Two (2): Safety Check

Same as for CLT to hold:

  1. We need to have taken a SRS from the population;
  2. Sample size should be at least 30
    • If not, check graph of sample for skewness, outliers.

Are We Safe?

  1. We hope that the 30 subjects were like a simple random sample from the population.
  2. Sample size \( n \) is 30. Let's go ahead and check the sample for:
    • signs of strong skewness
    • extreme outliers

Making a Histogram

diff <- labels$jiffrating - labels$greatvaluerating
 min Q1 median Q3 max  mean   sd  n missing
  -5  1    2.5  4   8 2.367 2.81 30       0
  xlab="difference in ratings (jiff - gv)",
  main="Jiff vs. Great Value",

plot of chunk unnamed-chunk-18

A Judgement Call!

  • integer values (not continuous)
  • some left-skewness
  • maybe an outlier or two at smaller values

These could be a problem at smaller sample sizes, but probably OK here.

Time Out: Estimate and SE

Estimate of mu-d:   2.367 
SE(   0.513
  • We got \( \bar{d}=2.367 \).
  • This is more than 4 SEs above 0.
  • If labels make no difference (\( \mu_d=0 \)), these results would be very unusual!

Step Three (1): Report Interval

95%-confidence interval for \( \mu_d \) is:

lower         upper          
1.317442      3.415892 

Step Three (2): Interpret Interval

We are 95%-confident that the mean difference in ratings for all GC students (if they could all have been in this experiment), is somewhere between 1.3 and 3.4 points.

Step Four: Conclusion

  • If the label made no difference on average, then \( \mu_d \) would be 0.
  • But our interval lies above 0.
  • So we are pretty sure that GC students would rate Jiff-labeled peanut butter more highly than they rate GV-labeled peanut butter. (Even though they would tasting the same peanut butter!)

Want Less Output?

Sometimes you don't want quite so much output to the console. To get just the confidence interval:

set argument verbose to FALSE

For example:


Handling Summary Data

Summary From One Sample

Suppose that a random sample of size 36 is taken from a population. The sample mean is 50 and the sample SD is 4.

Find a 95% confidence interval for \( \mu \), the mean of the population.

The Code and Results


We get:

Inferential Procedures for One Mean mu:
95% Confidence Interval for mu:

lower.bound         upper.bound          
48.646595           51.353405

Summary From Two Samples

Suppose we have taken two indpendent random sample from two populations and we know:

Group \( \bar{x} \) \( s \) \( n \)
Group One 45 5 30
Group Two 40 6 20

Find a 95%-confidence interval for the difference in the means of the two populations.

Code and Results


We get:

Inferential Procedures for the Difference of Two Means mu1-mu2:
  (Welch's Approximation Used for Degrees of Freedom)
    Results from summary data.
95% Confidence Interval for mu1-mu2:

          lower.bound         upper.bound          
          1.707811            8.292189 


With summary data, you cannot check the samples for outliers and skewness!