Homer White, Georgetown College

Always remember to make sure the necessary packages are loaded:

```
require(mosaic)
require(tigerstats)
```

We are using the \( t \)-statistic to:

- build confidence intervals for \( \mu \), and
- get \( P \)-values in tests about \( \mu \).

If

- you sample \( n \) items randomly from a population, and
- that population is normally distributed,

then the \( t \)-statistic

\[ t=\frac{\bar{x}-\mu}{SE(\bar{x})} \]

has a \( t(n-1) \)-distribution, with “degrees of freedom” equal to \( n-1 \).
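As a quick sketch of the formula (simulated data, not from these notes), the hand computation agrees with what `t.test()` reports:

```
set.seed(123)
x <- rnorm(16, mean = 70, sd = 3)       # n = 16 from a normal population
SE <- sd(x) / sqrt(length(x))           # SE(x-bar) = s / sqrt(n)
t_byhand <- (mean(x) - 70) / SE         # t-statistic, testing mu = 70
t_byhand
unname(t.test(x, mu = 70)$statistic)    # same value
```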

```
require(manipulate)
tExplore()
```

Notice: when sample size is large:

\[ t(n-1) \approx norm(0,1) \]
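You can see the approximation in the quantiles: as the degrees of freedom grow, the \( t \)-quantiles settle down onto the standard-normal quantile.

```
qt(0.975, df = 4)     # 2.776  (small sample: heavier tails)
qt(0.975, df = 29)    # 2.045
qt(0.975, df = 999)   # 1.962
qnorm(0.975)          # 1.960  (the limiting value)
```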

What if the population is not normally distributed?

(It hardly ever is.)

If sample size \( n \) is large, \( t(n-1) \approx norm(0,1) \).

Also, CLT says

\[ \frac{\bar{x}-\mu}{SD(\bar{x})} \approx norm(0,1) \]

So

\[ t=\frac{\bar{x}-\mu}{SE(\bar{x})} \approx \frac{\bar{x}-\mu}{SD(\bar{x})} \approx norm(0,1) \]

Since:

- \( t \approx norm(0,1) \), and
- \( t(n-1) \approx norm(0,1) \),

we conclude that

- \( t \approx t(n-1) \)

So if \( n \) is large, it's OK to use \( t \)-curves to:

- make confidence intervals for \( \mu \)
- approximate \( P \)-values in tests about \( \mu \)

… no matter what the population looks like!
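A simulation sketch of that claim, using a decidedly non-normal (exponential, hence skewed) population; the setup and names here are my own illustration:

```
set.seed(321)
mu <- 1   # true mean of an Exp(rate = 1) population
covered <- replicate(2000, {
  x <- rexp(100, rate = 1)     # large sample: n = 100
  ci <- t.test(x)$conf.int     # 95% t-interval
  ci[1] < mu && mu < ci[2]
})
mean(covered)   # lands close to 0.95, despite the skewness
```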

If sample size \( n \) is small and

- population is skewed or
- has an “outlier” group,

then there could be problems.

```
require(manipulate)
tSampler(~income,data=imagpop)
```

When \( n \) is not large (\( n < 30 \), say), we check the sample.

- Make a histogram of the sample
- or a density plot of the sample,
- or a box-and-whisker plot of the sample.

If the sample shows skewness or outliers, then probably the *population* has these features, too.
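For instance, the checks above might look like this (a sketch on a small simulated skewed sample, not data from the notes; the lattice-style plotting functions are loaded along with `mosaic`):

```
require(mosaic)
set.seed(77)
x <- rexp(20, rate = 1)   # a small, skewed sample (illustration only)
histogram(~ x)            # look for skewness
densityplot(~ x)
bwplot(~ x)               # outliers show up as isolated points
```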

Then `ttestGC()` might not be reliable.

Recall:

95%-confidence intervals for \( \mu \) will fail to contain \( \mu \) about 5% of the time, in repeated sampling.

In general,

\( 100(1-\alpha)\% \)-confidence intervals will fail to contain \( \mu \) about \( 100\alpha\% \) of the time, in repeated sampling.

Tests of significance do not always make the “right” decision, either!

 | \( H_0 \) true | \( H_0 \) false |
---|---|---|
Reject \( H_0 \) | Type-I Error | OK |
Not reject \( H_0 \) | OK | Type-II Error |

- Type-I Error: Rejecting the Null, when it is actually true.
- Type-II Error: Failing to reject the Null, when it is actually false.

If

- \( H_0 \) is actually true, and
- your cut-off value \( \alpha \) is set at 0.05

then a trustworthy test of significance will commit a Type-I error about 5% of the time, in repeated sampling!

In general, if

- \( H_0 \) is actually true, and
- your cut-off value is \( \alpha \)

then a trustworthy test of significance will commit a Type-I error about \( 100\alpha\% \) of the time, in repeated sampling!
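A simulation sketch of that rate (my own setup, not from the notes): run many \( t \)-tests of \( H_0: \mu = 50 \) on samples where \( H_0 \) really is true, and count the rejections.

```
set.seed(456)
alpha <- 0.05
# 2000 tests of H0: mu = 50, with H0 actually true each time:
pvals <- replicate(2000, t.test(rnorm(25, mean = 50), mu = 50)$p.value)
mean(pvals < alpha)   # close to 0.05
```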

Try this app:

```
require(manipulate)
Type12Errors()
```

To cut down on the chance of a Type-I error:

- set cut-off \( \alpha \) low.
- But then if \( H_0 \) is false, Type-II errors become more likely!

To cut down on the chance of a Type-II error:

- set cut-off \( \alpha \) high.
- But then if \( H_0 \) is true, Type-I errors become more likely!

The only way to make

- Type-I errors unlikely, and
- Type-II errors unlikely

at the same time is to

- set \( \alpha \) very low, and
- take a really large sample!
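Base R's `power.t.test()` puts numbers on this trade-off. (Power is \( 1 - P(\text{Type-II error}) \); the effect size `delta = 0.5` below is an arbitrary choice for illustration.)

```
p1 <- power.t.test(n = 20, delta = 0.5, sd = 1, sig.level = 0.05,
                   type = "one.sample")$power
p2 <- power.t.test(n = 20, delta = 0.5, sd = 1, sig.level = 0.01,
                   type = "one.sample")$power   # lower alpha: less power
p3 <- power.t.test(n = 60, delta = 0.5, sd = 1, sig.level = 0.01,
                   type = "one.sample")$power   # bigger n restores power
c(p1, p2, p3)
```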

Large samples are expensive and time-consuming.

A test of significance is like a criminal trial in which only the prosecution presents evidence.

Test | Trial |
---|---|
Null Hypothesis | Defendant's “Not Guilty” Plea |
Alternative Hypothesis | Prosecution: “He's Guilty” |
Parameter (unknown) | The Truth (we'll never know for sure) |
Test Statistic | Prosecutor's Evidence |
P-value | Prosecutor's Closing Argument |
Decision about \( H_0 \) | Jury's Verdict |
Type-I Error | Innocent Person Convicted |
Type-II Error | Guilty Person Acquitted |