Tests of Significance (Pt. 3)

Homer White, Georgetown College

In Part 3:

  • Safety Checks
  • Types of Error
  • The Criminal Trial Analogy

Load Packages

Always remember to make sure the necessary packages are loaded:

require(mosaic)
require(tigerstats)

Safety Checks

t-statistic Distribution

We are using the \( t \)-statistic to:

  • build confidence intervals for \( \mu \), and
  • get \( P \)-values in tests about \( \mu \).

Statistical Theory Says:

If

  • you sample \( n \) items randomly from a population, and
  • that population is normally distributed,

then the t-statistic

\[ t=\frac{\bar{x}-\mu}{SE(\bar{x})} \]

has a \( t(n-1) \)-distribution, with “degrees of freedom” equal to \( n-1 \).
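
If you would like to see this fact in action, here is a minimal simulation sketch (base R only; the population mean, SD, sample size, and number of repetitions are made up for illustration):

set.seed(1010)
mu <- 70; sigma <- 3      # made-up population mean and SD
n <- 10                   # a small sample size
tstats <- replicate(5000, {
  samp <- rnorm(n, mean = mu, sd = sigma)
  (mean(samp) - mu) / (sd(samp) / sqrt(n))   # t = (x-bar - mu)/SE(x-bar)
})
hist(tstats, freq = FALSE, breaks = 50)        # simulated t-statistics ...
curve(dt(x, df = n - 1), add = TRUE, lwd = 2)  # ... track the t(n-1) density curve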

Reminder about t-Curves

require(manipulate)
tExplore()

Notice: when sample size is large:

\[ t(n-1) \approx norm(0,1) \]
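
You can also see this by plotting \( t \)-curves over the standard normal curve (a quick base-R sketch; the degrees of freedom shown are just examples):

curve(dnorm(x), from = -4, to = 4, lwd = 2, ylab = "density")  # standard normal curve
curve(dt(x, df = 29), add = TRUE, lty = 2)   # t(29): nearly indistinguishable from normal
curve(dt(x, df = 4), add = TRUE, lty = 3)    # t(4): noticeably heavier tails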

But ...

What if the population is not normally distributed?

(It hardly ever is.)

Well ...

If sample size \( n \) is large, \( t(n-1) \approx norm(0,1) \).

Also, CLT says

\[ \frac{\bar{x}-\mu}{SD(\bar{x})} \approx norm(0,1) \]

So

\[ t=\frac{\bar{x}-\mu}{SE(\bar{x})} \approx \frac{\bar{x}-\mu}{SD(\bar{x})} \approx norm(0,1) \]

Since:

  • \( t \approx norm(0,1) \), and
  • \( t(n-1) \approx norm(0,1) \),

we conclude that

  • \( t \approx t(n-1) \)

So if \( n \) is large, it's OK to use \( t \)-curves to:

  • make confidence intervals for \( \mu \)
  • approximate \( P \)-values in tests about \( \mu \)

… no matter what the population looks like!
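
As a rough check on this claim, you could simulate large samples from a clearly skewed population and verify that the \( t \)-statistic still lands beyond the \( t \)-critical values about as often as the curve predicts (a sketch; the exponential population and sample size are invented for illustration):

set.seed(2020)
n <- 100     # a fairly large sample size
mu <- 1      # mean of an exponential(rate = 1) population, which is very skewed
tstats <- replicate(5000, {
  samp <- rexp(n, rate = 1)
  (mean(samp) - mu) / (sd(samp) / sqrt(n))
})
mean(abs(tstats) > qt(0.975, df = n - 1))   # should be close to 0.05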

But What if n is Not Large?

If sample size \( n \) is small and

  • population is skewed or
  • has an “outlier” group,

then there could be problems.

require(manipulate)
tSampler(~income,data=imagpop)

So ...

When \( n \) is not large (\( n < 30 \), say), we check the sample.

  • Make a histogram of the sample
  • or a density plot of the sample,
  • or a box-and-whisker plot of the sample.

If the sample shows skewness or outliers, then probably the population has these features, too.

Then ttestGC() might not be reliable.
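
For example, quick plots like these reveal obvious skewness or outliers in a small sample (a base-R sketch; samp is a hypothetical sample drawn from a made-up skewed population):

set.seed(3030)
samp <- rexp(15, rate = 1/10)     # hypothetical small sample from a skewed population
hist(samp)                        # histogram of the sample
plot(density(samp))               # density plot of the sample
boxplot(samp, horizontal = TRUE)  # box-and-whisker plot of the sample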

Types of Error

Designed to be Wrong (Sometimes)

Recall:

95%-confidence intervals for \( \mu \) will fail to contain \( \mu \) about 5% of the time, in repeated sampling.

In general,

\( 100(1-\alpha)\% \)-confidence intervals will fail to contain \( \mu \) about \( 100\alpha\% \) of the time, in repeated sampling.
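
A quick simulation makes this concrete (a sketch; the population and number of repetitions are arbitrary). Each repetition draws a sample, builds a 95%-confidence interval with t.test(), and records whether the interval caught \( \mu \):

set.seed(4040)
mu <- 70; sigma <- 3; n <- 25
covered <- replicate(10000, {
  samp <- rnorm(n, mean = mu, sd = sigma)
  ci <- t.test(samp, conf.level = 0.95)$conf.int
  ci[1] < mu & mu < ci[2]        # TRUE if the interval contains mu
})
mean(covered)                    # close to 0.95, so about 5% of the intervals miss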

Designed to be Wrong

Tests of significance do not always make the “right” decision, either!

Error Types

                           \( H_0 \) true    \( H_0 \) false
Reject \( H_0 \)           Type-I Error      OK
Not reject \( H_0 \)       OK                Type-II Error

  • Type-I Error: Rejecting the Null, when it is actually true.
  • Type-II Error: Failing to reject the Null, when it is actually false.

Designed to be Wrong (Sometimes)

If

  • \( H_0 \) is actually true, and
  • your cut-off value \( \alpha \) is set at 0.05

then a trustworthy test of significance will commit a Type-I error about 5% of the time, in repeated sampling!

Designed to be Wrong (Sometimes)

In general, if

  • \( H_0 \) is actually true, and
  • your cut-off value is \( \alpha \)

then a trustworthy test of significance will commit a Type-I error about \( 100\alpha\% \) of the time, in repeated sampling!

Don't Believe it?

Then try this app:

require(manipulate)
Type12Errors()
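
Or check it with a quick simulation (a sketch; here the Null happens to be exactly true, and \( \alpha \) is set at 0.05):

set.seed(5050)
mu0 <- 70; sigma <- 3; n <- 25; alpha <- 0.05
pvals <- replicate(10000, {
  samp <- rnorm(n, mean = mu0, sd = sigma)   # H_0: mu = 70 is actually true
  t.test(samp, mu = mu0)$p.value
})
mean(pvals < alpha)            # proportion of Type-I errors: close to alpha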

Life is Hard ...

To cut down on the chance of a Type-I error:

  • set cut-off \( \alpha \) low.
  • But then if \( H_0 \) is false, Type-II errors become more likely!

To cut down on the chance of Type-II errors:

  • set cut-off \( \alpha \) high.
  • But then if \( H_0 \) is true, Type-I errors become more likely!

... and then You Die

The only way to make

  • Type-I errors unlikely, and
  • Type-II errors unlikely

at the same time is to

  • set \( \alpha \) very low, and
  • take a really large sample!

Large samples are expensive and time-consuming.
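
Base R's power.t.test() shows the trade-off numerically. With \( \alpha \) fixed at a low value, the only way to make Type-II errors unlikely (that is, to raise the power) is to take a larger sample (the effect size and SD below are invented for illustration):

# power = 1 - P(Type-II error) for detecting a 1-unit shift in the mean when SD = 3
power.t.test(n = 20,  delta = 1, sd = 3, sig.level = 0.01, type = "one.sample")$power
power.t.test(n = 200, delta = 1, sd = 3, sig.level = 0.01, type = "one.sample")$power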

The Criminal Trial Analogy

A test of significance is like a criminal trial in which only the prosecution presents evidence.

The Criminal Trial Analogy

Test                          Trial
Null Hypothesis               Defendant's “Not Guilty” Plea
Alternative Hypothesis        Prosecution: “He's Guilty”
Parameter (unknown)           The Truth (we'll never know for sure)
Test Statistic                Prosecutor's Evidence
P-value                       Prosecutor's Closing Argument
Decision about \( H_0 \)      Jury's Verdict
Type-I Error                  Innocent Person Convicted
Type-II Error                 Guilty Person Acquitted