Two Factor Variables (Pt. 3)

Homer White, Georgetown College

In Part 3:

Load Packages

Always remember to make sure the necessary packages are loaded:

require(mosaic)
require(tigerstats)

Simpson's Paradox

The Death Penalty Data

data(deathpen)
View(deathpen)
help(deathpen)

For more information:

Michael J. Radelet: “Racial Characteristics and the Imposition of the Death Penalty”, American Sociological Review, 46 (1981).

Research Question:

Are black defendants more likely than white defendants to receive the death penalty?

Tables of Results

DefD <- xtabs(~defrace+death,data=deathpen)
DefD
       death
defrace  no yes
  black 149  17
  white 141  19
rowPerc(DefD)
         no   yes Total
black 89.76 10.24   100
white 88.12 11.88   100

Hmm: in the combined data, white defendants are a bit more likely to get the death penalty (11.88% vs. 10.24%)!
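For readers curious about where the row percents come from, they can be reproduced in base R. This is a minimal sketch, assuming the counts from the DefD table above are typed in by hand. The name DefD_counts is made up for illustration, and this is not necessarily how rowPerc() is implemented inside tigerstats.

```r
# Counts copied from the DefD table above
# (rows: defendant race, columns: death sentence)
DefD_counts <- matrix(c(149, 17,
                        141, 19),
                      nrow = 2, byrow = TRUE,
                      dimnames = list(defrace = c("black", "white"),
                                      death   = c("no", "yes")))

# prop.table() with margin = 1 divides each cell by its row total
row_pct <- round(100 * prop.table(DefD_counts, margin = 1), 2)
row_pct
```

Each row is divided by its own total, so the percents describe the outcome for defendants of a given race.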

Lurking Variable

There is a third variable, the race of the victim:

str(deathpen)
'data.frame':   326 obs. of  3 variables:
 $ defrace: Factor w/ 2 levels "black","white": 2 2 2 2 2 2 2 2 2 2 ...
 $ vicrace: Factor w/ 2 levels "black","white": 2 2 2 2 2 2 2 2 2 2 ...
 $ death  : Factor w/ 2 levels "no","yes": 2 2 2 2 2 2 2 2 2 2 ...

If we take it into account, it may alter our interpretation of the results.

Subsetting

Let's break the data down by the values of the lurking variable vicrace:

deathpenWV <- subset(deathpen,
              vicrace=="white")
deathpenBV <- subset(deathpen,
              vicrace=="black")

When Victim is White

Two-way table:

DefDWV <- xtabs(
  ~defrace+death,
  data=deathpenWV)
DefDWV
       death
defrace  no yes
  black  52  11
  white 132  19

Then row percents:

rowPerc(DefDWV)
         no   yes Total
black 82.54 17.46   100
white 87.42 12.58   100

Black defendants are MORE likely to get the death penalty!

When Victim is Black

Two-way table:

DefDBV <- xtabs(
  ~defrace+death,
  data=deathpenBV)
DefDBV
       death
defrace no yes
  black 97   6
  white  9   0

Then row percents:

rowPerc(DefDBV)
          no  yes Total
black  94.17 5.83   100
white 100.00 0.00   100

Black defendants are MORE likely to get the death penalty here, too!
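The two subset tables can also be read off a single three-way table. The sketch below assumes the eight cell counts reported in the two tables above, entered by hand rather than loaded from deathpen, so that it runs without any packages. The names cells, tab3 and pct_by_victim are made up for illustration.

```r
# One row per combination of defendant race, outcome, and victim race.
# expand.grid() varies the first variable fastest.
cells <- expand.grid(defrace = c("black", "white"),
                     death   = c("no", "yes"),
                     vicrace = c("black", "white"))

# Counts copied from the black-victim and white-victim tables above,
# in the same order as the rows of `cells`
cells$Freq <- c(97, 9, 6, 0,      # victim black
                52, 132, 11, 19)  # victim white

# Three-way table: defrace x death x vicrace
tab3 <- xtabs(Freq ~ defrace + death + vicrace, data = cells)

tab3[, , "white"]   # counts when the victim is white
tab3[, , "black"]   # counts when the victim is black

# Row percents inside each victim-race layer:
# divide by the total for each (defrace, vicrace) pair
pct_by_victim <- round(100 * prop.table(tab3, margin = c(1, 3)), 2)
pct_by_victim
```

With the real data frame, the same table would presumably come from xtabs(~defrace+death+vicrace, data=deathpen), which avoids creating two separate subsets.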

Bizarre!

  • When the victim is white, the black defendant is MORE likely to get the death penalty.
  • When the victim is black, the black defendant is MORE likely to get the death penalty.
  • But in the combined data, the black defendant is LESS likely to get the death penalty.

Simpson's Paradox

Simpson's Paradox occurs when a relationship between variables \( X \) and \( Y \) reverses its direction when the data are broken down by the values of a third variable \( Z \).
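The definition can be turned into a small check. This is only a sketch under stated assumptions: \( X \) is defendant race, \( Y \) is the death sentence, \( Z \) is victim race, and the counts are the ones from the tables above, typed in by hand. The helpers rate_gap() and shows_reversal() are hypothetical names, not functions from mosaic or tigerstats.

```r
# Counts as a 2 x 2 x 2 array: defrace x death x vicrace.
# array() fills the first dimension fastest.
counts <- array(c(97, 9, 6, 0,      # victim black
                  52, 132, 11, 19), # victim white
                dim = c(2, 2, 2),
                dimnames = list(defrace = c("black", "white"),
                                death   = c("no", "yes"),
                                vicrace = c("black", "white")))

# Hypothetical helper: difference in "yes" rates (row 1 minus row 2)
# for a 2 x 2 table of counts
rate_gap <- function(tab2) {
  rates <- tab2[, 2] / rowSums(tab2)
  unname(rates[1] - rates[2])
}

# Hypothetical helper: TRUE when every layer of Z shows a gap whose
# sign is opposite to the gap in the combined table
shows_reversal <- function(tab3) {
  combined <- rate_gap(apply(tab3, c(1, 2), sum))  # collapse over Z
  within   <- apply(tab3, 3, rate_gap)             # one gap per level of Z
  combined != 0 && all(sign(within) == -sign(combined))
}

combined_gap <- rate_gap(apply(counts, c(1, 2), sum))
within_gaps  <- apply(counts, 3, rate_gap)
reversal     <- shows_reversal(counts)
reversal
```

Here the combined gap is negative (black defendants have the lower rate overall) while both within-layer gaps are positive, which is exactly the reversal the definition describes.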

How Can It Happen?

  • Simpson's Paradox is mathematically possible.
  • But can we explain in a satisfying way why it occurs, when it occurs?

Strategy to Explain It

  • study how \( X \) relates to \( Z \)
  • study how \( Z \) relates to \( Y \)
  • try to synthesize the results of these two studies

Defendant Race and Victim Race

Two-way table:

DefVic <- xtabs(
  ~defrace+vicrace,
  data=deathpen)
DefVic
       vicrace
defrace black white
  black   103    63
  white     9   151

Then row percents:

rowPerc(DefVic)
      black white Total
black 62.05 37.95   100
white  5.62 94.38   100

People tend to kill people of their own race: 62.05% of black defendants had black victims, and 94.38% of white defendants had white victims.

Victim Race and Outcome

Two-way table:

VicDeath <- xtabs(
  ~vicrace+death,
  data=deathpen)
VicDeath
       death
vicrace  no yes
  black 106   6
  white 184  30

Then row percents:

rowPerc(VicDeath)
         no   yes Total
black 94.64  5.36   100
white 85.98 14.02   100

Death sentences were far more common when the victim was white (14.02% vs. 5.36%). The (mostly white) juries got really angry when a white person was killed!

Synthesis

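Regardless of the race of the victim, white defendants are less likely to get the death penalty. But they “hamstring” themselves by killing mostly white victims, which is what got juries really mad, back in the day! Because 94.38% of white defendants had white victims, their overall rate is pulled toward the higher white-victim rate, and that is enough to reverse the comparison in the combined data.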
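The synthesis can be checked with a little arithmetic: each group's overall death-penalty rate is a weighted average of its rates for black and white victims, with weights given by how often that group had each kind of victim. The sketch below assumes the cell counts from the tables in this part, entered by hand. The names counts, n_def, rate, weight and overall are made up for illustration.

```r
# Counts as a 2 x 2 x 2 array: defrace x death x vicrace
counts <- array(c(97, 9, 6, 0,      # victim black
                  52, 132, 11, 19), # victim white
                dim = c(2, 2, 2),
                dimnames = list(defrace = c("black", "white"),
                                death   = c("no", "yes"),
                                vicrace = c("black", "white")))

# Number of defendants in each defrace x vicrace cell
n_def <- apply(counts, c(1, 3), sum)

# Death-penalty rate within each defrace x vicrace cell
rate <- counts[, "yes", ] / n_def

# Share of each defendant group's cases by victim race (rows sum to 1)
weight <- n_def / rowSums(n_def)

# Overall rate per defendant group = weighted average of the cell rates
overall <- rowSums(rate * weight)

round(100 * rate, 2)
round(100 * weight, 2)
round(100 * overall, 2)
```

Black defendants put most of their weight on the low-rate black-victim column, while white defendants put almost all of theirs on the high-rate white-victim column. That difference in weights is what lets the combined comparison point the opposite way from both within-group comparisons.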