In this section, you will explore an analysis of the brain injury data. To experiment further with the project, you can access the data and extend your analysis.
The variables in the file are:
| Variable description |
|---|
| sex (male or female) |
| sample identification |
| age of patient (grouped) |
| mental performance before injury |
| mental performance after injury |
| expected payout through litigation |
The following charts show boxplots and a histogram for the variable 'Amount', which represents the expected amount of damages.
The first graphic displays the boxplot of 'Amount' for each age group. The amounts clearly differ numerically between age groups, but does the expected payout appear to vary significantly with age group?
The following chart shows the boxplot for the 'Amount' variable across all age groups. It seems reasonably symmetric, with one possible outlier.
The histogram for the variable 'Amount' seems reasonably close to the familiar shape of a bell curve.
From the following analysis of variance table, you can see that the expected payout amount differs significantly depending on the value of the variable AgeGroup: the F value is 11.31 on 4 and 195 degrees of freedom, and the associated probability of a greater F is 0.0001.
| Source | DF | Sum of Squares | F Value | Pr > F |
|---|---|---|---|---|
| Model | 4 | 4501415182.62 | 11.31 | 0.0001 |
| Error | 195 | 19400115925.98 | | |
| Corrected Total | 199 | 23901531108.60 | | |
| R-Square | C.V. | Amount: Mean |
|---|---|---|
| 0.188332 | 46.23527 | 21573.045 |
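The summary statistics can be recomputed from the sums of squares and degrees of freedom in the table above. The following is a minimal sketch of that arithmetic, assuming the usual one-way ANOVA definitions (mean square = sum of squares / DF, F = model mean square / error mean square, R-Square = model SS / total SS, C.V. = 100 × root mean square error / mean). The variable names are illustrative and do not come from the original analysis.

```python
# Values copied from the analysis of variance table above.
ss_model, df_model = 4501415182.62, 4
ss_error, df_error = 19400115925.98, 195
ss_total = 23901531108.60
mean_amount = 21573.045

# Mean squares are sums of squares divided by their degrees of freedom.
ms_model = ss_model / df_model
ms_error = ss_error / df_error

# F compares between-group variation with within-group variation.
f_value = ms_model / ms_error

# R-Square is the share of total variation explained by AgeGroup.
r_square = ss_model / ss_total

# C.V. expresses the root mean square error as a percentage of the mean.
cv = 100 * ms_error ** 0.5 / mean_amount

print(f"F value  = {f_value:.2f}")
print(f"R-Square = {r_square:.6f}")
print(f"C.V.     = {cv:.5f}")
```

The results agree with the reported values. Note that an R-Square of about 0.19 means that age group, although statistically significant, accounts for less than a fifth of the variation in 'Amount'.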
This confirms the first part of the original objective: to ascertain whether a difference exists. The next question is how the expected payout amount varies with the age of the plaintiff. What do you think: will younger or older accident victims claim more damages? You can answer this question with a follow-up analysis of the group means:
The following analysis of means employs the Tukey HSD (honestly significant difference) range test. The key to understanding the table is found in the first column: means with the same letter are not significantly different.
| Tukey Grouping | Mean | N | AgeGroup |
|---|---|---|---|
| A | 26967 | 40 | 12-20 |
| A | 25358 | 40 | <12 |
| A | 23711 | 40 | 21-35 |
| B | 16421 | 40 | 35-60 |
| B | 15409 | 40 | >60 |
From this analysis, you can see that the younger plaintiffs (the groups labeled 'A') seem to claim larger amounts of money in court. The dividing line seems to be at 35 years old: plaintiffs 35 years and older (the groups labeled 'B') have an expected payout significantly lower than the plaintiffs in the younger age groups.
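To see why the groups split this way, the pairwise differences between the group means can be scaled by their standard error. The sketch below uses only numbers from the two tables above and assumes the equal-sample-size form of the studentized range statistic, q = |mean_i − mean_j| / sqrt(MSE / n). It does not supply the critical value; that would have to come from a studentized range table or from statistical software. As a consistency check, it also rebuilds the model sum of squares from the group means.

```python
from itertools import combinations

# Group means and the common group size, copied from the Tukey table above.
means = {"12-20": 26967, "<12": 25358, "21-35": 23711, "35-60": 16421, ">60": 15409}
n = 40

# Mean square error from the analysis of variance table (error SS / error DF).
ms_error = 19400115925.98 / 195

# With equal group sizes, the grand mean is the plain average of the group means.
grand_mean = sum(means.values()) / len(means)

# Between-group sum of squares rebuilt from the (rounded) group means.
ss_between = n * sum((m - grand_mean) ** 2 for m in means.values())

# Standard error used by the studentized range statistic for equal n.
se = (ms_error / n) ** 0.5

# Studentized range statistic for every pair of age groups.
q_stats = {
    (a, b): abs(means[a] - means[b]) / se
    for a, b in combinations(means, 2)
}

print(f"grand mean = {grand_mean:.1f}, rebuilt model SS = {ss_between:.0f}")
for (a, b), q in sorted(q_stats.items(), key=lambda kv: kv[1]):
    print(f"{a:>5} vs {b:<5}  q = {q:.2f}")
```

Every pair within the 'A' groups and within the 'B' groups yields a noticeably smaller q than any pair that crosses the two groupings, which is consistent with the letters assigned in the table.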
Consider performing a second analysis to determine whether the expected payout amount differs between the sexes.
Before running the analysis, reflect on the outcome you would expect. Would you expect the amount of a court settlement to differ depending on whether the accident victim is male or female?
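One possible starting point for that second analysis is the same one-way F statistic, computed with two groups instead of five. The function below is a minimal sketch using only the Python standard library. The name `one_way_f` is hypothetical, and the numbers in the usage lines are placeholders that only exercise the function; they are not taken from the brain injury data. To run the real analysis, you would pass in the 'Amount' values for each sex.

```python
def one_way_f(groups):
    """Return (F, df_between, df_within) for a list of numeric samples."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of the observations around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Placeholder values for illustration only, NOT from the data set.
group_a = [1.0, 2.0, 3.0]
group_b = [2.0, 3.0, 4.0]
f_value, df1, df2 = one_way_f([group_a, group_b])
print(f"F = {f_value:.2f} on {df1} and {df2} degrees of freedom")
```

With only two groups, this F statistic equals the square of the pooled two-sample t statistic, so either test leads to the same conclusion.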