Analysis of Variance

Lecture 15

Dave Brocker

Farmingdale State College

Analysis of Variance

ANOVA

ANOVA, like t-tests, compares the means of different groups to determine if they differ significantly from one another.

T-tests

IV must be what two things?
DV must be what one thing?

T-tests

IV must be nominal and have only 2 categories
DV must be continuous

ANOVA

ANOVA, like t-tests, compares the means of different groups to determine if they differ significantly from one another.

ANOVA can examine independent variables with more than 2 groups.
ANOVA can also be used for just 2 groups.
It’s more robust than t-tests.

ANOVA

Sampling Theory

We set the critical value so that 5% of the null distribution’s area is to the right of it.

ANOVA

There are 3 types of ANOVA:

Between subjects
Within subjects (repeated measures ANOVA)
Mixed designs (between and within together)

Between subjects ANOVA

The experimental group drinks caffeinated coffee. The placebo group drinks decaf. The control group drinks water. I ask all 3 groups to complete a survey to measure their happiness on a scale of 1 to 10.

The experimental group has a mean happiness score = 8
The placebo group mean happiness score = 6
The control group mean happiness score = 5
Is the group that drank coffee significantly happier than the other groups?

Between subjects ANOVA

If it’s all about the means, why is it called Analysis of Variance?

Between subjects ANOVA

To compare the means, we are analyzing two types of variance:
Variance among the scores inside each group (within group variance)
Variance between the groups (between group variance).

Between subjects ANOVA

Is the variance between the green, red, and blue dots greater than the variance across all of the dots?

Characteristic	Coffee N = 10¹	Control N = 10¹	Placebo N = 10¹
score	7.83(0.86)	4.89(2.12)	5.83(1.22)
¹ Mean(SD)

The F Statistic

ANOVA produces what we call the F statistic.
Why do we call it F?

Between subjects ANOVA

Calculating F

Calculating the variance
ANOVA looks at variance. How do we calculate the Variance?

Calculating the variance

ANOVA looks at variance. To calculate the Variance:

Calculate the mean.
Subtract the mean from each x-value - what is this called?

Calculating the variance

ANOVA looks at variance. To calculate the Variance:

Calculate the mean.
Subtract the mean from each x-value (deviation score).
Square the deviation scores.
Take the sum of the squared deviations - what is this called?

Calculating the variance

ANOVA looks at variance. To calculate the Variance:

Calculate the mean.
Subtract the mean from each x-value (deviation score).
Square the deviation scores.
Take the sum of the squared deviations (Sum of Squared Deviations).

Calculating the variance

ANOVA looks at variance. To calculate the Variance:

Calculate the mean.
Subtract the mean from each x-value (deviation score).
Square the deviation scores.
Take the sum of the squared deviations (Sum of Squared Deviations).
Divide by (n-1) - what is this called?

Calculating the variance in ANOVA

To calculate the Variance in ANOVA:

Calculate the mean.
Subtract the mean from each x-value (deviation score).
Square the deviation scores.
Take the sum of the squared deviations (Sum of Squared Deviations).
Divide by (degrees of freedom).

This is the variance, but in ANOVA we also call it a Mean Squared (MS).
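The steps above can be sketched in a few lines of Python. The four scores are made-up happiness ratings used only for illustration, and the last line compares the hand calculation with the standard library's `statistics.variance`.

```python
from statistics import mean, variance

scores = [8, 9, 7, 8]  # hypothetical happiness scores, not real data

m = mean(scores)                        # 1. calculate the mean
deviations = [x - m for x in scores]    # 2. deviation scores
squared = [d ** 2 for d in deviations]  # 3. squared deviations
ss = sum(squared)                       # 4. sum of squared deviations (SS)
df = len(scores) - 1                    # degrees of freedom (n - 1)
ms = ss / df                            # 5. variance, also called a mean squared (MS)

# The hand calculation should agree with the library function.
matches_library = abs(ms - variance(scores)) < 1e-12
```

For these scores the SS is 2 and the df is 3, so the MS is 2 / 3.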

Calculating the mean squared (MS)

Calculate the SS.
Divide the SS by the degrees of freedom.

Calculating the mean squared (MS)

We will calculate a SS for the between group variance.
How much variance in the data is from group differences?
We will calculate a SS for the scores within each group.
How much variance in the data is from random error?

Between subjects ANOVA

SS_BG = How much variance comes from the group differences?
SS_Error = How much variance is there among the data points within each group?
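A minimal sketch of the two sums of squares, using the standard one-way definitions: SS_BG measures how far each group mean sits from the grand mean, and SS_Error measures how far each score sits from its own group mean. The scores are hypothetical, chosen so that the group means are 8, 6, and 5 as in the coffee example.

```python
from statistics import mean

# Hypothetical happiness scores; group means are 8, 6, and 5.
groups = {
    "coffee": [8, 9, 7, 8],
    "placebo": [6, 5, 7, 6],
    "control": [5, 4, 6, 5],
}

all_scores = [x for scores in groups.values() for x in scores]
grand_mean = mean(all_scores)

# Between groups: each group mean vs. the grand mean, weighted by group size.
ss_bg = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in groups.values())

# Error: each score vs. its own group mean.
ss_error = sum((x - mean(s)) ** 2 for s in groups.values() for x in s)

# Total: each score vs. the grand mean.
ss_total = sum((x - grand_mean) ** 2 for x in all_scores)
```

In this sketch SS_BG and SS_Error add up to the total SS, which is a handy check on the arithmetic.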

Calculating the mean squared (MS)

Calculate the SS_BG.
Calculate the SS_Error.
Divide the SS by their degrees of freedom.

Degrees of freedom

The number of observations (data points) in the data that are free to vary when estimating a statistic.

I own 7 hats. I want to wear a different hat every day of the week.

On Monday, I have 7 hats to choose from.
On Tuesday, I have 6 hats to choose from.
On Wednesday, I have 5 hats to choose from.
On Thursday, I have 4 hats to choose from.
On Friday, I have 3 hats to choose from.
On Saturday, I have 2 hats to choose from.
On Sunday, I don’t get a choice. On Sunday, I have to wear the Santa hat.
The degrees of freedom is how many times I get a choice before I’m stuck with what’s left over.

Degrees of freedom

Between Groups (MS_BG): df = k - 1
Within Groups (MS_Error): df = n - k
Total: df = n - 1
k is the number of groups in the independent variable; n is the total number of observations.
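The three formulas can be wrapped in a small helper. The function name `anova_df` is made up for this sketch; it takes the size of each group and returns the between, within, and total degrees of freedom.

```python
def anova_df(group_sizes):
    """Degrees of freedom for a between subjects ANOVA (hypothetical helper)."""
    k = len(group_sizes)   # number of groups
    n = sum(group_sizes)   # total number of observations
    return {"between": k - 1, "within": n - k, "total": n - 1}

# Three groups of 10, as in the coffee / placebo / control table.
coffee_df = anova_df([10, 10, 10])
```

With three groups of 10, this gives 2 between, 27 within, and 29 total. The between and within values always add up to the total.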

Calculating the mean squared (MS)

Calculate the SS_BG.
Calculate the SS_Error.
Divide the SS by their degrees of freedom.

Calculating the MS

Between Group Mean Squared: MS_BG = SS_BG / (k – 1)
Error Mean Squared: MS_Error = SS_Error / (n – k)
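Written out in full, the two mean squares look like this. The symbols are assumptions added for this sketch and do not appear on the slides: x_ij is score i in group j, n_j is the size of group j, x̄_j is the mean of group j, and x̄ is the grand mean of all n scores.

```latex
\begin{aligned}
SS_{BG}    &= \sum_{j=1}^{k} n_j \,(\bar{x}_j - \bar{x})^2,
           & MS_{BG}    &= \frac{SS_{BG}}{k - 1} \\
SS_{Error} &= \sum_{j=1}^{k} \sum_{i=1}^{n_j} (x_{ij} - \bar{x}_j)^2,
           & MS_{Error} &= \frac{SS_{Error}}{n - k}
\end{aligned}
```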

Calculating Between subjects ANOVA

F = MS_BG / MS_Error
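Putting the pieces together, a minimal sketch of the whole calculation might look like the following. The function name `one_way_anova` and the scores are hypothetical; the scores were picked so the group means are 8, 6, and 5.

```python
from statistics import mean

def one_way_anova(groups):
    """Return (F, df_bg, df_error) for a list of score lists (illustrative sketch)."""
    k = len(groups)
    all_scores = [x for g in groups for x in g]
    n = len(all_scores)
    grand_mean = mean(all_scores)

    ss_bg = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_error = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_bg, df_error = k - 1, n - k
    ms_bg = ss_bg / df_bg            # MS_BG = SS_BG / (k - 1)
    ms_error = ss_error / df_error   # MS_Error = SS_Error / (n - k)
    return ms_bg / ms_error, df_bg, df_error

f_value, df_bg, df_error = one_way_anova([
    [8, 9, 7, 8],  # coffee (hypothetical scores)
    [6, 5, 7, 6],  # placebo
    [5, 4, 6, 5],  # control
])
```

For these made-up scores, MS_BG is about 9.33 and MS_Error is about 0.67, so F comes out at 14 with df of 2 and 9.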

Between subjects ANOVA example

Prof Brocker gives one group of 100 participants the Super Secret Limitless Drug (SSLD). He gives another group of 100 participants a placebo. He gives the third group of 100 participants nothing. All participants then complete an IQ test. He wants to know if those who took the SSLD have significantly higher IQs than the placebo and control groups.

Between subjects ANOVA example

Prof Brocker recruits 500 participants. He shows half of them the Netflix Original, Dark, and the other half Jeopardy. He then measures their happiness on a scale of 1 to 10. He wants to know if the Dark participants are significantly happier than the Jeopardy participants.
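With only two groups, as in the Dark versus Jeopardy example, the ANOVA F is the square of the pooled-variance independent-samples t. The sketch below checks this on a tiny set of made-up scores (four per group, not the 250 per group in the example).

```python
from math import sqrt
from statistics import mean, variance

dark = [8, 9, 7, 8]      # hypothetical happiness scores
jeopardy = [6, 5, 7, 6]  # hypothetical happiness scores
n1, n2 = len(dark), len(jeopardy)

# Independent-samples t with a pooled variance estimate.
pooled_var = ((n1 - 1) * variance(dark) + (n2 - 1) * variance(jeopardy)) / (n1 + n2 - 2)
t_value = (mean(dark) - mean(jeopardy)) / sqrt(pooled_var * (1 / n1 + 1 / n2))

# One-way ANOVA F on the same two groups (k = 2).
grand_mean = mean(dark + jeopardy)
ss_bg = n1 * (mean(dark) - grand_mean) ** 2 + n2 * (mean(jeopardy) - grand_mean) ** 2
ss_error = (sum((x - mean(dark)) ** 2 for x in dark)
            + sum((x - mean(jeopardy)) ** 2 for x in jeopardy))
f_value = (ss_bg / (2 - 1)) / (ss_error / (n1 + n2 - 2))

same_answer = abs(t_value ** 2 - f_value) < 1e-9
```

Here t is about 3.46 and F is 12, which is t squared.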

Between subjects ANOVA example

Interpreting F

The MS_BG is made up of the MS_Error + the theoretical difference between groups. If the null hypothesis is true, that difference is 0:

MS_BG = group difference + MS_Error
MS_BG = 0 + MS_Error
MS_BG = MS_Error
F = MS_BG / MS_Error
F = 1
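The claim that F sits near 1 when there is no group difference is a statement about what happens on average, and a quick simulation can illustrate it. This sketch draws three groups of 10 from the same normal population many times over; the population mean of 6, standard deviation of 2, and seed are arbitrary assumptions.

```python
import random
from statistics import mean

def f_statistic(groups):
    """F = MS_BG / MS_Error for a list of score lists (illustrative sketch)."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    ss_bg = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_error = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_bg / (len(groups) - 1)) / (ss_error / (len(all_scores) - len(groups)))

rng = random.Random(15)  # fixed seed so the run is repeatable
f_values = []
for _ in range(2000):
    # All three groups come from the SAME population: the null is true.
    groups = [[rng.gauss(6, 2) for _ in range(10)] for _ in range(3)]
    f_values.append(f_statistic(groups))

average_f = mean(f_values)
```

The average F lands close to 1. In theory it sits slightly above 1, at df_error / (df_error - 2), and any single sample can fall well above or below that.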

Interpreting F

When F <= 1, it's not likely to be significant.

Interpreting F

The MS_BG is made up of the MS_Error + the theoretical difference between groups:

MS_BG = group difference + MS_Error
MS_BG = NUMBER GREATER THAN 0 + MS_Error
F = (DIFFERENCE + MS_Error) / MS_Error
F = 1 + DIFFERENCE / MS_Error
F > 1
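To see F grow as the group difference grows, the sketch below takes three identical groups of made-up scores and shifts one of them upward by 0, 1, 2, and 3 points. The within-group spread never changes, so only the between-group part of F moves.

```python
from statistics import mean

def f_statistic(groups):
    """F = MS_BG / MS_Error for a list of score lists (illustrative sketch)."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    ss_bg = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_error = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_bg / (len(groups) - 1)) / (ss_error / (len(all_scores) - len(groups)))

base = [5, 4, 6, 5]  # hypothetical scores shared by all three groups
f_by_shift = {}
for shift in (0, 1, 2, 3):
    # Only the first group is shifted; the other two stay put.
    groups = [[x + shift for x in base], base, base]
    f_by_shift[shift] = f_statistic(groups)
```

For these scores F is 0, 2, 8, and 18 as the shift goes from 0 to 3. The F of 0 at no shift happens because the sample means are exactly equal in this toy data; with real samples, chance alone pushes F toward 1 on average.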

Hypothesis testing: ANOVA

F <= 1 -> Fail to reject the Null Hypothesis
F > 1 -> Refer to p-value -> Reject the Null Hypothesis if p < 0.05
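Statistical software normally reads the p-value off the F distribution. As a stand-in that needs only the standard library, the sketch below estimates a p-value with a permutation test instead: it shuffles the scores across groups many times and counts how often the shuffled F is at least as large as the observed F. This is a different technique from the F-table lookup, and the scores, seed, and function names are assumptions for illustration.

```python
import random
from statistics import mean

def f_statistic(groups):
    """F = MS_BG / MS_Error; assumes the within-group variance is not zero."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    ss_bg = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_error = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_bg / (len(groups) - 1)) / (ss_error / (len(all_scores) - len(groups)))

def permutation_p_value(groups, n_shuffles=5000, seed=15):
    """Share of random regroupings whose F is at least the observed F."""
    rng = random.Random(seed)
    observed = f_statistic(groups)
    pooled = [x for g in groups for x in g]
    sizes = [len(g) for g in groups]
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        shuffled, start = [], 0
        for size in sizes:
            shuffled.append(pooled[start:start + size])
            start += size
        # Small tolerance guards against floating point rounding.
        if f_statistic(shuffled) >= observed - 1e-12:
            hits += 1
    return observed, hits / n_shuffles

f_value, p_value = permutation_p_value([
    [8, 9, 7, 8],  # coffee (hypothetical scores)
    [6, 5, 7, 6],  # placebo
    [5, 4, 6, 5],  # control
])
decision = "Reject H0" if f_value > 1 and p_value < 0.05 else "Fail to reject H0"
```

For these made-up scores the observed F is 14 and very few shuffles match it, so the decision is to reject the null hypothesis.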

Reporting F

Reporting F

If asked to report findings in terms of the Null Hypothesis (H0), you should report findings as:

Reject H0, or
Fail to Reject H0

Reporting F

If asked to report findings in general or for publication, you need to report 5 things:

F(df for the between group MS, df for the error MS)
F-value
p-value
Mean and standard deviation of each group
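A small formatting helper can keep these pieces in a consistent order. The function names `report_f` and `describe` are made up for this sketch, and the exact p-value of 0.008 in the usage line is an assumed number for illustration only.

```python
from statistics import mean, stdev

def report_f(df_bg, df_error, f_value, p_value, alpha=0.05):
    """Format 'F(df_bg, df_error) = value, p ...' (hypothetical helper)."""
    p_text = f"p < {alpha}" if p_value < alpha else f"p = {p_value:.2f}"
    return f"F({df_bg}, {df_error}) = {f_value:.2f}, {p_text}"

def describe(scores):
    """Format a group's mean and standard deviation as 'M=..., s=...'."""
    return f"M={mean(scores):.2f}, s={stdev(scores):.2f}"

significant = report_f(1, 498, 7.12, 0.008)  # p-value assumed for illustration
group_summary = describe([8, 9, 7, 8])       # hypothetical scores
```

The first call returns `F(1, 498) = 7.12, p < 0.05`, and the second returns `M=8.00, s=0.82`.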

Reporting F

The group that watched Dark (M=8.43, s=1.02) reported significantly more happiness compared to their peers in the control group who watched Jeopardy (M=6.12, s=0.98), F(1, 498) = 7.12, p < 0.05.

Reporting F

There was not a significant difference in happiness between the groups, F(1, 498) = 1.02, p = 0.07.