Lecture 10
Farmingdale State College
Sample Mean:
If we took an infinite number of samples from a population, the mean of each sample would be a Sample Mean, like how each student collected 500 responses about TV show ratings and took the average of those 500 responses.
Sampling Distribution:
What we get if we put all the sample means together; we assume it’s a normal distribution.
Each value in this distribution represents the average of 1 sample, a Sample Mean.
There are about 10,000 students at Farmingdale State College.
Each of the 25 of us recruits a sample of 400 students.
We ask every student in our sample to rate their sense of belonging on the FSC campus on a scale of 1 (I don’t belong at all) to 10 (I belong completely).
We each calculate the average response from our own sample of 400.
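The example above can be sketched as a small simulation. This is only an illustration: the population ratings are randomly generated (not real FSC data), and the variable names are made up for this sketch.

```python
import random
import statistics

random.seed(10)  # fixed seed so the sketch is reproducible

# Hypothetical population: 10,000 belonging ratings on the 1-10 scale.
population = [random.randint(1, 10) for _ in range(10_000)]

# Each of 25 researchers recruits a sample of 400 students and averages it.
sample_means = [
    statistics.mean(random.sample(population, 400)) for _ in range(25)
]

population_mean = statistics.mean(population)
population_sd = statistics.pstdev(population)
mean_of_sample_means = statistics.mean(sample_means)
spread_of_sample_means = statistics.stdev(sample_means)

print(f"Population mean:       {population_mean:.2f}")
print(f"Mean of sample means:  {mean_of_sample_means:.2f}")
print(f"Population SD:         {population_sd:.2f}")
print(f"SD of sample means:    {spread_of_sample_means:.2f}")
```

The 25 sample means are the values that make up the sampling distribution. They cluster around the population mean and vary much less than individual ratings do.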
Professor Brocker gives 100 students caffeinated coffee and another 100 students decaf. He then has them complete a stats exam.
IV: Caffeine (Caffeinated Coffee or Decaf)
DV: Exam Scores
N = 200
The p-value tells us whether the mean of the experimental group is far enough away from the control group mean that we can be confident it belongs to a theoretical non-null distribution.
If there is a significant difference between the groups, p will be smaller than 0.05.
If p < (less than) 0.05, the difference is significant, and we Reject \(H_0\).
If p > (greater than) 0.05, the difference is NOT significant, and we Fail to Reject \(H_0\).
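The decision rule can be written as a short function. This is a minimal sketch: the function name `decide` and the p-values passed to it are made up for illustration. It also assumes the common convention that p must be strictly below 0.05, so a p of exactly 0.05 is treated as not significant.

```python
ALPHA = 0.05  # significance cutoff used in this lecture


def decide(p_value: float, alpha: float = ALPHA) -> str:
    """Apply the decision rule: reject H0 only when p is below alpha."""
    if p_value < alpha:
        return "Reject H0 (significant)"
    # Assumption: p exactly equal to alpha counts as not significant.
    return "Fail to reject H0 (not significant)"


# Hypothetical p-values, chosen only to exercise both branches.
for p in (0.001, 0.049, 0.05, 0.20):
    print(f"p = {p:.3f}: {decide(p)}")
```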
How often are we okay with making a mistake?
Type 1:
Reality: There is NO difference between the groups.
Conclusion: There is a difference between the groups.
Type 2:
Reality: There is a difference between the groups.
Conclusion: There is NO difference between the groups.
We accept a 5% (0.05) chance of committing a Type 1 Error.
We set alpha at 5% (0.05), meaning we are okay with making a Type 1 Error (false positive) 5% of the time.
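As a result, beta comes out to 16% in this example, meaning we have a 16% chance of committing a Type 2 Error (false negative). Unlike alpha, beta is not chosen directly; it also depends on the effect size and the sample size.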
Alpha (\(\alpha\)) = probability of committing Type 1 Error
Beta (\(\beta\)) = probability of committing Type 2 Error
Decision | \(H_0\) True | \(H_0\) False |
---|---|---|
Fail to Reject \(H_0\) | Correct Decision \(1-\alpha\) | Incorrect Decision: Type II Error \(\beta\) |
Reject \(H_0\) | Incorrect Decision: Type I Error \(\alpha\) | Correct Decision \(1-\beta\) |
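The "\(H_0\) True" column of the table can be checked with a simulation: when there really is no difference between the groups, about 5% of experiments should still come out significant. This is a sketch under stated assumptions, not the method used in class: the exam-score mean (75) and SD (10) are hypothetical, the SD is treated as known, and a simple two-tailed z-test stands in for whatever test a real analysis would use.

```python
import random
from statistics import NormalDist, mean

random.seed(1)  # fixed seed so the sketch is reproducible

ALPHA = 0.05
N_PER_GROUP = 100      # 100 students per group, as in the coffee example
SD = 10.0              # hypothetical exam-score SD, assumed known
N_EXPERIMENTS = 2_000  # number of simulated experiments


def p_value_two_groups(group_a, group_b, sd):
    """Two-tailed z-test p-value for a difference in means (equal group sizes)."""
    standard_error = sd * (2 / len(group_a)) ** 0.5
    z = (mean(group_a) - mean(group_b)) / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


false_positives = 0
for _ in range(N_EXPERIMENTS):
    # H0 is true here: both groups are drawn from the same population.
    caffeinated = [random.gauss(75, SD) for _ in range(N_PER_GROUP)]
    decaf = [random.gauss(75, SD) for _ in range(N_PER_GROUP)]
    if p_value_two_groups(caffeinated, decaf, SD) < ALPHA:
        false_positives += 1  # we rejected H0 even though it is true

type_1_rate = false_positives / N_EXPERIMENTS
print(f"Observed Type 1 error rate: {type_1_rate:.3f}")
```

The observed rate should land close to alpha, which is what "accepting a 5% chance of a Type 1 Error" means in practice.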
Null Hypothesis (\(H_0\)): States that nothing will happen, while also naming the variables (independent and dependent).
There is no difference in the DV between the IV groups.
Alternative Hypothesis (\(H_1\)): A testable prediction of what will happen in our experiment that names the variables (independent and dependent) and clearly contrasts the groups.
The experimental group is significantly different from the control group on the DV.
If p = .032…
If p = .045…
If p = .050…
If p < (less than) 0.05, the difference is significant, and we Reject \(H_0\).
If p > (greater than) 0.05, the difference is NOT significant, and we Fail to Reject \(H_0\).
Alpha: The probability of committing a Type 1 Error. \(\alpha\) = 0.05
Beta: The probability of committing a Type 2 Error. With \(\alpha\) = 0.05, the resulting \(\beta\) = 0.16
Power: The ability of the researcher to accurately detect a difference between groups. With \(\alpha\) = 0.05 and \(\beta\) = 0.16, the resulting power = \(1-\beta\) = 0.84.
When we assume a normal distribution, we assume power will be about 0.84, or 84%.
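Since power = \(1-\beta\), it can be estimated once an effect size and a sample size are fixed. The sketch below is an approximation under stated assumptions: a two-tailed z-test with two equal-sized groups, and a hypothetical helper name `power_two_groups`. It is meant to show how power moves with effect size, not to reproduce the 0.84 figure used above.

```python
from statistics import NormalDist


def power_two_groups(d: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-tailed, two-group z-test for effect size d."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)      # cutoff for rejecting H0
    shift = d * (n_per_group / 2) ** 0.5   # distance between H0 and H1, in standard errors
    # Probability of landing beyond either cutoff when H1 is true.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


# Hypothetical effect sizes with 100 participants per group.
for d in (0.3, 0.5, 0.7):
    power = power_two_groups(d, n_per_group=100)
    print(f"d = {d}: power = {power:.2f}, beta = {1 - power:.2f}")
```

When d is 0 (no real difference), the function returns alpha itself, because every rejection is then a Type 1 Error.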
Power is dependent on Effect Size.
Effect size in statistics refers to the strength of the relationship between two variables in a population.
Does caffeine decrease the amount of time it takes to solve a puzzle?
IV: Caffeine versus no caffeine
DV: Task speed
Effect size: The strength of the relationship between caffeine and task speed
Effect size tells us how much one variable actually impacts the other.
Cohen’s d gives us a standardized measure of effect size:
\(d < 0.3\) is weak
\(0.3 < d < 0.5\) is moderate
\(d > 0.7\) is strong
Cohen’s d is calculated by subtracting the mean of the experimental group from the mean of the control group and dividing the result by the “pooled standard deviation.”
The pooled standard deviation is roughly the average SD across the 2 groups: the square root of the average of the two groups' variances.
\(d = \frac{M_2-M_1}{\sqrt{\frac{SD_1^2\ +\ SD_2^2}{2}}}\)
\(d = \frac{\text{Control Group Mean}-\text{Experimental Group Mean}}{\sqrt{\text{Pooled Variance}}}\)
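The formula above can be sketched in code as follows. Here \(M_2\) is taken to be the control group mean and \(M_1\) the experimental group mean, matching the worded version of the formula. The puzzle times are made-up numbers for illustration, and the sketch assumes equal group sizes and sample (n - 1) variances.

```python
import statistics


def cohens_d(control, experimental):
    """Cohen's d as (control mean - experimental mean) / pooled SD."""
    m_control = statistics.mean(control)
    m_experimental = statistics.mean(experimental)
    # Pooled variance: the average of the two group variances.
    pooled_variance = (
        statistics.variance(control) + statistics.variance(experimental)
    ) / 2
    return (m_control - m_experimental) / pooled_variance ** 0.5


# Hypothetical puzzle-solving times in seconds (not real data).
no_caffeine = [62, 58, 65, 70, 61, 66, 59, 63]
caffeine = [55, 60, 57, 52, 58, 61, 54, 56]

d = cohens_d(no_caffeine, caffeine)
print(f"Cohen's d = {d:.2f}")
```

A positive d here means the control group took longer than the caffeine group. Swapping the order of the arguments flips the sign but not the size of the effect.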
In psychology, we are often dealing with effect sizes that are small (d = 0.3).
When the effect size is small, the hypothetical alternative distribution is closer to the null distribution, so the two overlap more and a real difference is harder to detect (lower power).
What can we do to increase power when an effect size is small?
Take bigger samples (as per Central Limit Theorem)
Design better experiments with more control and fewer confounds
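To see why bigger samples help, one can estimate how many participants per group are needed to reach a target power for a given effect size. This is a rough sketch, assuming a two-tailed z-test with two equal-sized groups and the standard normal-approximation formula \(n = 2\left(\frac{z_{1-\alpha/2} + z_{\text{power}}}{d}\right)^2\). The function name is hypothetical, and the default power of 0.84 is simply the value used in this lecture.

```python
import math
from statistics import NormalDist


def n_per_group_needed(d: float, power: float = 0.84, alpha: float = 0.05) -> int:
    """Approximate per-group sample size for a two-tailed, two-group z-test."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # cutoff tied to the Type 1 error rate
    z_power = z.inv_cdf(power)          # cutoff tied to the desired power
    return math.ceil(2 * ((z_alpha + z_power) / d) ** 2)


# Smaller effects need many more participants to reach the same power.
for d in (0.3, 0.5, 0.7):
    print(f"d = {d}: about {n_per_group_needed(d)} participants per group")
```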