13 Non-Parametric Tests
In this class we have learned the following parametric tests, that is, tests that work when we have data on an interval or ratio scale:
- Z-test
- T-Test
  - Single Sample
  - Dependent Samples
  - Independent Samples
- One Factor ANOVA
  - Between
  - Within
- Two Factor ANOVA
  - Between
These tests all share something in common: their results are estimates of a parameter, and each is derived from a distribution. Additionally, these tests make assumptions about where the data come from, e.g. that the data are normally distributed or that the groups have equal variances.
What happens when our data does not meet these criteria? Do we throw our data out? No!
We use non-parametric tests. These tests are distribution-free. Think of these tests
as another version of what we have already learned.
13.0.1 Parametric Tests and Their Non-Parametric Equivalences
Instead of a Paired t-test, we can use a Wilcoxon Signed Ranks test.
Instead of a Pearson correlation, we can use a Spearman correlation.
Instead of an Independent T-Test, we can use a Mann-Whitney U test.
Instead of a One Way ANOVA, we can use a Kruskal Wallis Test.
Instead of a Two Way ANOVA in which one factor is the treatment and the other is the subject (a repeated-measures design), we can use a Friedman Test.
13.0.2 Chi-Square
One of the non-parametric tests that does not have a direct parametric equivalent is the Chi-Square test. For this lab you will only need to know about two versions: the Chi-Square Goodness of Fit test and the Chi-Square Test of Independence.
13.0.2.1 Chi-Square Goodness of Fit
The Goodness of Fit test checks whether an observed distribution matches an expected distribution.
Imagine that we go out into a wealthy neighborhood and we count 81 Teslas, 50 Ferraris, and 27 Saturns.
Are these car makes equally common in this neighborhood?
If these car makes were equally common in wealthy neighborhoods, the proportion of them would be \(\frac{1}{3}\) each.
However, suppose that in this wealthy part of town we expect the following breakdown instead:
- Tesla: \(\frac{1}{2}\)
- Ferrari: \(\frac{1}{3}\)
- Saturn: \(\frac{1}{6}\)
Is there a significant difference between the observed frequencies and the expected frequencies?
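In R, we can use the `chisq.test(x, p = p)` function, where `x` is a numeric vector of observed counts and `p` is a vector of probabilities of the same length as `x`. The probabilities must be passed by name: the second positional argument of `chisq.test` is `y`, not `p`. If `p` is omitted, all categories are assumed to be equally likely.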
To see if the car makes are equally common in the neighborhood we would create our data:
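As a minimal sketch of this step, the counts from the example can be stored in a named vector and passed to `chisq.test`. The variable names `observed` and `equal_fit` are illustrative choices, not names from the original lab.

```r
# Observed counts from the neighborhood example.
observed <- c(Tesla = 81, Ferrari = 50, Saturn = 27)

# With no p argument, chisq.test() assumes every category is equally likely
# (here 1/3 each), which is the "equally common" hypothesis.
equal_fit <- chisq.test(observed)
print(equal_fit)

# Expected counts under equal proportions: 158 / 3 for each make.
print(equal_fit$expected)
```

With these counts the statistic works out to about 27.886 on 2 degrees of freedom, which is the value reported in the next section.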
13.0.2.1.1 Proper Reporting:
The proper way to report this would be as follows:
\(\chi^2(2) = 27.886, p <.01\)
Now we ask whether what we observed differs from the expected breakdown of \(\frac{1}{2}\) Tesla, \(\frac{1}{3}\) Ferrari, and \(\frac{1}{6}\) Saturn.
Are the observed frequencies significantly different from these expected frequencies?
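A sketch of this second test, assuming the same observed counts and the expected proportions stated earlier (the names `observed`, `expected_p`, and `custom_fit` are illustrative):

```r
# Observed counts from the neighborhood example.
observed <- c(Tesla = 81, Ferrari = 50, Saturn = 27)

# Expected proportions for this part of town; they must sum to 1
# and be in the same order as the observed counts.
expected_p <- c(Tesla = 1/2, Ferrari = 1/3, Saturn = 1/6)

# p must be passed by name, otherwise it is treated as a second variable y.
custom_fit <- chisq.test(observed, p = expected_p)
print(custom_fit)

# Expected counts under these proportions.
print(custom_fit$expected)
```

Here the expected counts (79, about 52.7, and about 26.3) sit very close to the observed counts, so the statistic is small.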
13.0.2.2 Formatting and Interpretation
\(\chi^2(2) = 0.203, p > .05\)
There is not a significant difference between our observed frequencies and the expected frequencies.
13.0.3 Chi-Square Test of Independence
Similarly, we can use a Chi-Square test to see whether two categorical variables are related, that is, whether the distribution of one variable differs across the groups defined by the other. For this we will need to make a contingency table.
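The lab's original table is not shown here, so the following is only a sketch with made-up counts. The groups, car makes, and numbers are hypothetical and will not reproduce the value reported below. A 3 x 3 table is used so that the degrees of freedom, (3 - 1)(3 - 1) = 4, match the reported result.

```r
# HYPOTHETICAL counts: rows are three neighborhoods, columns are three car makes.
counts <- matrix(
  c(20, 15, 10,
    12, 18, 14,
     8, 16, 22),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    neighborhood = c("North", "Central", "South"),
    make         = c("Tesla", "Ferrari", "Saturn")
  )
)

# Passing a matrix (or table) to chisq.test() runs the test of independence.
independence <- chisq.test(counts)
print(independence)

# Expected counts under independence; each should be reasonably large (often > 5).
print(independence$expected)
```

If the data are stored as one row per observation, `table(var1, var2)` builds the same kind of contingency table from two columns.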
13.0.3.1 Formatting:
\(\chi^2(4) = 16.025, p <.01\)
13.0.4 Spearman Rho
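Spearman's rho is another way of obtaining a correlation. It is computed on the ranks of the data, so it works with ordinal data and with relationships that are monotonic but not linear. Let us make a dataframe, plot the variables and calculate a correlation value: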
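The lab's dataframe is not shown, so this sketch uses hypothetical data (the `study` dataframe and its columns `hours` and `score` are invented for illustration) and will not reproduce the reported value.

```r
# HYPOTHETICAL data: hours studied and exam score for ten students (no tied values).
study <- data.frame(
  hours = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
  score = c(52, 48, 60, 57, 65, 70, 68, 80, 77, 90)
)

# Scatterplot of the two variables.
plot(study$hours, study$score,
     xlab = "Hours studied", ylab = "Exam score")

# Spearman correlation: a Pearson correlation computed on the ranks.
spearman <- cor.test(study$hours, study$score, method = "spearman")
print(spearman)

# Degrees of freedom for reporting are n - 2.
df_report <- nrow(study) - 2
print(df_report)
```

If the data contain ties, `cor.test` warns that it cannot compute an exact p-value and falls back to an approximation.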
13.0.4.1 Proper Reporting: Spearman Rho
\(r_s(12) = .81, p < .05\)
13.0.5 Mann-Whitney U
We can use this test when comparing two independent groups:
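A sketch with hypothetical scores (the vectors `group_a` and `group_b` are invented and will not reproduce the reported value). In R, the Mann-Whitney U test is run with `wilcox.test` on two unpaired samples. R labels the statistic `W`, which is the U value for the first group, i.e. the number of pairs in which a score from the first group exceeds a score from the second.

```r
# HYPOTHETICAL scores for two independent groups (no ties, so the p-value is exact).
group_a <- c(12, 15, 9, 20, 17, 11)
group_b <- c(14, 22, 25, 19, 28, 24)

# Unpaired two-sample test (paired = FALSE is the default).
mann_whitney <- wilcox.test(group_a, group_b)
print(mann_whitney)

# The same statistic by hand: rank sum of group_a minus n1 * (n1 + 1) / 2.
n1 <- length(group_a)
u_by_hand <- sum(rank(c(group_a, group_b))[seq_len(n1)]) - n1 * (n1 + 1) / 2
print(u_by_hand)
```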
13.0.5.1 Proper Reporting: Mann-Whitney U
\(U_{obt} = 77, NS\)
13.0.6 Wilcoxon Signed Ranks Test
We can use the Wilcoxon Signed Ranks Test when comparing two dependent groups:
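A sketch with hypothetical paired scores (the vectors `before` and `after` are invented and will not reproduce the reported value). The same `wilcox.test` function is used, this time with `paired = TRUE`. R labels the statistic `V`, the sum of the ranks of the positive differences.

```r
# HYPOTHETICAL scores for the same eight participants measured twice.
before <- c(10, 14, 9, 16, 12, 18, 11, 15)
after  <- c(13, 12, 15, 21, 11, 25, 19, 24)

# paired = TRUE ranks the absolute differences (before - after).
signed_rank <- wilcox.test(before, after, paired = TRUE)
print(signed_rank)

# The same statistic by hand: sum of ranks of the positive differences.
d <- before - after
v_by_hand <- sum(rank(abs(d))[d > 0])
print(v_by_hand)
```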
13.0.6.1 Proper Reporting:
\(V(12) = 15.5, p >.05\)
13.0.7 Kruskal Wallis Test
We can use the Kruskal Wallis Test when we want to compare three or more independent groups, that is, several levels of one independent variable.
Imagine we have the following data:
The problem with this data is that we cannot analyze it directly because the data type is character. We can recode it like so:
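The lab's data is not shown, so this sketch assumes the responses were stored as text labels. The `ratings` dataframe, its three groups, and the labels "low", "medium", and "high" are all hypothetical. With three groups the test has 2 degrees of freedom, so it will not reproduce the reported result.

```r
# HYPOTHETICAL data: ratings stored as text for three independent groups.
ratings <- data.frame(
  group  = rep(c("A", "B", "C"), each = 5),
  rating = c("low", "low", "medium", "low", "medium",
             "medium", "high", "medium", "low", "high",
             "high", "high", "medium", "high", "high"),
  stringsAsFactors = FALSE
)

# Recode the character labels into ordered numeric scores (low = 1, medium = 2, high = 3).
ratings$score <- as.integer(factor(ratings$rating,
                                   levels = c("low", "medium", "high")))

# Treat the grouping variable as a factor.
ratings$group <- factor(ratings$group)

# Kruskal-Wallis test of score across groups; df = number of groups - 1.
kruskal <- kruskal.test(score ~ group, data = ratings)
print(kruskal)
```

Setting `levels` explicitly matters: without it, `factor` orders the labels alphabetically ("high", "low", "medium"), which would scramble the numeric codes.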
13.0.7.1 Proper Formatting:
\(H(4) = 10.677, p = .013\)
13.0.8 Friedman Test
We can use a Friedman Test when we want to compare three or more conditions measured on the same participants, like so:
Remember: this test is designed to replicate the two-way ANOVA, just with different data types. Because of this, your data needs to look similar, with one factor for the conditions and one for the participants.
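A sketch with hypothetical data (the `scores` matrix is invented and will not reproduce the reported value). It assumes the layout that `friedman.test` accepts for a matrix: one row per participant (block) and one column per condition. Four conditions are used so that the degrees of freedom, 4 - 1 = 3, match the reported result.

```r
# HYPOTHETICAL scores: 5 participants (rows) measured under 4 conditions (columns).
scores <- matrix(
  c(3, 5, 4, 6,
    2, 4, 5, 7,
    4, 3, 6, 5,
    1, 4, 3, 6,
    3, 6, 5, 7),
  nrow = 5, byrow = TRUE,
  dimnames = list(
    participant = paste0("P", 1:5),
    condition   = c("C1", "C2", "C3", "C4")
  )
)

# Scores are ranked within each participant, then compared across conditions.
friedman <- friedman.test(scores)
print(friedman)
```

For data in long format, the formula interface `friedman.test(y ~ condition | participant, data = ...)` gives the same test.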
13.0.8.1 Proper Formatting
\(Fr_{obt}(3) = .67, N.S.\)