13 Non-Parametric Tests

In this class we have learned the following parametric tests, that is, tests that work when we have data on an interval or ratio scale:

Z-test
T-Test
- Single Sample
- Dependent Samples
- Independent Samples
One Factor ANOVA
- Between
- Within
Two Factor ANOVA
- Between

These tests have two things in common: each one estimates a population parameter (such as a mean), and each test statistic is evaluated against a known theoretical distribution. Additionally, these tests make assumptions about where the data come from, e.g. that the scores are normally distributed or that the groups have equal variances.

What happens when our data does not meet these criteria? Do we throw our data out? No!

We use non-parametric tests. These tests are distribution-free. Think of these tests as another version of what we have already learned.

13.0.1 Parametric Tests and Their Non-Parametric Equivalences

Instead of a Paired t-test, we can use a Wilcoxon Signed Ranks test.
Instead of a Pearson correlation, we can use a Spearman correlation.
Instead of an Independent T-Test, we can use a Mann-Whitney U test.
Instead of a One Way ANOVA, we can use a Kruskal Wallis Test.
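Instead of a One Factor Within ANOVA, we can use a Friedman Test (sometimes called a two-way ANOVA by ranks, because the data are laid out as subjects by conditions).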

13.0.2 Chi-Square

One non-parametric test that does not have a direct parametric equivalent is the chi-square test. For this lab you will only need to know about two versions: the Chi-Square Goodness of Fit test and the Chi-Square Test of Independence.

13.0.2.1 Chi-Square Goodness of Fit

The Goodness of Fit test checks whether an observed distribution matches an expected distribution.

Imagine that we go out into a wealthy neighborhood and count 81 Teslas, 50 Ferraris, and 27 Saturns.

Are these car makes equally common in this neighborhood?

If these car makes were equally common in wealthy neighborhoods, the proportion of them would be \(\frac{1}{3}\) each.

However, in this wealthy part of town, the breakdown should be:

Tesla: \(\frac{1}{2}\)
Ferrari: \(\frac{1}{3}\)
Saturn: \(\frac{1}{6}\)

Is there a significant difference between the observed frequencies and the expected frequencies?

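In R, we can use the `chisq.test(x, p = ...)` function, where `x` is a numeric vector of observed counts and `p` is a vector of expected probabilities of the same length as `x`. Pass `p` by name, because the second positional argument of `chisq.test` is `y`, not `p`.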

To see if the car makes are equally common in the neighborhood we would create our data:
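A minimal sketch of this step, using the counts given above. The object names `cars` and `equal_fit` are illustrative choices, not names from the original lab.

```r
# Observed counts from the neighborhood example
cars <- c(Tesla = 81, Ferrari = 50, Saturn = 27)

# Goodness of fit against equal proportions (1/3 each).
# p must be passed by name; the second positional argument is y.
equal_fit <- chisq.test(cars, p = c(1/3, 1/3, 1/3))

equal_fit            # prints the statistic, df, and p-value
equal_fit$expected   # expected counts under equal proportions
```

The printed statistic and degrees of freedom are the values that go into the report.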

13.0.2.1.1 Proper Reporting:

The proper way to report this would be as follows:

\(\chi^2(2) = 27.886, p < .01\)

Now we ask whether there is a difference between what we observed and the unequal breakdown we expected (\(\frac{1}{2}\), \(\frac{1}{3}\), and \(\frac{1}{6}\)).

Are these observed frequencies significantly different from the expected frequencies?
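The same test can be run against the unequal expected proportions. This is a sketch using the counts and proportions given above; the object names `cars` and `expected_fit` are illustrative.

```r
# Observed counts from the neighborhood example
cars <- c(Tesla = 81, Ferrari = 50, Saturn = 27)

# Expected proportions for Tesla, Ferrari, Saturn (must sum to 1
# and be in the same order as the counts)
expected_p <- c(1/2, 1/3, 1/6)

expected_fit <- chisq.test(cars, p = expected_p)

expected_fit            # prints the statistic, df, and p-value
expected_fit$expected   # expected counts: total count times each proportion
```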

13.0.2.2 Formatting and Interpretation

\(\chi^2(2) = 0.203, p > .05\)

There is not a significant difference between our observed frequencies and the expected frequencies.

13.0.3 Chi-Square Test of Independence

Similarly, we can use a Chi-Square test to see whether two categorical variables are related, that is, whether the distribution of one variable differs across the groups of the other. For this we will need to make a table.
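The lab's original table is not shown here, so the sketch below uses made-up counts purely to illustrate the mechanics. The neighborhoods, the counts, and the object names are all hypothetical and will not reproduce the value reported in this section.

```r
# HYPOTHETICAL 3 x 3 table of counts: neighborhood by car make
car_table <- matrix(
  c(30, 20, 10,
    15, 25, 20,
    10, 15, 35),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    neighborhood = c("North", "Central", "South"),
    make = c("Tesla", "Ferrari", "Saturn")
  )
)

# Passing a matrix (or table) runs the test of independence
independence_fit <- chisq.test(car_table)

independence_fit            # statistic, df, and p-value
independence_fit$expected   # expected counts if the variables were independent
```

The degrees of freedom are (rows - 1) x (columns - 1), so a 3 x 3 table gives 4.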

13.0.3.1 Formatting:

\(\chi^2(4) = 16.025, p < .01\)

13.0.4 Spearman Rho

Spearman Rho is another way of obtaining a correlation; it is computed on the ranks of the scores rather than on the raw scores. Let us make a dataframe, plot the variables, and calculate a correlation value:
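The lab's original dataframe is not shown here, so this sketch uses invented data. The variables `hours` and `score`, their values, and the name `study` are hypothetical and will not reproduce the value reported in this section.

```r
# HYPOTHETICAL data: 14 paired observations with no tied values
study <- data.frame(
  hours = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14),
  score = c(52, 48, 60, 55, 63, 70, 58, 75, 72, 80, 78, 91, 85, 88)
)

# Scatterplot of the two variables
plot(study$hours, study$score, xlab = "hours", ylab = "score")

# Spearman correlation; method = "spearman" ranks the data first
rho_fit <- cor.test(study$hours, study$score, method = "spearman")

rho_fit            # prints rho and the p-value
nrow(study) - 2    # degrees of freedom used when reporting
```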

13.0.4.1 Proper Reporting: Spearman Rho

\(r_s(12) = .81, p < .05\)

13.0.5 Mann-Whitney U

We can use this test when comparing two independent groups:
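A sketch with invented scores, since the lab's data are not shown here. The vectors `group_a` and `group_b` are hypothetical and will not reproduce the value reported in this section.

```r
# HYPOTHETICAL scores for two separate groups of people
group_a <- c(12, 15, 11, 18, 14, 20, 16)
group_b <- c(22, 19, 25, 17, 24, 21, 27)

# With two samples and paired = FALSE (the default), wilcox.test
# runs the Mann-Whitney U test
u_fit <- wilcox.test(group_a, group_b)

u_fit   # R labels the statistic W; it is the U value for the first group
```

R labels the statistic `W` rather than `U`. It counts the pairs in which a score from the first group exceeds a score from the second group.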

13.0.5.1 Proper Reporting: Mann-Whitney U

\(U_{obt} = 77, NS\)

13.0.6 Wilcoxon Signed Ranks Test

We can use the Wilcoxon Signed Ranks Test when comparing two dependent groups:
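A sketch with invented paired scores, since the lab's data are not shown here. The vectors `before` and `after` are hypothetical and will not reproduce the value reported in this section.

```r
# HYPOTHETICAL paired scores: the same 12 people measured twice
before <- c(10, 12,  9, 14, 11, 13, 15,  8, 16, 12, 10, 17)
after  <- c(11, 10, 12, 18,  6, 19, 22, 16, 25, 22, 21, 29)

# paired = TRUE switches wilcox.test to the signed ranks test
v_fit <- wilcox.test(before, after, paired = TRUE)

v_fit   # V is the sum of the ranks of the positive differences (before - after)
```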

13.0.6.1 Proper Reporting:

\(V(12) = 15.5, p > .05\)

13.0.7 Kruskal Wallis Test

We can use the Kruskal Wallis Test when we want to compare three or more independent groups.

Imagine we have the following data:

The problem with this data is that we cannot analyze it directly because the data type is character. We can recode it like so:
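The lab's data and its recoding step are not shown here. The sketch below assumes the character column is the grouping variable and that recoding means converting it to a factor. The dataframe `groups_df`, its columns, and its values are hypothetical and will not reproduce the value reported in this section.

```r
# HYPOTHETICAL data: three independent groups, labels stored as character
groups_df <- data.frame(
  group = rep(c("low", "medium", "high"), each = 5),
  score = c(12, 15, 11, 14, 13,
            18, 21, 17, 20, 16,
            25, 22, 27, 24, 19),
  stringsAsFactors = FALSE
)

class(groups_df$group)   # "character"

# Recode the grouping variable as a factor (assumed recoding step)
groups_df$group <- factor(groups_df$group,
                          levels = c("low", "medium", "high"))

# Kruskal-Wallis test of score across the groups
kw_fit <- kruskal.test(score ~ group, data = groups_df)

kw_fit   # statistic (H), df = number of groups - 1, and p-value
```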

13.0.7.1 Proper Formatting:

\(H(4) = 10.677, p = .013\)

13.0.8 Friedman Test

We can use a Friedman Test when we want to compare three or more related conditions, such as the same subjects measured under every condition, on a dependent variable like so:
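A sketch with invented ratings, since the lab's data are not shown here. The matrix `ratings`, its subject and condition labels, and its values are hypothetical and will not reproduce the value reported in this section.

```r
# HYPOTHETICAL ratings: 6 subjects (rows) each measured under 4 conditions (columns)
ratings <- matrix(
  c(7, 5, 8, 6,
    6, 7, 5, 8,
    8, 6, 7, 5,
    5, 8, 6, 7,
    7, 6, 5, 8,
    6, 8, 7, 5),
  nrow = 6, byrow = TRUE,
  dimnames = list(
    subject = paste0("s", 1:6),
    condition = c("A", "B", "C", "D")
  )
)

# Given a matrix, friedman.test treats rows as blocks (subjects)
# and columns as the conditions being compared
fr_fit <- friedman.test(ratings)

fr_fit   # statistic, df = number of conditions - 1, and p-value
```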

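Remember: the Friedman Test is sometimes described as a two-way ANOVA by ranks, because the data are arranged in a subjects-by-conditions layout. Because of this, your data need to look similar, with one row per subject and one column per condition.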

13.0.8.1 Proper Formatting

\(Fr_{obt}(3) = .67, NS\)