Dispersion

Lecture 4

Dave Brocker

Farmingdale State College

Descriptive Statistics

The Old and the New

  • \(x\) = refers to the value for 1 person (1 datapoint) in the sample

  • \(\bar{x}\) = refers to the mean or average value of X in the sample

  • \(\sum(x)\) = refers to the sum of all x values (sum of all the datapoints)

  • \(x^2\) = refers to the squared value of x (x multiplied by x)

Descriptive Statistics

Examples

  • data = {1,4,3,2,4,6}

  • \(x\) = \(1 | 4 | 3 | 2 | 4 | 6\)

  • \(\bar{x}\) = \(3.33\bar{3}\)

  • \(\sum(x)\) = \(1 + 4 + 3 + 2 + 4 + 6 = 20\)

  • \(x^2\) = \(1 | 16 | 9 | 4 | 16 | 36\)
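The same quantities can be checked with a few lines of Python. This is a minimal sketch; the variable names (`total`, `mean`, `squares`) are illustrative choices, not notation from the lecture.

```python
# Example data from the slide
data = [1, 4, 3, 2, 4, 6]

n = len(data)                      # number of datapoints
total = sum(data)                  # sum(x): add up every x value
mean = total / n                   # x-bar: the sum divided by n
squares = [x ** 2 for x in data]   # x^2: each value multiplied by itself

print(total)           # 20
print(round(mean, 3))  # 3.333
print(squares)         # [1, 16, 9, 4, 16, 36]
```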

Descriptive Statistics

Describe the characteristics of a sample in terms of:

  • Central tendency

  • Dispersion (aka Variability aka Variance)

    • Dispersion: the action or process of distributing things or people over a wide area.

Central tendency: Mean

What is it, when should we use it, and how do we calculate it?

  • Mean
    • Average of all \(x\) values
    • Normal Distribution
    • \(\frac{\sum(x)}{n}\)

Central tendency: Median

What is it, when should we use it, and how do we calculate it?

  • Median
    • Middle value in sorted data
    • Skewed distributions
    • Position of the median in the sorted data: \(\frac{n + 1}{2}\) (with an even \(n\), average the two middle values)
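As a rough sketch of how the \(\frac{n + 1}{2}\) rule works in practice, the function below sorts the data, finds that position, and averages the two neighbouring values when the position lands between them. The function name `median_by_position` is a hypothetical helper, and the data reuse the earlier example.

```python
def median_by_position(values):
    """Find the median using the (n + 1) / 2 position rule."""
    ordered = sorted(values)
    n = len(ordered)
    position = (n + 1) / 2  # 1-based position in the sorted data

    if position.is_integer():
        # Odd n: the position points at a single middle value
        return ordered[int(position) - 1]

    # Even n: the position falls between two values, so average them
    lower = ordered[int(position) - 1]
    upper = ordered[int(position)]
    return (lower + upper) / 2


data = [1, 4, 3, 2, 4, 6]
print(median_by_position(data))  # 3.5
```

Here the sorted data are 1, 2, 3, 4, 4, 6 and the position is 3.5, so the median is halfway between the 3rd and 4th values.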

Central tendency: Mode

What is it, when should we use it, and how do we calculate it?

  • Mode
    • Most common value
    • Bi|Tri-Modal Distributions
    • Count how often each value occurs and take the value with the highest count
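One possible way to find the mode is to count each value and keep whichever values share the highest count, which also handles bimodal data. This is only a sketch: `modes` is a hypothetical helper name, and the second dataset is made up to show a two-mode case.

```python
from collections import Counter


def modes(values):
    """Return every value that occurs with the highest frequency."""
    counts = Counter(values)
    highest = max(counts.values())
    return sorted(value for value, count in counts.items() if count == highest)


print(modes([1, 4, 3, 2, 4, 6]))  # [4]
print(modes([1, 1, 2, 3, 3]))     # [1, 3]  (hypothetical bimodal data)
```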

Central Tendency

What do measures of Central Tendency tell you?

  • The mid-point in the data

  • The answer most of the participants gave.

Measures of Dispersion

What does dispersion mean?

  • Spread-out-ness

  • How much the participants’ responses differ from one another

  • Variance

Measures of Dispersion

Measures of dispersion tell us about the spread-out-ness of the data, but what does that actually mean?

  • If I am measuring music popularity, what would low dispersion tell me?
  • If I am measuring music popularity, what would high dispersion tell me?

Measures of Dispersion

Measures of dispersion tell us about consensus.

  • Did participants give similar answers?

    • {1,2,2,1,2,2,3,2,1}
  • Did participants give wildly different answers?

    • {1,3,4,5,8,9,21,33}
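To put a number on the difference between these two sets, the sketch below uses Python's built-in `statistics.stdev`, which computes the sample standard deviation that the lecture builds up step by step later. The names `similar` and `different` are illustrative labels only.

```python
import statistics

similar = [1, 2, 2, 1, 2, 2, 3, 2, 1]      # participants mostly agree
different = [1, 3, 4, 5, 8, 9, 21, 33]     # participants disagree widely

sd_similar = statistics.stdev(similar)
sd_different = statistics.stdev(different)

print(round(sd_similar, 2))
print(round(sd_different, 2))
```

The first set should give a much smaller value than the second, which is what "more consensus" looks like numerically.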

Dispersion

Measures of dispersion tell us about consensus.

  • Data should have natural dispersion.

  • If everyone gives a similar answer, it’s harder to analyze differences.

Calculating Dispersion

How do you think we should calculate dispersion?

  • ON AVERAGE, how far is each X-VALUE from the MIDPOINT?

    • Calculate how far each X-VALUE is from the MIDPOINT.

    • Take the AVERAGE of those distances.

Calculating Dispersion

How do you think we should calculate dispersion?

In psychology, the mid-point we use will be the mean.

Calculating Dispersion

Acquire Data

Participant ID   X
1                1
2                3
3                2
4                2
5                3
6                2
7                1
8                3
9                1
10               2

Deviation Scores

Calculate how far each X-VALUE is from the MEAN.

Examine the data:

  • data = {1,3,2,2,3,2,1,3,1,2}

Step 1: Calculate the mean.

  • \(\bar{x} = \frac{\sum(x)}{n} = \frac{1 + 3 + 2 + \dots}{10}\)

  • \(\bar{x} = \frac{20}{10} = 2\)

Deviation Scores

Calculate how far each X-VALUE is from the MEAN.

Step 2: Subtract the mean from each X-value.

  • \(x - \bar{x}\)

  • \((1-2), (3-2), (2-2), (2-2), \dots\)

  • \(-1, 1, 0, 0, \dots\)
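The subtraction step can be sketched in Python as follows, using the ten data points from this example. The variable name `deviations` is an illustrative choice.

```python
data = [1, 3, 2, 2, 3, 2, 1, 3, 1, 2]

mean = sum(data) / len(data)             # x-bar
deviations = [x - mean for x in data]    # x - x-bar for every datapoint

print(mean)        # 2.0
print(deviations)  # [-1.0, 1.0, 0.0, 0.0, 1.0, 0.0, -1.0, 1.0, -1.0, 0.0]
```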

Calculating Dispersion

Visual

Participant ID   X   x-M
1                1   -1
2                3    1
3                2    0
4                2    0
5                3    1
6                2    0
7                1   -1
8                3    1
9                1   -1
10               2    0

Calculating Dispersion

Visual

  • The average of the deviation scores will always = 0.

  • On average, the x-values = \(\bar{x}\).

  • So the average (signed) distance of x from \(\bar{x}\) is 0.
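A short derivation shows why this holds for any dataset. It uses only the symbols defined earlier, plus the fact that \(\sum(x) = n\bar{x}\), which is the definition of the mean rearranged:

\[ \sum(x-\bar{x}) = \sum(x) - n\bar{x} = \sum(x) - \sum(x) = 0 \]

Because the deviation scores always sum to 0, dividing that sum by \(n\) also gives 0. For the example data: \((-1) + 1 + 0 + 0 + 1 + 0 + (-1) + 1 + (-1) + 0 = 0\).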

Calculating Dispersion

How do you think we should calculate dispersion?

  1. Calculate the mean.
  2. Calculate the deviation scores.
  3. What can we do to the deviation scores to prevent them from adding up to 0?

What is the Mean of the Deviation Scores?

Stop Being So Negative!

  • The problem is the negative numbers!

  • How do we get rid of negative numbers?

  • \((x-\bar{x})^2\)

Calculating Dispersion

Stop Being So Negative!

Participant ID   X   x-M   (x-M)^2
1                1   -1    1
2                3    1    1
3                2    0    0
4                2    0    0
5                3    1    1
6                2    0    0
7                1   -1    1
8                3    1    1
9                1   -1    1
10               2    0    0
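The table above can be rebuilt with a short loop, which also shows that squaring removes every negative sign. This is a sketch only; the `rows` tuple layout and the column widths are arbitrary choices.

```python
data = [1, 3, 2, 2, 3, 2, 1, 3, 1, 2]
mean = sum(data) / len(data)

# One row per participant: (ID, x, deviation, squared deviation)
rows = []
for participant_id, x in enumerate(data, start=1):
    deviation = x - mean
    rows.append((participant_id, x, deviation, deviation ** 2))

print(f"{'ID':>2} {'X':>2} {'x-M':>5} {'(x-M)^2':>8}")
for participant_id, x, deviation, squared in rows:
    print(f"{participant_id:>2} {x:>2} {deviation:>5.0f} {squared:>8.0f}")
```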

Calculating Dispersion

First 3 Steps

  • Step 1: Calculate the mean.

  • Step 2: Subtract the mean from each X-value.

  • Step 3: Square each deviation score.

Calculating Dispersion

Now what: Calculate the average squared deviation.

  • Step 4: Add up the Squared Deviations.

  • \(1+1+0+0+1+\dots = 6\)

Calculating Dispersion

Now what: Calculate the average squared deviation.

Participant ID   X   x-M   (x-M)^2
1                1   -1    1
2                3    1    1
3                2    0    0
4                2    0    0
5                3    1    1
6                2    0    0
7                1   -1    1
8                3    1    1
9                1   -1    1
10               2    0    0

Calculating Dispersion

Divide by n-1 to Estimate

Step 5: Divide by (n-1).

  • (n-1) refers to the degrees of freedom… don’t worry about it for now.

  • \(\frac{\sum (x-\bar{x})^2}{n-1} = \frac{6}{9} = .667\)

  • Variance

    • The fact or quality of being different, divergent, or inconsistent.
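Steps 4 and 5 can be sketched in Python and compared against the standard library's `statistics.variance`, which also divides by \(n-1\). The variable names are illustrative.

```python
import statistics

data = [1, 3, 2, 2, 3, 2, 1, 3, 1, 2]
n = len(data)
mean = sum(data) / n

# Step 4: add up the squared deviations
sum_of_squares = sum((x - mean) ** 2 for x in data)

# Step 5: divide by (n - 1) rather than n
variance = sum_of_squares / (n - 1)

print(sum_of_squares)                        # 6.0
print(round(variance, 3))                    # 0.667
print(round(statistics.variance(data), 3))   # 0.667
```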

Calculating Dispersion

How do you think we should calculate dispersion?

  • When we squared the deviation scores in Step 3, we inflated the deviation.

    • We have to undo that inflation.

    • How do you undo squaring?

    • We take the square root.

Calculating Dispersion

Inflation

Participant ID   X      x-M      (x-M)^2
1                38.5   -0.902    0.813
2                49.2    9.79    95.8
3                33.1   -6.29    39.6
4                44.8    5.43    29.5
5                30.3   -9.13    83.4
6                40.5    1.1      1.22
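To see the inflation in numbers, the sketch below computes the variance and its square root for the X column of this table. It assumes the X values exactly as printed (rounded to one decimal place), so the deviations will differ slightly from the table, which appears to have been computed from unrounded data.

```python
import math

# X values as printed in the table (an assumption: these are rounded)
data = [38.5, 49.2, 33.1, 44.8, 30.3, 40.5]

n = len(data)
mean = sum(data) / n
squared = [(x - mean) ** 2 for x in data]

variance = sum(squared) / (n - 1)   # inflated: in squared units
sd = math.sqrt(variance)            # back on the original scale of X

print(round(mean, 1))
print(round(variance, 1))
print(round(sd, 1))
```

The variance comes out near 50 even though no score is more than about 10 away from the mean, while the square root is close to 7, a value that is directly comparable to the deviations in the x-M column.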

Calculating Dispersion

Step 6: Take the square root.

  • \(\sqrt{\frac{\sum(x-\bar{x})^2}{n-1}} = \sqrt{.667} = .816\)

Calculating standard deviation

Step-by-Step

  1. Mean: Calculate the mean.

    • \(\bar{x}\)
  2. Deviation scores: Subtract the mean from each x-value.

    • \((x-\bar{x})\)
  3. Squared deviations: Square each deviation score.

    • \((x-\bar{x})^2\)
  4. Sum of squares: Add up the squared deviations.

    • \(\sum(x-\bar{x})^2\)

Calculating standard deviation

Step-by-Step

  5. Variance: Divide the sum of squares by (n-1).

    • \(\frac{\sum(x-\bar{x})^2}{n-1}\)
  6. Standard Deviation: Take the square root of the variance.

    • \(\sqrt{\frac{\sum(x-\bar{x})^2}{n-1}}\)
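The six steps can be collected into one function. This is a minimal sketch rather than a reference implementation: `sample_standard_deviation` is a hypothetical name, and the result is compared with Python's built-in `statistics.stdev` as a sanity check.

```python
import math
import statistics


def sample_standard_deviation(values):
    """Follow the six steps from the slides, one line per step."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two data points to divide by n - 1")

    mean = sum(values) / n                     # 1. mean
    deviations = [x - mean for x in values]    # 2. deviation scores
    squared = [d ** 2 for d in deviations]     # 3. squared deviations
    sum_of_squares = sum(squared)              # 4. sum of squares
    variance = sum_of_squares / (n - 1)        # 5. variance
    return math.sqrt(variance)                 # 6. standard deviation


data = [1, 3, 2, 2, 3, 2, 1, 3, 1, 2]
print(round(sample_standard_deviation(data), 3))  # 0.816
print(round(statistics.stdev(data), 3))           # 0.816
```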

Formulas for Dispersion

Variance and Standard Deviation

Important

\[ s^2 = \frac{\sum(x-\bar{x})^2}{n-1} \]

Important

\[ s = \sqrt{\frac{\sum(x-\bar{x})^2}{n-1}} \]

Formulas for Dispersion

Standard Deviation

  • Variance is a measure of dispersion.

    • It’s hard to interpret on its own, because it’s inflated (from squaring the deviation scores) and is in squared units.
  • Standard deviation is a better measure of dispersion, because it is in the same units as x.

    • This means that a standard deviation of 1 means a distance of 1 on the scale used to measure x.

    • HYPE scale: 1 to 10: a standard deviation of 1 means the distance from one rating to the next, e.g. from 1 to 2.

    • If the mean = 5 and SD = 1, most scores (about 68% when the data are roughly normal) fall between 4 and 6.

Variance & Standard deviation

  • The Variance is the standard deviation squared: s²

  • The Standard Deviation is the square root of the variance: s