Academic Resource Center

Center, Shape and Spread

What are Descriptive Statistics?

This guide covers foundational concepts needed for Applied Statistics and Intro to Statistics. It assumes general knowledge of how to navigate Excel and use equations in it. See the Microsoft Office tech tutorial links for Excel tutorials.

Descriptive statistics, including central tendency (center), skewness (shape), and variability (spread), are foundational concepts. These descriptors are used frequently when first approaching a data set and will come up in assignments for Applied Statistics.

The Measures of Center section covers the mean, median, and mode. For each, it gives the necessary equation, the symbols used, the steps for solving, an example, and the corresponding Excel equation. For an introduction to central tendency, see this guide.

The Shape section defines symmetric and skewed graphs with a visual representation of each.

The Spread, or variability, section defines variance, standard deviation, z-scores, range, quartiles, interquartile range (IQR), and box plots. This section also includes any necessary equations. For more information about box plots and the five number summary, watch this video.

Measures of Center

Mean

The average

Symbols:

μ = Population Mean

x̄ = Sample Mean

xi = A data point

∑ = Sum of all

n = Sample Size (Number of data points)

Equation:

x̄ = (∑ xi) / n
  1. Add all data points together
  2. Divide by the number of data points

Example:

Find the average test score of this class of 10 students:

90 82 98 76 80 81 78 90 93 100
  1. 90 + 82 + 98 + 76 + 80 + 81 + 78 + 90 + 93 + 100 = 868
  2. 868/10 = 86.8

Excel Equation: =AVERAGE(array)

“Array” refers to the range of cells that holds your data. To select it in Excel, click the first cell in your data list, then hold and drag your mouse down to the end of the data list.
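The two steps above can also be sketched in Python. This is an illustration only; the guide itself uses Excel, and the variable names here are made up for the example. The data are the ten test scores from the example.

```python
from statistics import mean

# Test scores from the example above
scores = [90, 82, 98, 76, 80, 81, 78, 90, 93, 100]

# Step 1: add all data points together
total = sum(scores)

# Step 2: divide by the number of data points
average = total / len(scores)

# The standard library computes the same value in one call,
# much like =AVERAGE(array) does in Excel
library_average = mean(scores)

print(total, average, library_average)
```

Both the manual steps and `mean` give 86.8, the same answer as the worked example.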

Median

The middle value in a sorted set (including repeat numbers)

For a data set with an odd number of values: Use the middle value

For a data set with an even number of values: Take the average of the two middle values

How to find the median:

  1. Sort the data
  2. Find the position of the middle value: (n+1)/2

Example:

Find the middle value for a data set of 25 values

(25+1)/2 = 26/2 = 13

The median is the 13th data value

Find the middle value for a data set of 26 values

(26+1)/2 = 27/2 = 13.5

The median is the average of the 13th data value and the 14th data value

Excel Equation: =MEDIAN(array)

“Array” refers to the range of cells that holds your data. To select it in Excel, click the first cell in your data list, then hold and drag your mouse down to the end of the data list.
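As a rough sketch, the position rule (n+1)/2 can be written out in Python. The data set below reuses the ten test scores from the Mean example (an assumption made for illustration; the median examples above only give the number of values).

```python
from statistics import median

# Test scores reused from the Mean example (10 values, so n is even)
scores = [90, 82, 98, 76, 80, 81, 78, 90, 93, 100]

# Step 1: sort the data
ordered = sorted(scores)

# Step 2: find the 1-based position of the middle value
n = len(ordered)
position = (n + 1) / 2

if position.is_integer():
    # Odd n: the median is the value at that position
    middle = ordered[int(position) - 1]
else:
    # Even n: average the two values on either side of the position
    lower = ordered[int(position) - 1]
    upper = ordered[int(position)]
    middle = (lower + upper) / 2

print(position, middle, median(scores))
```

Here the position is 5.5, so the median is the average of the 5th and 6th sorted values (82 and 90), which is 86.0. The library call `median` returns the same value.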

Mode

The most frequently occurring value(s)

There can be more than one mode, or no mode

Typically used for categorical data

How to find the mode:

  1. Sort the data
  2. Make a frequency table

Example:

Find the mode for the following data set

1—brown eyes

2—blue eyes

3—other

2 1 1 3 2 1 3 1 1 2

1. Sort the data

1 1 1 1 1 2 2 2 3 3

2. Make a frequency table

Eye color    Frequency
1 - brown    5
2 - blue     3
3 - other    2

The mode is 1—brown eyes.

Excel Equation: =MODE.MULT(array)

“Array” refers to the range of cells that holds your data. To select it in Excel, click the first cell in your data list, then hold and drag your mouse down to the end of the data list. MODE.MULT returns every mode when there is more than one.
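The frequency-table approach can be sketched in Python as follows. This is an illustration, not part of the original Excel workflow; the codes 1, 2, and 3 are the eye-color codes from the example above.

```python
from collections import Counter
from statistics import multimode

# Coded eye colors from the example: 1 = brown, 2 = blue, 3 = other
eye_colors = [2, 1, 1, 3, 2, 1, 3, 1, 1, 2]

# Steps 1 and 2: sort the data and build a frequency table
frequency = Counter(sorted(eye_colors))

# multimode returns a list, so ties (more than one mode) are kept
modes = multimode(eye_colors)

for code, count in frequency.items():
    print(code, count)
print("mode(s):", modes)
```

The table shows 5 brown, 3 blue, and 2 other, so the only mode is code 1 (brown eyes), matching the example.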

Shape

Symmetric

Normal distribution

“Bell curve”

Use the mean (average) as the measure of center; the mean and median are approximately equal

 


Skewed

Outliers pull the tail out

Use the median as the measure of center

Left skew:

Long left tail

x̄ < median (the mean is less than the median)


Right skew:

Long right tail

x̄ > median (the mean is greater than the median)
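The mean-versus-median comparison above can be turned into a quick check. The sketch below is a rule of thumb only, and the three small data sets are hypothetical values invented for illustration.

```python
from statistics import mean, median

def describe_skew(data):
    """Suggest a skew direction by comparing the mean with the median."""
    x_bar = mean(data)
    med = median(data)
    if x_bar < med:
        return "left skew"    # a long left tail pulls the mean down
    if x_bar > med:
        return "right skew"   # a long right tail pulls the mean up
    return "no skew indicated"

# Hypothetical data sets, not taken from the guide
right_tailed = [1, 2, 2, 3, 3, 3, 4, 30]       # one large outlier
left_tailed = [1, 27, 28, 28, 29, 29, 30, 30]  # one small outlier
balanced = [1, 2, 3, 4, 5]

print(describe_skew(right_tailed))
print(describe_skew(left_tailed))
print(describe_skew(balanced))
```

A graph of the data is still the more reliable way to judge shape; this comparison only reflects the direction in which outliers pull the mean.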


Spread

Variance

The average squared distance of the data points from the mean; a measure of variability

σ² = Population Variance

s² = Sample Variance

Excel Equation for Population: =VAR.P(array)

“Array” means highlight your data

Excel Equation for Sample: =VAR.S(array)

“Array” means highlight your data
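As a sketch of what the two Excel functions compute, the Python below uses the standard definitions: the sum of squared deviations from the mean, divided by n for a population and by n - 1 for a sample. The data are the test scores from the Mean example, reused here as an assumption for illustration.

```python
from statistics import mean, pvariance, variance

# Test scores reused from the Mean example
scores = [90, 82, 98, 76, 80, 81, 78, 90, 93, 100]

x_bar = mean(scores)

# Squared distance of each data point from the mean
squared_deviations = [(x - x_bar) ** 2 for x in scores]

# Population variance divides by n (comparable to VAR.P)
population_variance = sum(squared_deviations) / len(scores)

# Sample variance divides by n - 1 (comparable to VAR.S)
sample_variance = sum(squared_deviations) / (len(scores) - 1)

# The standard library offers both directly
print(population_variance, pvariance(scores))
print(sample_variance, variance(scores))
```

The sample variance is always a little larger than the population variance for the same data, because it divides by a smaller number.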

Standard Deviation

A measure of the dispersion of a data set; the square root of the variance

σ = Population SD

s = Sample SD

Excel Equation for Population: =STDEV.P(array)

“Array” means highlight your data

Excel Equation for Sample: =STDEV.S(array)

“Array” means highlight your data
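The standard deviation is the square root of the variance. A minimal Python sketch of that relationship, again using the test scores from the Mean example as illustrative data:

```python
from math import sqrt
from statistics import pstdev, pvariance, stdev, variance

# Test scores reused from the Mean example
scores = [90, 82, 98, 76, 80, 81, 78, 90, 93, 100]

# Square root of each variance (comparable to STDEV.P and STDEV.S)
population_sd = sqrt(pvariance(scores))
sample_sd = sqrt(variance(scores))

# pstdev and stdev compute the same quantities directly
print(population_sd, pstdev(scores))
print(sample_sd, stdev(scores))
```

Because it is a square root, the standard deviation is in the same units as the data (here, test-score points), which makes it easier to interpret than the variance.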

Z-Score

How many standard deviations a value (x) is from the mean

z = (x - x̄) / s

x = Data value

x̄ = Mean

s = Standard Deviation

Excel Equation: =STANDARDIZE(data value, mean, standard deviation)
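A short Python sketch of the z-score calculation follows. The function name `z_score` is made up for this example, and the data are the test scores from the Mean example, treated as a sample.

```python
from statistics import mean, stdev

def z_score(x, x_bar, s):
    """Number of standard deviations that x lies from the mean."""
    return (x - x_bar) / s

# Test scores reused from the Mean example
scores = [90, 82, 98, 76, 80, 81, 78, 90, 93, 100]

x_bar = mean(scores)
s = stdev(scores)  # sample standard deviation

z_top = z_score(100, x_bar, s)    # highest score: above the mean
z_low = z_score(76, x_bar, s)     # lowest score: below the mean
z_mean = z_score(x_bar, x_bar, s)  # a value equal to the mean

print(z_top, z_low, z_mean)
```

A positive z-score means the value is above the mean, a negative z-score means it is below, and a value equal to the mean has a z-score of 0.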

Range

How far the lowest data value is from the highest data value

Subtract the minimum value from the maximum value

max - min
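For illustration, the same subtraction in Python, using the test scores from the Mean example:

```python
# Test scores reused from the Mean example
scores = [90, 82, 98, 76, 80, 81, 78, 90, 93, 100]

# Range = maximum value minus minimum value
data_range = max(scores) - min(scores)

print(max(scores), min(scores), data_range)
```

The highest score is 100 and the lowest is 76, so the range is 24.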

Quartiles

Quartiles divide sorted data into four parts, each containing 25% of the data


Excel Equation for Quartiles: =QUARTILE(array, quartile)

Quartile values for excel:

0—gives minimum value

1—gives quartile 1

2—gives quartile 2 or median

3—gives quartile 3

4—gives maximum value
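The 0 to 4 quartile values listed above can be mimicked in Python. In this sketch, the helper name `quartile` is hypothetical, and the "inclusive" method is chosen on the assumption that it matches Excel's interpolation; other quartile methods can give slightly different cut points.

```python
from statistics import quantiles

def quartile(data, quart):
    """Return the value for quart = 0 (min), 1, 2, 3, or 4 (max)."""
    # quantiles(..., n=4) returns the three cut points Q1, Q2, Q3
    cut_points = quantiles(data, n=4, method="inclusive")
    values = [min(data), *cut_points, max(data)]
    return values[quart]

# Test scores reused from the Mean example
scores = [90, 82, 98, 76, 80, 81, 78, 90, 93, 100]

for quart in range(5):
    print(quart, quartile(scores, quart))
```

For these scores the five values are 76, 80.25, 86.0, 92.25, and 100. Together they form the five number summary that a box plot displays.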

Interquartile Range (IQR)

How far quartile 1 is from quartile 3

Used to describe spread in skewed distributions

Q3 - Q1
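A minimal Python sketch of the IQR, using the test scores from the Mean example. As an assumption, the "inclusive" quartile method is used; a different quartile method may produce a slightly different IQR.

```python
from statistics import quantiles

# Test scores reused from the Mean example
scores = [90, 82, 98, 76, 80, 81, 78, 90, 93, 100]

# Cut points Q1, Q2 (median), Q3
q1, q2, q3 = quantiles(scores, n=4, method="inclusive")

# IQR = Q3 - Q1, the spread of the middle 50% of the data
iqr = q3 - q1

print(q1, q3, iqr)
```

Here Q1 is 80.25 and Q3 is 92.25, so the IQR is 12.0. Because it ignores the lowest and highest quarters of the data, the IQR is not affected by outliers in the tails.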

Box Plot

[Figure: box plot showing the five number summary (minimum, quartile 1, median, quartile 3, maximum)]

Next Steps

Now that you know all about descriptive statistics, you’re ready to start analyzing data sets! These concepts will be used throughout your statistics course to describe data and make inferences.

Next, it’s time to learn about inferential statistics, starting with linear regression.

Need More Help?

Click here to schedule a 1:1 with a tutor or coach, or to sign up for a workshop. *If this link does not bring you directly to our platform, please use the direct link to "Academic Support" at the top of the navigation bar in any Brightspace course.
