Describing Data: Averages, Spread, and Expected Values
Learning objectives
By the end of this chapter, you should be able to:
- Calculate and interpret measures of central tendency (mean, median, mode) for different types of data.
- Calculate and interpret measures of dispersion (range, variance, standard deviation) to assess variability.
- Use the coefficient of variation to compare relative variability between datasets.
- Construct and interpret simple frequency distributions to summarise data and identify patterns.
- Calculate expected values to support decisions under uncertainty, and explain what the result does (and does not) tell you.
Overview & key concepts
Managers and analysts often face large volumes of data—daily sales, processing times, defect rates, customer spend, or budget variances. Descriptive statistics convert raw numbers into a compact summary that helps you:
- understand what a “typical” value looks like,
- judge how consistent (or volatile) the data are,
- compare performance across products, sites, or periods,
- support decisions where outcomes are uncertain.
Two ideas run throughout this chapter:
- Central tendency answers: Where are the data typically located?
- Dispersion answers: How widely do the data vary around that typical value?
A separate but related tool, expected value, summarises uncertain outcomes using probabilities.
Measures of central tendency
Measures of central tendency describe a representative value for a dataset. The most appropriate measure depends on the shape of the data and what you are trying to communicate.
Mean (arithmetic average)
The mean is the total of all values divided by the number of values:
- Mean = (sum of values) ÷ (number of values)
The mean uses every data point, which makes it useful for budgeting, forecasting, and performance measurement. However, it can be pulled up or down by extreme values (outliers).
Exam tip: If a dataset contains an extreme value, comment on whether the mean still gives a representative “typical” figure, and consider the median as a comparison.
Median
The median is the middle value when the data are placed in order. If there is an even number of values, the median is the average of the two central values.
The median is often preferred when the dataset is skewed or contains outliers, because it depends on position rather than magnitude.
Exam tip: Always sort the data before selecting the median. State clearly whether you have an odd or even number of observations.
Mode
The mode is the most frequently occurring value. Some datasets have one mode, multiple modes, or no mode.
The mode is particularly useful where the most common outcome is more informative than an average (for example, the most common order size or the most common waiting time band).
Exam tip: If every value occurs once, state explicitly that there is no mode (rather than leaving it blank).
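The three measures above can be sketched with Python's standard library (the sales figures here are hypothetical illustration data):

```python
from statistics import mean, median, multimode

# Hypothetical daily sales figures (GBP), for illustration only
sales = [120, 150, 130, 180, 170, 160, 140]

print(mean(sales))       # 150 — total of all values ÷ number of values
print(median(sales))     # 150 — middle value; sorting is handled internally
print(multimode(sales))  # every value occurs once, so all are returned: no single mode
```

`multimode` returns a list of the most frequent values; when every value occurs exactly once, it returns the whole dataset, which is the coded equivalent of stating "no mode".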
Measures of dispersion
Dispersion shows how spread out the data are. Two datasets can share the same mean but have very different consistency—an important distinction in performance control and risk assessment.
Range
The range is:
- Range = maximum value − minimum value
It is quick to calculate, but it only uses two observations and can be distorted by outliers.
Exam tip: Use range as a quick first comment on spread, but support it with standard deviation when the question asks for a fuller measure of variability.
Variance and standard deviation
Variance and standard deviation measure spread around the mean.
- Calculate each deviation from the mean: (value − mean).
- Square each deviation.
- Sum the squared deviations.
- Divide to obtain the variance.
- Take the square root to obtain the standard deviation.
There are two common versions:
- Population variance (use when the dataset is the full set you are analysing):
- Variance = Σ(x − mean)² ÷ n
- Sample variance (use when the dataset is a sample used to estimate a wider population):
- Variance = Σ(x − mean)² ÷ (n − 1)
Exam tip: Unless the question indicates you are sampling or estimating a wider population, treat the dataset as the full period under review and divide by n. Use (n − 1) only when sampling/estimation is clearly intended.
Interpreting standard deviation: Standard deviation is a typical distance from the mean. It is not a guarantee that most values fall within one standard deviation of the mean unless you make additional distribution assumptions.
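The five steps can be followed line by line in plain Python (illustrative figures; the standard library's `pstdev`/`stdev` would give the same results):

```python
# Hypothetical dataset, for illustration only
data = [120, 150, 130, 180, 170, 160, 140]

m = sum(data) / len(data)                 # the mean the deviations are measured from
sq_devs = [(x - m) ** 2 for x in data]    # steps 1-2: deviation from mean, squared
total = sum(sq_devs)                      # step 3: sum of squared deviations
pop_var = total / len(data)               # step 4: population variance (÷ n)
samp_var = total / (len(data) - 1)        # step 4 alt: sample variance (÷ (n − 1))
pop_sd = pop_var ** 0.5                   # step 5: square root gives standard deviation
samp_sd = samp_var ** 0.5
```

Note that `pop_var` and `samp_var` differ only in the divisor, which is exactly the population-versus-sample distinction described above.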
Coefficient of variation
The coefficient of variation (CV) compares variability relative to the mean:
- CV = (standard deviation ÷ mean) × 100%
CV is helpful when comparing datasets with different average levels (for example, two products with different average demand).
Exam tip: CV is most meaningful when the mean is positive and not close to zero. If the mean is very small (or negative), explain why CV may be misleading.
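A minimal sketch of the CV comparison, using two hypothetical products with different average demand (the guard clause reflects the exam tip above about small or negative means):

```python
from statistics import mean, pstdev

def coefficient_of_variation(data):
    """CV = (standard deviation ÷ mean) × 100%; only meaningful for a clearly positive mean."""
    m = mean(data)
    if m <= 0:
        raise ValueError("CV is misleading when the mean is zero or negative")
    return pstdev(data) / m * 100

# Two hypothetical products with very different average demand levels
product_a = [100, 110, 90, 105, 95]
product_b = [1000, 1100, 900, 1050, 950]
```

Here product B's absolute spread is ten times larger than product A's, but both give a CV of roughly 7.1%, showing the same relative variability once the different average levels are taken into account.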
Frequency distributions
A frequency distribution groups data into intervals (classes) and counts how many observations fall into each interval. It helps you see clustering, gaps, and potential skewness.
Good class design is:
- complete (all observations included),
- non-overlapping (no value fits two classes),
- consistent (class widths usually equal unless there is a clear reason otherwise),
- clear at boundaries (so values are allocated unambiguously).
Exam tip: Where possible, use equal-width intervals. Open-ended classes (such as “£180+”) can be acceptable for quick summaries, but they are less useful for charts and comparisons because the class width is not defined.
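The class-design rules above (equal width, complete, non-overlapping, unambiguous boundaries) can be sketched as a small counting function; the sales figures in the usage line are hypothetical:

```python
def frequency_table(data, start, width, n_classes):
    """Count observations into equal-width, non-overlapping classes.
    Class i covers [start + i*width, start + (i+1)*width); values outside all classes are ignored."""
    counts = [0] * n_classes
    for x in data:
        i = int((x - start) // width)   # half-open intervals make boundary allocation unambiguous
        if 0 <= i < n_classes:
            counts[i] += 1
    return counts

# Classes £120–£139, £140–£159, £160–£179, £180–£199 (width £20)
frequency_table([120, 150, 130, 180, 170, 160, 140], start=120, width=20, n_classes=4)
```

Using half-open intervals in code is one way to guarantee the "non-overlapping" and "clear at boundaries" properties: every value maps to exactly one class index.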
Expected values
An expected value (EV) summarises uncertain outcomes as a probability-weighted average:
- Expected value = Σ (outcome × probability)
For an EV calculation to be valid, the probabilities used must represent all outcomes and must total 1.0.
EV is a useful decision aid, but it has important limits:
- EV is not the most likely outcome.
- The EV may be a value that never actually occurs (for example, if outcomes are £0 or £300, the EV could be £150).
- EV is most informative when the decision is repeatable over time or when using a risk-neutral decision rule. Otherwise, downside risk and risk appetite must also be discussed.
Exam tip: After calculating EV, add a short comment on risk: the downside outcome, its probability, and whether the decision maker might still reject the option despite a positive EV.
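A sketch of the EV calculation, including the completeness check on the probabilities that the validity condition above requires:

```python
def expected_value(outcomes):
    """Probability-weighted average of (value, probability) pairs.
    Raises if the probabilities do not represent a complete set (total 1)."""
    total_p = sum(p for _, p in outcomes)
    if abs(total_p - 1.0) > 1e-9:
        raise ValueError(f"probabilities total {total_p}, not 1")
    return sum(v * p for v, p in outcomes)

# A 50/50 chance of £200 or £100 profit gives an EV of £150 —
# a value that can itself never occur, which is the limitation noted above
expected_value([(200, 0.5), (100, 0.5)])
```

The guard clause mirrors the exam pitfall of probabilities not totalling 1: an EV computed from an incomplete set of outcomes is not a valid weighted average.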
Core theory and frameworks
Data summary workflow
When you are given a dataset, aim to build a short story from the numbers rather than listing calculations.
1) Sanity-check first
Confirm what each figure represents (units, time period, and whether any values look like errors or one-off events).
2) Pick a headline “typical” value
- Use the mean when you need an overall level that reflects all observations.
- Use the median when extremes could distort the picture.
- Use the mode when the most common outcome is the key point (especially for categories or standard order sizes).
3) Add spread to show consistency
Start with the range for a quick sense-check, then use standard deviation for a fuller measure of variability around the mean. State whether you are using the population or sample approach.
4) Compare datasets fairly (if required)
If average levels differ significantly, use CV to compare variability relative to the mean (and note any limitations if the mean is very small or negative).
5) Show the shape, not just the average
Use a frequency distribution to highlight clustering, gaps, and potential skewness—then explain what the shape suggests for control, forecasting, or capacity planning.
6) If outcomes are uncertain, separate “average outcome” from “risk”
Compute EV as a probability-weighted average (checking probabilities total 1), then discuss downside exposure rather than treating EV as a guaranteed result.
Worked example
Narrative scenario
A small retail company tracks its daily sales over a week to understand typical performance and variability. The sales figures (GBP) are:
£120, £150, £130, £180, £170, £160, £140.
The company wants to summarise the sales data using measures of central tendency and dispersion, build a simple frequency distribution, and calculate the expected value for a potential promotional offer.
Required
- Calculate the mean, median, and mode of the sales data.
- Determine the range, variance, and standard deviation.
- Build a simple frequency distribution.
- Calculate the expected value for a promotional offer with given probabilities.
Solution
1) Mean, median, and mode
Mean
Sum of sales = 120 + 150 + 130 + 180 + 170 + 160 + 140 = 1,050
Number of days, n = 7
Mean = 1,050 ÷ 7 = £150
Median
Sorted data: 120, 130, 140, 150, 160, 170, 180
Median (middle value) = £150
Mode
All values occur once, so there is no mode.
2) Range, variance, and standard deviation
Range
Maximum = 180, Minimum = 120
Range = 180 − 120 = £60
Variance and standard deviation
Mean = £150
| Day’s sales (x) | x − mean | (x − mean)² |
|---|---|---|
| 120 | −30 | 900 |
| 150 | 0 | 0 |
| 130 | −20 | 400 |
| 180 | 30 | 900 |
| 170 | 20 | 400 |
| 160 | 10 | 100 |
| 140 | −10 | 100 |
| Total | - | 2,800 |
Σ(x − mean)² = 2,800
Population variance (treating this week as the full period under review):
Variance = 2,800 ÷ 7 = 400
Standard deviation = √400 = £20.00
Sample variance (treating this week as a sample used to estimate a wider pattern):
Variance = 2,800 ÷ (7 − 1) = 2,800 ÷ 6 = 466.67
Standard deviation = √466.67 ≈ £21.60
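As a quick self-check, every figure calculated so far can be reproduced with the standard library:

```python
from statistics import mean, median, pstdev, stdev

sales = [120, 150, 130, 180, 170, 160, 140]

assert mean(sales) == 150                 # mean: 1,050 ÷ 7
assert median(sales) == 150               # middle of the sorted data
assert max(sales) - min(sales) == 60      # range: 180 − 120
assert pstdev(sales) == 20                # population method: √(2,800 ÷ 7)
assert round(stdev(sales), 2) == 21.6     # sample method: √(2,800 ÷ 6)
```

`pstdev` divides by n and `stdev` by (n − 1), matching the population and sample versions shown above.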
3) Frequency distribution
Use equal-width intervals and ensure all values fit exactly one class. With a minimum of £120 and maximum of £180, a convenient class width is £20:
- £120–£139
- £140–£159
- £160–£179
- £180–£199
Count the observations:
- £120–£139: 120, 130 →2
- £140–£159: 140, 150 →2
- £160–£179: 160, 170 →2
- £180–£199: 180 →1
Frequency table
| Sales interval | Frequency |
|---|---|
| £120–£139 | 2 |
| £140–£159 | 2 |
| £160–£179 | 2 |
| £180–£199 | 1 |
| Total | 7 |
4) Expected value
Promotional offer outcomes:
- 0.5 probability of £200 profit
- 0.5 probability of £100 profit
Probabilities total: 0.5 + 0.5 = 1.0 (complete set of outcomes)
EV = (0.5 × 200) + (0.5 × 100)
EV = 100 + 50 = £150
Interpretation of the results
- Typical daily sales: Mean and median are both £150, suggesting £150 is a sensible headline figure for this week’s “typical” day.
- Variability: A range of £60 indicates noticeable movement across the week. A standard deviation of about £20 (population method) indicates that daily sales are typically about £20 away from the mean. This does not guarantee that most days fall within ±£20 without further assumptions about the distribution.
- Pattern from the frequency distribution: Sales are spread fairly evenly across the middle bands, with one day in the highest band (£180–£199).
- Decision support under uncertainty: The promotional offer has an expected profit of £150. This is an average across outcomes, not the most likely outcome and not a guaranteed result. A risk-aware comment should note the downside outcome (£100) and its probability (0.5).
Common pitfalls and misunderstandings
- Mean vs median: Using the mean for a dataset with extreme values can produce a “typical” figure that few periods achieve.
- Forgetting to sort for the median: The median depends on ordered data; the middle of an unsorted list is meaningless.
- Assuming a mode must exist: Many datasets have no single most common value.
- Range over-reliance: Range can exaggerate variability if one extreme value is unusual; it ignores the rest of the data.
- Mixing population and sample formulas: Dividing by n in one step and by (n − 1) in another produces inconsistent results.
- Using CV mechanically: CV is not helpful when the mean is zero/near zero or where negative values occur.
- Weak class design in frequency tables: Overlapping classes, gaps, or unclear boundaries lead to incorrect counts and unreliable conclusions.
- Expected value without risk discussion: EV is a weighted average; it does not describe downside exposure or outcome volatility.
- Probabilities not totalling 1: EV calculations require a complete set of outcomes; probabilities must sum to 1.
- Units confusion: Variance is in squared units; standard deviation returns to the original units.
Summary and further reading
Descriptive statistics convert raw data into useful information. Measures of central tendency (mean, median, mode) describe typical values, while measures of dispersion (range, variance, standard deviation) quantify variability. The coefficient of variation supports comparisons of relative volatility across datasets with different average levels. Frequency distributions reveal patterns such as clustering and skewness. Expected value summarises uncertain outcomes using probabilities, but it should be interpreted alongside downside outcomes and risk appetite.
For further study, read broadly in business statistics and decision-making under uncertainty, focusing on practical interpretation as well as calculation.
FAQ
Why is the median often preferred over the mean in skewed datasets?
Because the median depends on position rather than the size of the values. A small number of unusually large or small observations can shift the mean substantially, while the median typically remains stable, giving a more representative “middle” for skewed data.
How does the coefficient of variation help in comparing datasets?
It scales variability to the level of the mean. Two datasets may have different average sizes; CV expresses spread as a percentage of the mean, supporting more meaningful comparisons of relative volatility.
What are the limitations of using range as a measure of dispersion?
Range uses only the maximum and minimum values and ignores the distribution of the rest of the data. It can be heavily influenced by a single unusual observation and provides no information about how values cluster around the mean.
How is expected value used in decision-making under uncertainty?
Expected value combines outcomes and probabilities into a single average figure, useful for comparing options in repeatable decisions. However, it is not the most likely outcome and it does not show risk. A complete answer should also comment on downside outcomes and their likelihood.
Why is standard deviation preferred over variance for interpreting spread?
Standard deviation is expressed in the original units (e.g., pounds, minutes, units), so it can be interpreted directly alongside the mean. Variance is in squared units and is therefore less intuitive.
Summary (Recap)
This chapter explained how to describe datasets using measures of central tendency and dispersion, how to compare relative variability using the coefficient of variation, how to construct and interpret simple frequency distributions, and how to calculate expected values for decisions under uncertainty. The worked example demonstrated each technique using weekly sales data and reinforced that interpretation should distinguish between “average outcome” and risk.
Glossary
Mean
An average calculated by dividing the total of all observations by the number of observations. It reflects every data point and can be affected by extreme values.
Median
The middle value after sorting the data (or the average of the two middle values if there is an even count). Often robust when data are skewed.
Mode
The most frequently occurring value. A dataset may have one mode, multiple modes, or none.
Range
Maximum minus minimum. A quick spread measure based only on the two extremes.
Variance
A spread measure based on squared deviations from the mean. It can be calculated using a population (÷ n) or sample (÷ n − 1) approach.
Standard deviation
The square root of variance, expressing spread in the same units as the original data.
Coefficient of variation
Standard deviation divided by mean (usually shown as a percentage). Compares variability relative to average level.
Expected value
A probability-weighted average of outcomes: Σ(outcome × probability). Probabilities must total 1 for a complete set of outcomes.
Frequency distribution
A summary that groups data into intervals and counts how many observations fall into each interval.
Probability
A measure of likelihood between 0 and 1. For a complete set of outcomes, probabilities should total 1.
Test your knowledge
Practice questions specifically for this topic.
Written by
AccountingBody Editorial Team