## Representing Data - Statistics 1

__Averages__

The mean, mode and median are types of average. They are different methods for summarising the data with a single typical value. If the data is skewed, the median is usually the best average to use because it is not pulled towards extreme values, whereas the mean is. The mode can be unrepresentative, since which value happens to occur most often may be down to chance.
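
To see the effect of an extreme value, here is a minimal sketch using Python's built-in `statistics` module. The list `data` is a small illustrative data set, and `skewed` is a hypothetical copy in which the largest value has been replaced by 90.

```python
from statistics import mean, median

# Small illustrative data set.
data = [3, 3, 4, 4, 5, 5, 5, 6, 6, 9]

# Hypothetical variation: swap the largest value for an extreme one.
skewed = data[:-1] + [90]

print(mean(data), median(data))      # mean 5, median 5
print(mean(skewed), median(skewed))  # mean rises to 13.1, median stays at 5
```

The single extreme value drags the mean upwards, but the median does not move.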

__Spread of Data__

The range, interquartile range, variance and standard deviation all measure the spread of the data. The larger these values are, the more spread out the data is; the smaller they are, the less spread out it is. If the data is skewed, it is best to use the interquartile range, because it depends only on the middle half of the data. The range, variance and standard deviation are all affected by extreme values.
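
The same idea can be checked for the measures of spread. This is only a sketch: `spread` is a hypothetical helper, the data is illustrative, and Python's `statistics.quantiles` uses its own quartile convention, so its quartiles will not necessarily match the ones you get from an exam board method.

```python
from statistics import pstdev, quantiles

data = [3, 3, 4, 4, 5, 5, 5, 6, 6, 9]
skewed = data[:-1] + [90]  # hypothetical extreme value


def spread(values):
    """Return (range, interquartile range, standard deviation)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return max(values) - min(values), q3 - q1, pstdev(values)


print(spread(data))    # range is 6
print(spread(skewed))  # range jumps to 87, interquartile range is unchanged
```

The range and standard deviation change a lot when the extreme value is added, but the interquartile range stays the same.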

__Frequency Table__

The height of 100 people, chosen at random, was measured to the nearest centimetre and the data is summarised below.

| Height (cm) | Frequency |
| --- | --- |
| 160 – 169 | 40 |
| 170 – 179 | 55 |
| 180 – 190 | 5 |

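If the data is measured to the nearest cm then you first need to work out the true class boundaries. A recorded height of 160cm could be anything from 159.5cm up to 160.5cm, so each class is extended by 0.5cm at both ends:

| Height (cm) | Frequency |
| --- | --- |
| 159.5 – 169.5 | 40 |
| 169.5 – 179.5 | 55 |
| 179.5 – 190.5 | 5 |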
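
The adjustment is the same for every class, so it can be sketched in a couple of lines. The names `printed_classes` and `half_unit` are just illustrative.

```python
# Classes as printed in the table, measured to the nearest 1 cm.
printed_classes = [(160, 169), (170, 179), (180, 190)]

# Half of the rounding unit: values are rounded to the nearest 1 cm.
half_unit = 0.5

boundaries = [(low - half_unit, high + half_unit) for low, high in printed_classes]
print(boundaries)  # [(159.5, 169.5), (169.5, 179.5), (179.5, 190.5)]
```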

__Mean__

Multiply the midpoint of each class (using the corrected class boundaries) by its frequency, add up these products and then divide by the total frequency. This gives only an estimate of the mean, because the exact values of the data are unknown and you are assuming that every value in a class is equal to the midpoint.
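
As a check on the arithmetic, here is a short sketch of the estimate for the height data. Each tuple in `classes` is (lower boundary, upper boundary, frequency), and the variable names are only illustrative.

```python
# (lower boundary, upper boundary, frequency) for each class.
classes = [(159.5, 169.5, 40), (169.5, 179.5, 55), (179.5, 190.5, 5)]

midpoints = [(lower + upper) / 2 for lower, upper, _ in classes]
frequencies = [frequency for _, _, frequency in classes]

total_frequency = sum(frequencies)
sum_fx = sum(x * f for x, f in zip(midpoints, frequencies))
estimated_mean = sum_fx / total_frequency

print(midpoints)       # [164.5, 174.5, 185.0]
print(sum_fx)          # 17102.5
print(estimated_mean)  # 171.025
```

So the estimated mean height is about 171cm.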

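__Standard Deviation__

You need to work out the variance first and then take its square root to get the standard deviation. As with the mean, the result is an estimate, because the class midpoints stand in for the unknown exact values.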
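
The sketch below assumes the usual formula for this level, variance = (sum of f × x²) / n − mean², with the class midpoints as the x values. The formula is an assumption here, so check it against the one your exam board gives.

```python
from math import sqrt

# (midpoint, frequency) for each class of the height data.
grouped = [(164.5, 40), (174.5, 55), (185.0, 5)]

n = sum(f for _, f in grouped)
mean = sum(x * f for x, f in grouped) / n

# Assumed formula: mean of the squares minus the square of the mean.
variance = sum(f * x ** 2 for x, f in grouped) / n - mean ** 2
standard_deviation = sqrt(variance)

print(round(variance, 2))            # 33.44
print(round(standard_deviation, 2))  # 5.78
```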

__Mode__

It is not possible to work out the mode as the exact values of the data are unknown. However, you can work out the modal class. This will be the class with the highest frequency.

Modal class = 169.5 – 179.5

__Median, Lower Quartile and Upper Quartile__

To work out the lower quartile Q1, the median Q2 and the upper quartile Q3, you need to use linear interpolation. First work out the quarter position, the middle position and the three-quarter position for the number of values in the data set, n, which is the total frequency. Depending on your exam board, you will use either n/4, n/2 and 3n/4 **OR** (n+1)/4, (n+1)/2 and 3(n+1)/4. In this example, I have used the first set of formulae.

Lower quartile position = 100/4 = 25

Middle position = 100/2 = 50

Upper quartile position = 3(100)/4 = 75

You then need to work out the cumulative frequency to find the class each position lies in.

| Height (cm) | Frequency | Cumulative Frequency |
| --- | --- | --- |
| 159.5 – 169.5 | 40 | 40 |
| 169.5 – 179.5 | 55 | 95 |
| 179.5 – 190.5 | 5 | 100 |
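
The cumulative frequency column is just a running total of the frequencies, which can be sketched with `itertools.accumulate` from the Python standard library.

```python
from itertools import accumulate

frequencies = [40, 55, 5]

# Running total of the frequencies, one entry per class.
cumulative = list(accumulate(frequencies))
print(cumulative)  # [40, 95, 100]
```

The last entry should always equal the total frequency, which is a useful check.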

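The 25th value lies in the first class, and the 50th and 75th values both lie in the second class. Then to work out the values, interpolate within the class:

Value = lower class boundary + (position – cumulative frequency before the class) / (class frequency) × class width

Q1 = 159.5 + (25 – 0)/40 × 10 = 165.75 ≈ 166cm

Q2 = 169.5 + (50 – 40)/55 × 10 ≈ 171.3cm

Q3 = 169.5 + (75 – 40)/55 × 10 ≈ 175.9 ≈ 176cm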
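
Linear interpolation within a class can be sketched as below. The function `interpolate` is a hypothetical helper, and it assumes the n/4, n/2 and 3n/4 positions used in this example.

```python
# (lower boundary, upper boundary, frequency) for each class.
classes = [(159.5, 169.5, 40), (169.5, 179.5, 55), (179.5, 190.5, 5)]


def interpolate(position, classes):
    """Estimate the value at a given position by linear interpolation."""
    cumulative = 0  # cumulative frequency before the current class
    for lower, upper, frequency in classes:
        if position <= cumulative + frequency:
            fraction = (position - cumulative) / frequency
            return lower + fraction * (upper - lower)
        cumulative += frequency
    raise ValueError("position is larger than the total frequency")


n = sum(frequency for _, _, frequency in classes)
q1 = interpolate(n / 4, classes)
q2 = interpolate(n / 2, classes)
q3 = interpolate(3 * n / 4, classes)

print(round(q1, 2), round(q2, 2), round(q3, 2))  # 165.75 171.32 175.86
```

Rounded to the nearest centimetre, these give Q1 = 166 and Q3 = 176.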

__Interquartile Range__

Interquartile Range = upper quartile – lower quartile

Interquartile Range = 176 – 166 = 10cm

__List of numbers without a frequency table__

**Data:** 3, 3, 4, 4, 5, 5, 5, 6, 6, 9

__Mean__

Add up the values and divide by the number of values.

Mean = 50/10 = 5
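
For the list above, the calculation can be sketched as:

```python
data = [3, 3, 4, 4, 5, 5, 5, 6, 6, 9]

total = sum(data)    # add up the values
count = len(data)    # how many values there are
mean = total / count

print(total, count, mean)  # 50 10 5.0
```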

__Standard Deviation__

You need to work out the variance first and then take its square root to get the standard deviation.
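
A minimal sketch for the list of numbers, assuming the formula variance = (sum of x²) / n − mean² (dividing by n rather than n − 1). That choice of formula is an assumption, so use whichever one your exam board specifies.

```python
from math import sqrt

data = [3, 3, 4, 4, 5, 5, 5, 6, 6, 9]

n = len(data)
mean = sum(data) / n
sum_of_squares = sum(x ** 2 for x in data)

# Assumed formula: mean of the squares minus the square of the mean.
variance = sum_of_squares / n - mean ** 2
standard_deviation = sqrt(variance)

print(sum_of_squares)                # 278
print(round(variance, 2))            # 2.8
print(round(standard_deviation, 2))  # 1.67
```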

__Mode__

The mode is the most common number in the data set. If two numbers are equally common then both numbers will be modes.

**Data:** 3, 3, 4, 4, 5, 5, 5, 6, 6, 9

Mode = 5
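
Python's `statistics.multimode` returns every value that ties for most common, which matches the rule above. The second list, `tied`, is a hypothetical example with two modes.

```python
from statistics import multimode

data = [3, 3, 4, 4, 5, 5, 5, 6, 6, 9]
print(multimode(data))  # [5]

# Hypothetical list where two values are equally common.
tied = [1, 1, 2, 2, 3]
print(multimode(tied))  # [1, 2]
```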

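__Median, Lower Quartile and Upper Quartile__

First make sure the data is listed from smallest to largest. To work out the lower quartile Q1, the median Q2 and the upper quartile Q3, work out the quarter position, the middle position and the three-quarter position for the number of values in the data set, then read off the values at those positions. Here there are 10 values, so the positions are 10/4 = 2.5, 10/2 = 5 and 3(10)/4 = 7.5. A common rule is to round a position that is not a whole number up to the next value, and for a whole-number position to take the midpoint of that value and the next one.

**Data:** 3, 3, 4, 4, 5, 5, 5, 6, 6, 9

Q1 = 3rd value = 4

Q2 = midpoint of the 5th and 6th values = 5

Q3 = 8th value = 6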
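
The sketch below uses one common convention for a list of numbers: with positions n/4, n/2 and 3n/4, round a fractional position up, and for a whole-number position take the midpoint of that value and the next. This convention is an assumption, chosen because it gives the quartiles 4 and 6 used in the interquartile range for this data. The helper `value_at` is hypothetical.

```python
import math

data = sorted([3, 3, 4, 4, 5, 5, 5, 6, 6, 9])


def value_at(position, ordered):
    """Value at a 1-based position, using the assumed rounding rule."""
    if position == int(position):
        k = int(position)
        # Whole-number position: midpoint of the kth and (k+1)th values.
        return (ordered[k - 1] + ordered[k]) / 2
    # Fractional position: round up to the next value.
    return ordered[math.ceil(position) - 1]


n = len(data)
q1 = value_at(n / 4, data)
q2 = value_at(n / 2, data)
q3 = value_at(3 * n / 4, data)

print(q1, q2, q3)  # 4 5.0 6
```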

__Interquartile Range__

Interquartile Range = upper quartile – lower quartile

Interquartile Range = 6 – 4 = 2

__Stem and Leaf Diagrams__

To work out the Mean, Standard Deviation, Mode, Median, Q1, Q2, Q3 and the IQR, read the numbers off the stem and leaf diagram and then use the same methods as for a list of numbers without a frequency table.
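
As a rough sketch, a stem and leaf diagram can be turned back into a list of numbers like this. The diagram below is hypothetical (it is not from this page) and uses the key 1 | 2 = 12.

```python
from statistics import mean, median, multimode

# Hypothetical diagram: each stem maps to its leaves, key 1 | 2 = 12.
stem_and_leaf = {1: [2, 5], 2: [0, 3, 3], 3: [1]}

values = sorted(
    10 * stem + leaf
    for stem, leaves in stem_and_leaf.items()
    for leaf in leaves
)

print(values)                  # [12, 15, 20, 23, 23, 31]
print(round(mean(values), 2))  # 20.67
print(median(values))          # 21.5
print(multimode(values))       # [23]
```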

__Box Plots__

From a Box Plot you can read the Median, the lower and upper quartiles, and the lowest and highest values, so the only measures you can work out are the Median, the Interquartile range and the range. Subtract the lower quartile from the upper quartile to get the interquartile range, and subtract the lowest value from the highest value to get the range.
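
For example, suppose the five values read from a box plot are as in the sketch below. These readings are hypothetical and only there to show the two subtractions.

```python
# Hypothetical readings taken from a box plot.
lowest, lower_quartile, median, upper_quartile, highest = 3, 4, 5, 6, 9

interquartile_range = upper_quartile - lower_quartile
data_range = highest - lowest

print(median)               # 5
print(interquartile_range)  # 2
print(data_range)           # 6
```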

__Histograms__

Histograms are drawn from frequency tables, so you can use the frequency table to work out the Mean, Standard Deviation, modal class, Median, Q1, Q2, Q3 and the IQR. The class widths will be the widths of the rectangles and the height of each rectangle will be the Frequency Density. Fd = frequency / class width. The area of each rectangle will represent frequency since frequency = Fd x class width. The heights and widths of the rectangles can be drawn to different scales. For example, the Fd values can be multiplied by 10 and the class widths can be multiplied by 2 before the Histogram is drawn, in which case the area is proportional to the frequency rather than equal to it.
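
Using the height data from the frequency table, the frequency densities can be sketched as follows. Note that the last class is 11 wide rather than 10.

```python
# (lower boundary, upper boundary, frequency) for each class.
classes = [(159.5, 169.5, 40), (169.5, 179.5, 55), (179.5, 190.5, 5)]

widths = [upper - lower for lower, upper, _ in classes]
densities = [frequency / (upper - lower) for lower, upper, frequency in classes]

# Area of each rectangle: frequency density multiplied by class width.
areas = [density * width for density, width in zip(densities, widths)]

print(widths)                            # [10.0, 10.0, 11.0]
print([round(d, 2) for d in densities])  # [4.0, 5.5, 0.45]
```

Each area works out to the frequency of that class (40, 55 and 5), up to floating-point rounding.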

__Cumulative Frequency Graphs__

You draw cumulative frequency graphs from frequency tables. You use the upper boundary of each class as the x-coordinate and the cumulative frequency as the y-coordinate for the points that you plot. You need to connect the points with a smooth curve. The graph starts on the x-axis at the lower boundary of the first class, where the cumulative frequency is 0.
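
For the height data, the points to plot can be listed with a short sketch like this one. It only builds the coordinates and leaves the drawing to you.

```python
from itertools import accumulate

# (lower boundary, upper boundary, frequency) for each class.
classes = [(159.5, 169.5, 40), (169.5, 179.5, 55), (179.5, 190.5, 5)]

upper_boundaries = [upper for _, upper, _ in classes]
cumulative = accumulate(frequency for _, _, frequency in classes)

# Start on the x-axis at the lower boundary of the first class.
points = [(classes[0][0], 0)] + list(zip(upper_boundaries, cumulative))
print(points)  # [(159.5, 0), (169.5, 40), (179.5, 95), (190.5, 100)]
```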

__Frequency Polygons__

You draw frequency polygons from frequency tables. You use the midpoint of each class as the x-coordinate and the frequency as the y-coordinate for the points that you plot. You need to connect the points with straight lines. The frequency polygon will start at the first point plotted and end at the last point plotted.
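
For the height data, the points of the frequency polygon can be listed with a sketch like this:

```python
# (lower boundary, upper boundary, frequency) for each class.
classes = [(159.5, 169.5, 40), (169.5, 179.5, 55), (179.5, 190.5, 5)]

# One point per class: (midpoint, frequency).
points = [((lower + upper) / 2, frequency) for lower, upper, frequency in classes]
print(points)  # [(164.5, 40), (174.5, 55), (185.0, 5)]
```

Joining these three points in order with straight lines gives the polygon.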