Types of data, frequency distributions, histograms, bar charts, and frequency polygons.
Statistics begins with organising information so that patterns can be seen. A table or diagram is not just a picture; it is a way to make a dataset easier to read, compare, and interpret.
In CSEC, Statistics questions may ask you to construct a diagram, read a value from a table, or explain what the data suggests. Always identify the type of data first, because that determines whether a bar chart, histogram, pie chart, line graph, or frequency polygon is appropriate. Your explanation should connect the diagram back to the data, not just describe its shape.
Understanding what kind of data you have is the first step in analysis.
The type of data determines what calculations and diagrams make sense. You can average numerical data, but not categories like favourite colour.
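Quantitative data = numerical; it can be counted or measured (e.g., number of books, height)
Qualitative data = descriptive; it falls into categories rather than numbers (e.g., favourite colour)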
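As a small illustration of why the type matters, the sketch below uses Python's standard `statistics` module on two made-up samples (the values are hypothetical, chosen only for this example). A mean makes sense for the numerical sample, while the categorical sample can only be summarised by counting, for example by finding the mode.

```python
from statistics import mean, mode

# Hypothetical samples, for illustration only.
books_read = [2, 4, 4, 5, 10]                       # quantitative
favourite_colour = ["red", "blue", "red", "green"]  # qualitative

# Adding and dividing is meaningful for numbers.
average_books = mean(books_read)

# Categories cannot be averaged, but the most frequent one can be found.
most_common_colour = mode(favourite_colour)

print(average_books)
print(most_common_colour)
```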
Discrete data is counted in separate values; continuous data is measured and can take values between marks on a scale.
Discrete data = counted, takes specific values only
Continuous data = measured, can take any value in a range
Ungrouped data shows individual values. Grouped data sacrifices exact detail to make large datasets easier to summarise.
Ungrouped data = individual values listed separately
Grouped data = values organised into classes (intervals)
Classify these data sets:
"Heights of 100 students: 1.50 m, 1.52 m, 1.67 m, ..."
"Colours preferred by 50 people: red, blue, red, green, ..."
"Number of books read: 5, 7, 8, 9, 11, ..."
A frequency table organises data by showing how often each value occurs.
A frequency table counts how often each value appears. It makes repeated data easier to read and prepares it for graphs or averages.
Test scores for 20 students:
5, 7, 5, 8, 9, 5, 7, 6, 8, 7, 5, 9, 6, 8, 7, 9, 5, 6, 8, 7
Frequency table:
| Score | Frequency | Cumulative Frequency |
|---|---|---|
| 5 | 5 | 5 |
| 6 | 3 | 8 |
| 7 | 5 | 13 |
| 8 | 4 | 17 |
| 9 | 3 | 20 |
| Total | 20 | |
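The table above can be checked with a short tally. This is a minimal sketch using Python's standard library; the variable names are chosen here for illustration. Each cumulative frequency is the running total of the frequencies up to and including that score.

```python
from collections import Counter
from itertools import accumulate

# The 20 test scores listed above.
scores = [5, 7, 5, 8, 9, 5, 7, 6, 8, 7, 5, 9, 6, 8, 7, 9, 5, 6, 8, 7]

counts = Counter(scores)                     # how often each score occurs
values = sorted(counts)                      # scores in ascending order
frequencies = [counts[v] for v in values]
cumulative = list(accumulate(frequencies))   # running totals

print("Score Frequency Cumulative")
for value, freq, cum in zip(values, frequencies, cumulative):
    print(f"{value:>5} {freq:>9} {cum:>10}")
print("Total", sum(frequencies))
```

The last cumulative frequency should always equal the total number of values, which is a quick way to check a table.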
Grouped frequency tables are used when there are many different values. Each class interval should be clear and non-overlapping.
For large datasets, group values into class intervals.
Heights of 40 students (in cm):
SAMPLE DATA: 150, 152, 155, 158, 160, 161, 162, 165, 167, 168, 170, 171, 172, 174, 175, 176, 178, 180, 182, 183, 185, 186, 187, 188, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, ...
Grouped frequency table:
| Class Interval | Frequency | Cumulative Frequency |
|---|---|---|
| 150-159 | 5 | 5 |
| 160-169 | 8 | 13 |
| 170-179 | 12 | 25 |
| 180-189 | 10 | 35 |
| 190-199 | 5 | 40 |
| Total | 40 | |
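Sorting values into classes can be sketched as follows. The function name `tally_classes` and its parameters are hypothetical, and the code assumes whole-number values and equal class widths. It is run only on the first eight heights listed above, not the full set of 40, so its counts are not expected to match the table.

```python
def tally_classes(values, first_lower, width, n_classes):
    """Count how many values fall in each class, e.g. 150-159, 160-169, ..."""
    labels = [
        f"{first_lower + i * width}-{first_lower + (i + 1) * width - 1}"
        for i in range(n_classes)
    ]
    tally = dict.fromkeys(labels, 0)
    for v in values:
        index = (v - first_lower) // width   # which class the value belongs to
        if 0 <= index < n_classes:           # values outside every class are skipped
            tally[labels[index]] += 1
    return tally

# Only the first eight heights from the sample above (a partial list).
partial_heights = [150, 152, 155, 158, 160, 161, 162, 165]
tally = tally_classes(partial_heights, first_lower=150, width=10, n_classes=5)

for label, freq in tally.items():
    print(label, freq)
```

Because each value is assigned to exactly one class, the classes cannot overlap.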
Class boundaries and midpoints help you calculate and graph grouped data. The midpoint stands in for all the values in that interval when exact values are unavailable.
When grouping data, understand these terms:
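Class interval = the range of values a class covers, written with its lower and upper class limits (e.g., 150-159)
Class boundaries = the true limits of a class, halfway between the upper limit of one class and the lower limit of the next (e.g., 149.5 and 159.5); used for graphs
Class width = upper class boundary minus lower class boundary
Class midpoint = centre of the class, (lower boundary + upper boundary) ÷ 2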
For the class interval 170-179:
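The values can be worked out as in the sketch below. It assumes heights are recorded to the nearest centimetre, so neighbouring classes are 1 cm apart and each boundary lies 0.5 cm beyond the class limit. The function name `class_summary` and the `gap` parameter are hypothetical.

```python
def class_summary(lower_limit, upper_limit, gap=1):
    """Return (lower boundary, upper boundary, width, midpoint) for one class.

    `gap` is the distance between the upper limit of one class and the
    lower limit of the next (assumed to be 1 for whole-number data).
    """
    half_gap = gap / 2
    lower_boundary = lower_limit - half_gap
    upper_boundary = upper_limit + half_gap
    width = upper_boundary - lower_boundary
    midpoint = (lower_boundary + upper_boundary) / 2
    return lower_boundary, upper_boundary, width, midpoint

lower_b, upper_b, width, midpoint = class_summary(170, 179)
print("Boundaries:", lower_b, "to", upper_b)   # 169.5 to 179.5
print("Width:", width)                         # 10.0
print("Midpoint:", midpoint)                   # 174.5
```

Note that the class width is 10, not 9: it is measured between the boundaries, not the limits.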
Different diagrams show data in different ways.
Pie charts show parts of a whole. Each sector angle is proportional to the category frequency.
A pie chart shows data as slices of a circle. Each slice is proportional to the frequency.
Favourite fruit for 60 students:
| Fruit | Frequency | Angle |
|---|---|---|
| Apple | 20 | (20÷60)×360° = 120° |
| Banana | 15 | (15÷60)×360° = 90° |
| Orange | 15 | (15÷60)×360° = 90° |
| Mango | 10 | (10÷60)×360° = 60° |
| Total | 60 | 360° |
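The angle column can be reproduced with a few lines of code. This is a minimal sketch with variable names chosen for illustration. It multiplies by 360 before dividing, which gives the same result as (frequency ÷ total) × 360° while keeping the arithmetic exact.

```python
# Frequencies from the table above.
fruit_counts = {"Apple": 20, "Banana": 15, "Orange": 15, "Mango": 10}

total = sum(fruit_counts.values())

# Sector angle = frequency x 360 / total frequency.
angles = {fruit: count * 360 / total for fruit, count in fruit_counts.items()}

for fruit, angle in angles.items():
    print(f"{fruit}: {angle} degrees")
print("Sum of angles:", sum(angles.values()))
```

The angles should always add up to 360°, which is a useful check before drawing the chart.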
Bar charts compare separate categories or discrete values. The gaps between bars show that the categories are separate.
A bar chart uses rectangular bars, with heights (or lengths) equal to the frequencies. It suits categorical or discrete data.
Number of cars sold per day (Monday-Friday):
| Day | Cars Sold |
|---|---|
| Monday | 15 |
| Tuesday | 12 |
| Wednesday | 18 |
| Thursday | 20 |
| Friday | 14 |
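To see what the bars look like without graph paper, the sketch below prints a rough horizontal bar chart as text, one `#` per car sold. The function name `text_bar_chart` is hypothetical, and this is only one simple way to display the data.

```python
# Cars sold each day, from the table above.
cars_sold = {"Monday": 15, "Tuesday": 12, "Wednesday": 18, "Thursday": 20, "Friday": 14}

def text_bar_chart(data):
    """Return one line of text per category, with a bar of '#' characters."""
    label_width = max(len(label) for label in data)
    return [
        f"{label:<{label_width}} | {'#' * value} {value}"
        for label, value in data.items()
    ]

chart_lines = text_bar_chart(cars_sold)
print("\n".join(chart_lines))

# The longest bar identifies the day with the most sales.
busiest_day = max(cars_sold, key=cars_sold.get)
print("Most cars sold on:", busiest_day)
```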
Histograms show grouped continuous data. Bars touch because the intervals run continuously into each other.
A histogram is like a bar chart, but for continuous grouped data. There are no gaps between the bars, because each bar is drawn from the lower class boundary to the upper class boundary.
Heights grouped into classes:
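The sketch below works out where each bar would start and end. It assumes the classes and frequencies are those of the grouped height table earlier in these notes, and that heights are recorded to the nearest centimetre, so each boundary lies 0.5 cm beyond the class limit.

```python
# (lower class limit, upper class limit, frequency) from the grouped height table.
classes = [(150, 159, 5), (160, 169, 8), (170, 179, 12), (180, 189, 10), (190, 199, 5)]

# Each bar runs from the lower boundary to the upper boundary of its class.
# Assumption: boundaries are 0.5 cm beyond the class limits.
bars = [(lower - 0.5, upper + 0.5, freq) for lower, upper, freq in classes]

for left, right, height in bars:
    print(f"Bar from {left} to {right}, height {height}")

# Bars touch when each bar ends exactly where the next one starts.
bars_touch = all(bars[i][1] == bars[i + 1][0] for i in range(len(bars) - 1))
print("Bars touch:", bars_touch)
```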
A frequency polygon joins class midpoints to show the shape of a distribution. It is useful for comparing two grouped datasets on the same axes.
A frequency polygon plots each class frequency against the class midpoint and joins the points with straight lines. It shows the shape of the distribution.
Using the same height data:
Plot a point at each class midpoint at its frequency height, then connect with straight lines.
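The points to plot can be listed as in the sketch below, which again assumes the classes and frequencies of the grouped height table earlier in these notes. The last step, adding a zero-frequency point one class width beyond each end so the polygon meets the horizontal axis, is a common convention rather than something these notes require.

```python
# (lower class limit, upper class limit, frequency) from the grouped height table.
classes = [(150, 159, 5), (160, 169, 8), (170, 179, 12), (180, 189, 10), (190, 199, 5)]

# One point per class: (class midpoint, frequency).
points = [((lower + upper) / 2, freq) for lower, upper, freq in classes]

for midpoint, freq in points:
    print(f"Plot ({midpoint}, {freq})")

# Optional convention: close the polygon on the axis, one class width
# (10 cm here) before the first midpoint and after the last.
class_width = 10
closed_points = (
    [(points[0][0] - class_width, 0)]
    + points
    + [(points[-1][0] + class_width, 0)]
)
```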
Line graphs show change over time or ordered values. The pattern of rise, fall, and flat sections is the main information.
A line graph shows how a variable changes over time.
When to use which diagram: