Sheldon Ross

# Q-Q plots

Purpose: A Q-Q plot is a graphical tool for deciding whether data come from a distribution of a specific type. In FDMath 22x, the interest is in normal Q-Q plots, as tools to decide whether data come from a normal distribution. ("Q" stands for "quantile," and it refers to the way the plot is constructed. An explanation is given at the end of this web page.) My students actually use a slightly different plot, called the "normality" plot in SPSS. Everything below applies equally to normality plots and Q-Q plots, with one exception: the SPSS instructions below explain how to create the normality plot, not the Q-Q plot.

Requirements: Your data must be numeric (quantitative) at the interval or ratio level of measurement (which SPSS calls the "scale" level).

How to construct it: Use the "Normality Plots with Tests" option in the "Plots" options dialog in the SPSS "Explore" dialog. For details, click here.

Interpretation: Start by taking the point of view that your data come from a plausibly normal distribution, until or unless you have sufficient evidence to the contrary. Create your Q-Q plot. Note the line rising from the lower left corner of the plot to the upper right. If the data are perfectly normally distributed, all the little circles in the plot will lie on that line. However, nothing is ever perfect, so:

• Because you sample randomly, you should expect the circles to be randomly scattered about the line.
• Consequently, if your plot shows any definite, overall, non-random pattern, you can conclude that your data do not come from a plausibly normal distribution. A little wriggle about the line does not mean the data are not normal. A clear, overall curved pattern does. Remember: Data are normal until "proven" otherwise. See the examples for clarification.
• The presence of circles far away from the line at either the left or the right end of the plot does not necessarily mean the data do not come from a plausibly normal distribution. Students often want to call such points outliers, but they may or may not be outliers. Again, refer to the examples.
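To see the difference between "random wriggle" and a "clear, overall curved pattern," it can help to simulate both cases. The sketch below is only an illustration, not part of the SPSS procedure: it draws one sample from a normal distribution and one from a strongly skewed (exponential) distribution, computes the Q-Q points for each, and reports how close those points are to a straight line using the ordinary correlation coefficient. The function names `qq_points` and `straightness`, the sample sizes, and the plotting-position formula are all assumptions made for the example. The correlation is a rough numeric stand-in for looking at the plot; it does not replace your own visual judgment.

```python
import random
from statistics import NormalDist, mean

def qq_points(data):
    """Pair each sorted observation with an approximate expected normal score."""
    n = len(data)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # (rank - 0.375) / (n + 0.25) is one common plotting position (an assumption here).
    return [(x, std_normal.inv_cdf((rank - 0.375) / (n + 0.25)))
            for rank, x in enumerate(sorted(data), start=1)]

def straightness(points):
    """Pearson correlation of the plotted points: near 1 means nearly a straight line."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(0)  # fixed seed so the illustration is repeatable
normal_sample = [rng.gauss(50, 10) for _ in range(200)]        # simulated normal data
skewed_sample = [rng.expovariate(1 / 10) for _ in range(200)]  # simulated skewed data

r_normal = straightness(qq_points(normal_sample))
r_skewed = straightness(qq_points(skewed_sample))
print(f"normal sample: r = {r_normal:.3f}")
print(f"skewed sample: r = {r_skewed:.3f}")
```

The normal sample should give a correlation very close to 1, while the skewed sample should give a noticeably smaller one, which corresponds to the curved pattern you would see in its plot.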

Note: There are many types of plots that are similar to the Q-Q plot: The P-P plot ("P" is for "probability" and "Q" is for "quantile"), the P-Q plot, the Q-P plot, the normal probability plot, the normal quantile plot, and so on. The differences among these plots are all technical; all these types of plots can be read in much the same way: As long as the points of the plot lie in an essentially straight pattern (with some random wriggle), you may treat your data as though they were normal. Otherwise, you may not.

How the Q-Q plot is constructed (ignoring certain technical details): Roughly speaking, SPSS creates the Q-Q plot by calculating the z-score each measurement should have if the data come from a normal distribution (the "Expected Normal" score). For each measurement, SPSS plots a point whose horizontal coordinate is the measurement itself and whose vertical coordinate is the expected normal score. If your data come from a normal distribution, the observed values and the expected normal scores should line up nicely, in the sense that the little circles on the graph should lie close to a line, specifically the line SPSS puts in the Q-Q plot.
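The construction just described can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not a reproduction of what SPSS does internally: the function name `expected_normal_scores`, the made-up data, and the choice of plotting position (rank minus 3/8, divided by n plus 1/4, often called Blom's formula) are all assumptions, and tied values are not treated specially. The exact formula is one of the technical details set aside above.

```python
from statistics import NormalDist

def expected_normal_scores(data, a=0.375):
    """Return (observed value, expected normal score) pairs, smallest value first.

    Each value's rank is turned into a proportion (rank - a) / (n + 1 - 2a),
    and that proportion is converted to a z-score with the inverse normal CDF.
    The default a = 0.375 is an assumed choice; other choices are in use.
    """
    n = len(data)
    std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1
    return [(x, std_normal.inv_cdf((rank - a) / (n + 1 - 2 * a)))
            for rank, x in enumerate(sorted(data), start=1)]

# Made-up measurements, for illustration only.
sample = [6.2, 4.1, 7.4, 5.3, 5.9, 5.0, 6.8]

pairs = expected_normal_scores(sample)
for observed, expected in pairs:
    # horizontal coordinate = observed value, vertical coordinate = expected normal score
    print(f"{observed:5.1f}  {expected:+.3f}")
```

Plotting the first column on the horizontal axis and the second on the vertical axis gives the circles of the Q-Q plot. Note that the middle value of an odd-sized sample gets an expected normal score of 0, and the scores are symmetric about 0.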