Math 221 – Principles of Statistics

Creating Scatterplots using SPSS

______________________________________________________________ 

Start SPSS if you haven’t already, and turn to pages 88 and 89 in your text. Read these two pages carefully. We will construct scatterplots for these data, starting with a plot of sales vs. items sold (that is, we’re going to try to reproduce Figure 2.1). Open the data file ta02_001.por, look over the data, and make sure you know what it all means. Go to the “Graphs” menu and select “Scatter/Dot…” A dialog box appears. We are going to make just one scatterplot (for starters), and we have no grouping variables, so this is a very simple situation: select “Simple Scatter” and click “Define.”

 

A new dialog box appears. Saying “sales versus items sold” means that “sales” goes on the vertical axis (or y-axis) and “items” goes on the horizontal axis (or x-axis). So put those two variables into their respective boxes in the dialog box. If you want to add a title, you can do that by clicking “Title…” and filling in the boxes appropriately. Once you’re done, click “Continue” and “OK.” That’s all there is to it! You should see this in your Output Viewer (except your title is probably different from the one shown here).

 

 

Now you may interpret to your heart’s content.

 

If you want to make more than one scatterplot at once, you can. Let’s use the same data set for this. In particular, let’s see how the variables “sales,” “chck” (check), and “card” relate to each other. Go to “Graphs->Scatter/Dot…,” select “Matrix Scatter” and click “Define.” Put the variables [sales], [chck], and [card] in the “Matrix Variables:” box. Add a title, if you like, and click “OK.” You should get something very much like this:

 

First, note that we have three different scatterplots here. You may think there are six, but look very carefully, and you’ll see that the one in the row labeled “chck” (check) and in the column labeled “sales” is the mirror image of the one in the row labeled “sales” and the column labeled “chck”: it shows the same points with the two axes swapped. In fact, the three plots in the upper right portion of the matrix are mirror images of the ones in the lower left part.
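To make the "three plots, not six" count concrete, here is a small Python sketch (Python is not part of SPSS; it is used here only for illustration). It assumes the layout described in this handout: the row variable goes on the vertical axis, the column variable goes on the horizontal axis, and the rows and columns follow the order in which the variables were entered. The function names panel_layout and lower_left are made up for this example.

```python
# Variable order as entered in the "Matrix Variables:" box.
variables = ["sales", "chck", "card"]


def panel_layout(names):
    """Map each off-diagonal (row, column) position to (y variable, x variable).

    Assumption: rows and columns follow the order of `names`, and the
    row variable is plotted on the vertical axis.
    """
    layout = {}
    for r, y_name in enumerate(names):
        for c, x_name in enumerate(names):
            if r != c:  # diagonal cells hold labels, not plots
                layout[(r, c)] = (y_name, x_name)
    return layout


def lower_left(layout):
    """Keep only the panels below the diagonal (row index > column index)."""
    return {pos: pair for pos, pair in layout.items() if pos[0] > pos[1]}


layout = panel_layout(variables)
to_interpret = lower_left(layout)

# List the panels that actually need interpreting.
for (r, c), (y_name, x_name) in sorted(to_interpret.items()):
    mirror = layout[(c, r)]
    print(f"row {r}, column {c}: {y_name} versus {x_name} "
          f"(mirror panel: {mirror[0]} versus {mirror[1]})")
```

With three variables there are six plotted panels but only three distinct pairs, and the same bookkeeping works for any number of variables.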

 

Now, the labels on the axes tell you which variables are plotted in each plot. For example, in the lower left-hand corner, we have a plot of “card” versus “sales.” Looks pretty linear to me, though we may have an outlier (the point farthest right). This isn’t too surprising, as an increase in credit card sales should be associated with an increase in sales.

 

Next to this plot, on the right of it, we have a plot of “card” versus “chck.” There seems to be no recognizable pattern in this plot, and there could be three or four outliers.

 

In the next row up (the middle row), and in the left-most column, we have a plot of “chck” versus “sales.” Again, there seems to be a positive linear association between these two variables. It seems fairly strong, at that. The uppermost point (which is also the point farthest right) could be an outlier.
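If you would like a number to go with a judgment like "fairly strong," Pearson's correlation coefficient (see the related topics at the end of this page) is the usual choice. The Python sketch below is only an illustration and is not something SPSS requires: the function name pearson_r is made up for this example, and the data values are invented, NOT taken from ta02_001.por.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's linear correlation coefficient for two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("a variable with no variation has no correlation")
    return sxy / sqrt(sxx * syy)


# Invented illustration values (hypothetical, not the textbook data).
x_demo = [1, 2, 3, 4, 5]
y_demo = [2.1, 3.9, 6.2, 8.1, 9.8]

r = pearson_r(x_demo, y_demo)
print(round(r, 3))

# Swapping the axes gives the mirror-image plot and the same coefficient.
print(round(pearson_r(y_demo, x_demo), 3))
```

A value near +1 goes with a strong positive linear pattern, and a value near 0 goes with no linear pattern. A single outlier can move the coefficient a good deal, so look at the plot first.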

 

These are the only three plots we need to interpret, since the other three are mirror images of these three.

 

 


 

 

Related topics:

 

Pearson’s (linear) correlation coefficient

 

Simple linear correlation

 

Simple linear regression

________________________________________________________________

David E. Brown

BYU-Idaho                                            brownd@byui.edu

232 Ricks Building                              208-496-1839 voice

Rexburg, ID 83460                              208-496-2005 fax

                Please do not call me at home.