Classic Computer Magazine Archive COMPUTE! ISSUE 30 / NOVEMBER 1982 / PAGE 80

Statistician

Louis F. Sander
Pittsburgh

For Apple, Atari, and Commodore computers — learn and examine statistics the easy way with this handy program.

Statistician is a useful program for handling and analyzing the statistical data that many of us encounter in our work and home life. Written in formats for several home computers, it can be useful to teachers in analyzing test scores, to businessmen in analyzing sales, to the curious in learning about statistics, or to anyone whose life involves coping with more than a handful of numbers. Any time you're called on to cope with up to 100 numbers or so, call on Statistician and stand by.

The program lets you enter a series of numbers in any sequence. It quickly analyzes them, giving them back to you sorted and grouped, along with their total, their mean and median, and half a dozen other useful statistical measurements. Although the program is self-explanatory, this article explains it further and provides some examples.

Entering Data

Key in your version of the program, and follow us through the screens. The first one asks whether your data are "special" in any way. If you have more than 99 entries to make, you'll have to estimate how many; be generous, because the program won't take any more items than you prepare it for. If your data consist of groups with the same value, e.g., "4 grades of 95" rather than four single entries of 95, you must say so in advance. Likewise, you must let the computer know if your data are a sample from a population, since this makes a difference in calculating standard deviation. Most people won't need this feature, but it's there if you want it.

If there's nothing special about your data, just hit a key and you're on your way. Enter one data item at a time in response to each prompt, and hit RETURN when you're finished. Don't worry about the order in which you enter items, but do be careful, since you cannot change a number once it's been input. If you're entering grouped data, the FREQ entry is the one for the number of occurrences of each item.

Statistical Measurements

Once you have finished, Statistician will quickly give you these seven measurements, then take some time to sort your data: #ENTRIES is merely the number of items or groups you have put into the computer. #DATA is the total number of data points involved; two entries, each with a FREQ of five, would give ten data points. RANGE gives the values of the smallest and largest data points you entered. TOTAL is the sum of all your data. MEAN can be thought of as the average of all the data, and VARIANCE and STD DEV are statistical measures of how far your data spread out from the mean. Any elementary statistics book will explain these terms.
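For readers who would like to see the arithmetic behind these seven measurements, here is a minimal sketch in Python. It is not the Statistician listing; the function name `summarize` and the `sample` flag are made up for illustration. It assumes the usual textbook convention that a sample divides the sum of squared deviations by one less than the number of data points, while a whole population divides by the number of data points, which may or may not match the program's own option exactly.

```python
import math

def summarize(entries, sample=False):
    """entries: list of (value, freq) pairs; ungrouped data use a freq of 1."""
    n_entries = len(entries)
    n_data = sum(freq for _, freq in entries)
    values = [value for value, _ in entries]
    total = sum(value * freq for value, freq in entries)
    mean = total / n_data
    # Sum of squared deviations from the mean, weighted by frequency.
    squares = sum(freq * (value - mean) ** 2 for value, freq in entries)
    # Assumption: a sample divides by n - 1, a whole population by n.
    divisor = n_data - 1 if sample else n_data
    variance = squares / divisor
    return {
        "#ENTRIES": n_entries,
        "#DATA": n_data,
        "RANGE": (min(values), max(values)),
        "TOTAL": total,
        "MEAN": mean,
        "VARIANCE": variance,
        "STD DEV": math.sqrt(variance),
    }

# Two entries, each with a FREQ of five, give ten data points.
result = summarize([(70, 5), (90, 5)])
for name, value in result.items():
    print(name, value)
```

Passing `sample=True` changes only VARIANCE and STD DEV; the other five measurements come out the same either way.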

Your machine can't give you the MEDIAN until it's finished sorting all the data. Most sorts are finished in just a few seconds, but some can take a while. One trial sort of 98 random data items took 47 seconds, not too shabby when compared to a manual sort. The MEDIAN item in a group of data is the one in the middle when the items are arranged in order from smallest to largest; half the members are above the median, and half are below, as anyone who's been "graded on a curve" will readily tell you. If there is an even number of data items, it's possible that the median falls between two items. When that happens, Statistician splits the difference between them and tells you that it did so.
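The median rule described above can be sketched in a few lines of Python. This is an illustration only: the name `median_of` is hypothetical, Python's built-in sort stands in for whatever sorting routine the program uses, and the list is assumed to hold at least one item.

```python
def median_of(entries):
    """entries: list of (value, freq) pairs. Returns (median, was_split)."""
    # Expand the groups into one sorted list of individual data points.
    data = sorted(value for value, freq in entries for _ in range(freq))
    n = len(data)
    middle = n // 2
    if n % 2 == 1:
        # Odd count: the middle item is the median.
        return data[middle], False
    lower, upper = data[middle - 1], data[middle]
    # Even count: split the difference between the two middle items.
    # was_split is True only when those two items actually differ.
    return (lower + upper) / 2, lower != upper

print(median_of([(3, 1), (1, 1), (2, 1)]))          # (2, False)
print(median_of([(1, 1), (2, 1), (3, 1), (4, 1)]))  # (2.5, True)
print(median_of([(95, 4)]))                         # (95.0, False)
```

The last call shows why an even count does not always force a split: with four grades of 95, the two middle items are the same number.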

As soon as the median is calculated, the program displays the data items in sequence from low to high and shows the frequency of occurrence of each. It also shows the cumulative frequency, in case you want to know something like the 20th item from the bottom of the list. In cases where you have more data than will fit on one screen, you can page through it as many times as you wish.
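To see how a cumulative frequency column answers a question like "what is the 20th item from the bottom?", consider this short Python sketch. The names `frequency_table` and `kth_from_bottom` are invented for the illustration and are not taken from the program.

```python
from collections import Counter
from itertools import accumulate

def frequency_table(data):
    """Return rows of (value, frequency, cumulative frequency), low to high."""
    counts = Counter(data)
    values = sorted(counts)
    freqs = [counts[v] for v in values]
    return list(zip(values, freqs, accumulate(freqs)))

def kth_from_bottom(table, k):
    """Return the value of the k-th smallest data point (k starts at 1)."""
    if k < 1:
        raise ValueError("k must be 1 or more")
    # The first row whose cumulative frequency reaches k holds the item.
    for value, _, cumulative in table:
        if k <= cumulative:
            return value
    raise ValueError("k exceeds the number of data points")

table = frequency_table([3, 1, 2, 3, 3, 2])
for row in table:
    print(row)                      # (1, 1, 1) then (2, 2, 3) then (3, 3, 6)
print(kth_from_bottom(table, 4))    # 3
```

The fourth item from the bottom is a 3 because the cumulative frequency first reaches four on the row for the value 3.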

Examples

So much for the explanations; let's try some examples. The three which follow will illustrate some of Statistician's uses. I hope they will amuse you, and convince you of some of the advantages of computing.

  • Example 1. These are the prices of the computer accessories on Bill Boole's birthday wish-list: $75, $95, $80, $22.50, $149, $10.95, $195, $19.95, $29.95, $55, $5.95. What is the average price of these goodies? Although some of the items are expensive, some are quite reasonably priced. In fact, half of them cost less than what amount? How much would it take to buy everything on the list?
  • Example 2. These are the ages of the cars parked in the main lot at CD Computer Store:
    Age   10   5   4   3   2   1
    Cars   1   2   5   8   4   3
    
    What is the average age of the cars?
  • Example 3. Here are some numbers from the throws of a single die: 6, 5, 3, 4, 6, 1, 2, 3, 1, 6, 1, 4, 2, 4, 2, 5, 4, 4, 6, 1, 4, 5, 4, 1, 3, 5, 4, 2, 5, 6. What is the total of these throws, what is the mean throw, and what is the standard deviation of this group of throws? If you were Bill Boole, and your birthday was six months off, which number do you wish you had been betting on?
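If you have Python at hand, the three examples can be cross-checked with its standard `statistics` module. This is a sketch for comparison, not part of Statistician. It assumes the die throws are treated as a complete population, so it uses `statistics.pstdev`; `statistics.stdev` would give the sample form instead. The variable names are illustrative.

```python
import statistics

# Example 1: prices on the birthday wish-list.
prices = [75, 95, 80, 22.50, 149, 10.95, 195, 19.95, 29.95, 55, 5.95]
price_total = sum(prices)
price_mean = statistics.mean(prices)
price_median = statistics.median(prices)

# Example 2: grouped data, entered as (age, number of cars).
cars = [(10, 1), (5, 2), (4, 5), (3, 8), (2, 4), (1, 3)]
car_count = sum(n for _, n in cars)
mean_age = sum(age * n for age, n in cars) / car_count

# Example 3: throws of a single die.
throws = [6, 5, 3, 4, 6, 1, 2, 3, 1, 6, 1, 4, 2, 4, 2,
          5, 4, 4, 6, 1, 4, 5, 4, 1, 3, 5, 4, 2, 5, 6]
throw_total = sum(throws)
throw_mean = statistics.mean(throws)
throw_std = statistics.pstdev(throws)   # population form (assumption)
most_common = statistics.mode(throws)   # the number thrown most often

print("Example 1:", round(price_total, 2), round(price_mean, 2), price_median)
print("Example 2:", car_count, round(mean_age, 2))
print("Example 3:", throw_total, round(throw_mean, 2),
      round(throw_std, 2), most_common)
```

Run it after working the examples through Statistician and compare the two sets of answers.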