7.6. SPSS-X

7.6.1. INTRODUCTION TO SPSS-X

In industry, users often need a one-time report. Such a report can be written in COBOL, but a simple report-generating program such as SPSS-X is often used instead.

SPSS-X is a statistical package designed to analyze data and produce reports. Because of its design, it is easier to use than a full programming language like COBOL, yet it is very flexible.

7.6.1.1. STARTING AN SPSS-X PROGRAM

* Use Xedit to create the SPSS-X program with a file type of SPSSX.

* Comments are enclosed in /* and */ (like C or REXX comments).

* A CMS FILEDEF is not needed, since the file name is defined in the SPSS-X program. Simply type SPSSX followed by the name of the program. For example, to run the program SAMPLE SPSSX, enter:

SPSSX SAMPLE [Return]

* Program output appears in a file with the program name, but with a type of LISTING. For example, the output of the file "SAMPLE SPSSX" would be a file named "SAMPLE LISTING".

* The main steps in an SPSS-X program are:

1. Define the files.

2. Define the data format.

3. Specify necessary transformation commands to prepare the data.

4. Use a procedure to report, analyze, or summarize the data.

* Refer to the SAMPLE SPSSX program in section 7.6.1.6 for an example of an SPSS-X program.

7.6.1.2. DEFINING INPUT FILES

* Data files are plain files that can be created by Xedit. Each record is referred to as a case.

* A "FILE HANDLE" command assigns an internal handle (an alias) to an input or output file, so that the complete file specification need not be given each time the file is referenced. It also makes programs portable between versions of SPSS-X on computers with different file specifications: if the program is moved from Max to Meena, only the FILE HANDLE command has to be changed.

* Syntax for FILE HANDLE is:

FILE HANDLE handle/NAME = 'FILENAME FILETYPE A'

e.g. (See SAMPLE SPSSX)

FILE HANDLE QWERTY/NAME = 'SAMPLE DATA A'

7.6.1.3. DEFINING DATA FORMAT

* Certain variables, such as $CASENUM, are reserved. Variable names can be no more than 8 characters long, must begin with a letter, and can contain letters, digits, periods, and underscores (no hyphens or spaces).

* A "DATA LIST" command defines the names of the variables to be read from a data file and the positions for each variable in each case (record). The syntax for DATA LIST is:

DATA LIST FILE = handle / variable_name start_column-end_column (form)

Where: form refers to the type of data.

- If not specified, SPSS-X assumes a variable is numeric.

- "(n)" specifies the number of implied decimal places.

- "(A)" represents Alphanumeric, indicating that the variable is a string variable.

e.g. (Refer to SAMPLE SPSSX)

DATA LIST FILE = QWERTY /

NAME 1-10 (A) CODE 11-13

PRICE 14-19 (2) PROVINCE 20-23 (A)

The variables are defined as coming from QWERTY, which is defined by a FILE HANDLE command.

NAME is a string variable read from character position 1 to 10.

CODE is a three digit number at positions 11 to 13.

PRICE is a number read from position 14 to 19. The last 2 digits are the fractional part of the number; this is indicated by the "(2)" following the column numbers.

PROVINCE is a four character string.
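To make the column positions and implied decimals concrete, the following Python sketch mimics what the DATA LIST example does to one record. It is an analogy only, not SPSS-X code: the LAYOUT table, the read_case function, and the sample record are all hypothetical and were invented for illustration.

```python
from decimal import Decimal

# Column layout copied from the DATA LIST example: (name, start, end, form).
# form is "A" for a string, an integer for implied decimal places,
# or None for a plain number.
LAYOUT = [
    ("NAME", 1, 10, "A"),
    ("CODE", 11, 13, None),
    ("PRICE", 14, 19, 2),
    ("PROVINCE", 20, 23, "A"),
]


def read_case(record):
    """Split one fixed-column record into a dictionary of variables."""
    case = {}
    for name, start, end, form in LAYOUT:
        field = record[start - 1:end]  # columns are 1-based and inclusive
        if form == "A":
            case[name] = field.rstrip()
        elif form is None:
            case[name] = int(field)
        else:
            # Implied decimals: the last `form` digits are the fraction.
            case[name] = Decimal(int(field)).scaleb(-form)
    return case


# Hypothetical record: NAME in 1-10, CODE in 11-13, PRICE in 14-19, PROVINCE in 20-23.
record = "WIDGET    042001999SASK"
case = read_case(record)
print(case)
```

Here the six PRICE digits 001999 become 19.99, because the last two digits are taken as the fractional part.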

7.6.1.4. TRANSFORMATION COMMANDS

Transformation commands MUST begin in column 1. They may be indented (e.g. inside an IF-ELSE construct) only if a plus sign (+) is placed in column 1.

7.6.1.4.1. COMPUTE

Arithmetic operations can be performed as in COBOL.

e.g. COMPUTE cost = price + (price * tax)

SPSS-X is interpretive, so a variable is created if it does not already exist.

7.6.1.4.2. IF-ELSE

Transformation commands can be conditionally executed by placing them in an IF-ELSE construct. Remember that an indented command must have a plus sign in column 1. The syntax of IF-ELSE is:

DO IF expression

+ command(s)

ELSE

+ command(s)

END IF

e.g. DO IF province EQ 'SASK'

+ COMPUTE price = price + (price * 0.10)

END IF

Relational and logical operators can be used in IF-ELSE expressions.

Relational Operators              Logical Operators
EQ  equal to                      AND
NE  not equal to                  OR
LT  less than                     NOT
LE  less than or equal to
GT  greater than
GE  greater than or equal to
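For readers who know Python, the relational operators and a DO IF block without an ELSE branch can be sketched as follows. This is an analogy, not SPSS-X code; the RELATIONAL table, the apply_markup function, and the case dictionaries are hypothetical.

```python
import operator

# SPSS-X relational keywords mapped to their Python equivalents.
RELATIONAL = {
    "EQ": operator.eq,
    "NE": operator.ne,
    "LT": operator.lt,
    "LE": operator.le,
    "GT": operator.gt,
    "GE": operator.ge,
}


def apply_markup(case):
    """Mimic a DO IF ... END IF block that raises PRICE by 10% for one province."""
    if RELATIONAL["EQ"](case["PROVINCE"], "SASK"):
        case["PRICE"] = case["PRICE"] + (case["PRICE"] * 0.10)
    # No ELSE branch: cases that fail the test pass through unchanged.
    return case


# Hypothetical cases, invented for illustration.
sask = apply_markup({"PROVINCE": "SASK", "PRICE": 100.0})
alta = apply_markup({"PROVINCE": "ALTA", "PRICE": 100.0})
lower = apply_markup({"PROVINCE": "sask", "PRICE": 100.0})
print(sask["PRICE"], alta["PRICE"], lower["PRICE"])
```

Only the first case is marked up. In this sketch 'sask' and 'SASK' are different strings, so the spelling of a string constant should match the data exactly.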

7.6.1.4.3. SELECT IF

Cases (records) can be selected for further processing, such as printing, if some condition is met. The syntax for SELECT IF is:

SELECT IF expression

Relational and logical operators can be used in the SELECT IF expression.
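The effect of SELECT IF can be pictured as a filter over the cases. The Python sketch below is an analogy only; the select_if function, the cases, and the condition are hypothetical and chosen purely for illustration.

```python
# Hypothetical cases, invented for illustration.
cases = [
    {"NAME": "WIDGET", "PRICE": 19.99, "PROVINCE": "SASK"},
    {"NAME": "GADGET", "PRICE": 5.00, "PROVINCE": "SASK"},
    {"NAME": "SPROCKET", "PRICE": 42.50, "PROVINCE": "ALTA"},
]


def select_if(cases, condition):
    """Keep only the cases for which the condition is true."""
    return [case for case in cases if condition(case)]


# Roughly the idea of: SELECT IF (PROVINCE EQ 'SASK' AND PRICE GT 10)
selected = select_if(
    cases,
    lambda c: c["PROVINCE"] == "SASK" and c["PRICE"] > 10,
)
print([c["NAME"] for c in selected])
```

Only WIDGET satisfies both halves of the condition, so it is the only case that remains.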

7.6.1.4.4. SORT CASES

SORT CASES sorts all cases (records) by the specified keys.

e.g.

SORT CASES BY var1 (A) var2 (D)

where:

var1 and var2 are the variables used as the sort keys.

(A) and (D) indicate the direction: (A) means ascending and (D) means descending.

The direction is optional; ascending is assumed.
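A sort on two keys with different directions can be sketched in Python as shown below. This is an analogy, not SPSS-X code; the sort_cases function and the cases are hypothetical.

```python
# Hypothetical cases, invented for illustration.
cases = [
    {"PROVINCE": "SASK", "PRICE": 5.00},
    {"PROVINCE": "ALTA", "PRICE": 42.50},
    {"PROVINCE": "SASK", "PRICE": 19.99},
    {"PROVINCE": "ALTA", "PRICE": 7.25},
]


def sort_cases(cases, keys):
    """Sort by a list of (variable, direction) pairs; direction is 'A' or 'D'."""
    result = list(cases)
    # Python's sort is stable, so sorting by the least significant key first
    # leaves ties on the major key in minor-key order.
    for variable, direction in reversed(keys):
        result.sort(key=lambda c: c[variable], reverse=(direction == "D"))
    return result


# Roughly the idea of: SORT CASES BY PROVINCE (A) PRICE (D)
ordered = sort_cases(cases, [("PROVINCE", "A"), ("PRICE", "D")])
for case in ordered:
    print(case["PROVINCE"], case["PRICE"])
```

The result lists both ALTA cases first, with the higher price ahead of the lower one, followed by the SASK cases in the same price order.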

7.6.1.4.5. FORMATS

FORMATS defines the print formats of variables. Formats are FORTRAN style, and all numbers default to (F8.2) if they are not defined by the FORMATS command. The syntax is:

FORMATS variable_name (TYPE)

Where:

TYPE is one of the following FORTRAN data types:

Fw - Floating point numeric, leading zeros suppressed

Fw.d - Floating point numeric with decimal, leading zeros suppressed

Zw and Zw.d - Floating point, no zero suppression

COMMAw and COMMAw.d - Floating point, but with commas.

DOLLARw.d - Floating point displayed as dollars.

Aw - String format.

e.g. (Refer to SAMPLE SPSSX)

FORMATS NAME (A10) CODE (F3)

PRICE (DOLLAR6.2) PROVINCE (A4)

Where:

NAME is a 10 character string

CODE is a 3-digit number with no decimals and leading zeros suppressed

PRICE is a numeric variable printed in dollar format with 2 digits to the right of the decimal point. The width of 6 is the total field width, so it must also hold the dollar sign and the decimal point

PROVINCE is a 4 character string
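The format types listed above can be approximated in Python to show what w and d control. This sketch follows the descriptions in this section only and does not claim to reproduce SPSS-X output exactly; the render function is hypothetical, and how SPSS-X itself prints a value too wide for its format is not covered here.

```python
def render(value, kind, w, d=0):
    """Approximate a print format as a string w columns wide (when the value fits)."""
    if kind == "A":
        return str(value).ljust(w)[:w]      # strings are left-justified
    if kind == "Z":
        return f"{value:0{w}.{d}f}"         # leading zeros are kept
    if kind == "F":
        text = f"{value:.{d}f}"
    elif kind == "COMMA":
        text = f"{value:,.{d}f}"
    elif kind == "DOLLAR":
        text = "$" + f"{value:,.{d}f}"
    else:
        raise ValueError(f"unknown format type: {kind}")
    return text.rjust(w)                    # numbers are right-justified


print(repr(render("SASK", "A", 4)))
print(repr(render(42, "F", 3)))
print(repr(render(42, "Z", 5)))
print(repr(render(1234.5, "COMMA", 8, 2)))
print(repr(render(19.99, "DOLLAR", 6, 2)))
print(len(render(1234.56, "DOLLAR", 6, 2)))
```

In this sketch a value such as 19.99 exactly fills a DOLLAR width of 6 once the dollar sign and decimal point are counted, while 1234.56 needs 9 columns.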

7.6.1.5. PROCEDURES

REPORT prints out a detailed report. The syntax is:

REPORT FORMAT = LIST

MARGINS (LEFT,RIGHT) /

VARIABLES = VAR1(WIDTH) 'title1'

VAR2(WIDTH) 'title2' /

TITLE = 'whatever'

Where:

/ (the trailing slash character at the end of a line)

separates each specification.

FORMAT

Sets the general characteristics of the report. Usually default settings are adequate. The characteristics are:

LIST

Tells REPORT to list the variables. Normally REPORT only prints the summaries.

MARGINS

Sets the position of the left and right margins. The default margins are 1 and 132. Asterisks can be used to specify the default settings, or the MARGINS option can be omitted altogether.

VARIABLES

Specifies the information on each report variable. Variables can appear in any order (regardless of their position in the input file).

e.g. (See SAMPLE SPSSX)

REPORT FORMAT = LIST /

VARIABLES = PROVINCE (8) 'Province'

NAME (10) 'Name '

CODE (6) 'Number'

PRICE(6) 'Price' /

NOTE: each column must be wide enough for both the variable and the column title.

7.6.1.6. SAMPLE PROGRAM


/* This is a sample SPSSX program */

/* It is called "SAMPLE SPSSX" in the cs271 Max account */

/* The data file is called "SAMPLE DATA" */

/* Copy it to your account and run it by entering: SPSSX SAMPLE */

/* Look at the resultant output in the file SAMPLE LISTING */

/*==============================================*/

/* FILE DEFINITION SECTION */

FILE HANDLE QWERTY/NAME = 'SAMPLE DATA A'

/*===============================================*/

/* DATA FORMAT SPECIFICATIONS */

DATA LIST FILE = QWERTY / /* <- slash needed here */

NAME 1-10 (A) CODE 11-13

PRICE 14-19 (2) PROVINCE 20-23 (A)

/*===============================================*/

/* TRANSFORMATION COMMANDS */

SORT CASES BY PROVINCE NAME

DO IF PROVINCE EQ 'SASK'

+ COMPUTE PRICE = PRICE + (PRICE * 0.10)

END IF

FORMATS NAME (A10)

CODE (F3)

PRICE (DOLLAR6.2)

PROVINCE (A4)

/*===============================================*/

/* PROCEDURE SECTION */

REPORT FORMAT = LIST

MARGINS (1, 40) / /* <- slash needed before "Variables" */

VARIABLES = PROVINCE (8) 'Province'

NAME (10) 'Name '

CODE (6) 'Number'

PRICE (6) 'Price' / /*<- end Variables with slash */

TITLE = 'Sample SPSS Output'

7.6.2. REPORT WRITING IN SPSS-X

A report is generally more than a simple listing of records. Often 'control breaks and summaries' are required. For example, a listing of personnel records is often generated in groupings of departments with some summary statistics for each department. This type of report can be created fairly easily with SPSS-X.

7.6.2.1. COMPUTE (REVIEW)

Arithmetic operations can be performed as in COBOL using COMPUTE. Because SPSS-X is interpretive, a variable is created if it does not already exist. However, if you wish to print the value of the computed variable, you must specify its print format in the FORMATS command before the Procedure Section.

7.6.2.2. REPORT

7.6.2.2.1. FORMAT SUBCOMMAND

An easy way to get column headings underlined is to include the keyword AUTOMATIC before LIST in the FORMAT subcommand of the REPORT command.

7.6.2.2.2. OUTFILE SUBCOMMAND

It is often desirable to have the output (the report) placed in a separate file; to do this, use the OUTFILE = specification.

e.g.

REPORT FORMAT = AUTOMATIC LIST

MARGINS (1 , 40)

LENGTH ( * , * ) /

OUTFILE = reportname /

VARIABLES = . . .

/

The output would be placed in a file called 'reportname LISTING'.

7.6.2.3. BREAK & SUMMARY SUBCOMMANDS

There are two Report subcommands used to specify control breaks and summary statistic generation. These subcommands are

BREAK = and SUMMARY =

They are placed at the end of the REPORT command in the Procedure section.

The general form of the BREAK subcommand is:

BREAK = variable (width) 'column-heading' /

Note: the slash must be present at the end of the line.

The general form of the SUMMARY subcommand is:

SUMMARY = statistic (variable) 'column-heading' statistic (variable) ... /

Note: the slash must be present at the end of the line.

One or more statistics can be requested for each summary line.

SPSS-X provides functions to generate a variety of statistics such as:

VARIANCE MIN

SUM MAX

MEAN PERCENT

For example, suppose you want a report of personnel grouped by department, with summary statistics of salary for each department. Your REPORT command might look like this:

REPORT FORMAT = AUTOMATIC LIST

MARGINS (1,40)

LENGTH (*,*) /

OUTFILE = DEPLIST /

VARIABLES = NAME (20) 'Employee'

GROSS (10) 'Before'

NET (10) 'After' /

CTITLE = 'Personnel Report' /

BREAK = DEPT (14) 'Department' /

SUMMARY = SUM (GROSS) 'Total Wage' /

SUMMARY = SUM (GROSS) 'Before,After' SUM (NET)
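The idea behind a control break with a summary line can be sketched in Python as follows. This is an analogy, not SPSS-X code: the break_report function and the personnel cases are hypothetical, invented for illustration, and the printed layout is not meant to match REPORT output.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical personnel cases, invented for illustration.
cases = [
    {"DEPT": "SALES", "NAME": "ADAMS", "GROSS": 3000, "NET": 2400},
    {"DEPT": "ACCOUNTING", "NAME": "BROWN", "GROSS": 3500, "NET": 2750},
    {"DEPT": "SALES", "NAME": "CLARK", "GROSS": 2800, "NET": 2250},
]


def break_report(cases, break_var, summary_vars):
    """Return (break value, detail cases, totals) for each group of cases."""
    # A control break needs sorted input, so sort on the break variable first.
    ordered = sorted(cases, key=itemgetter(break_var))
    report = []
    for value, group in groupby(ordered, key=itemgetter(break_var)):
        details = list(group)
        totals = {v: sum(c[v] for c in details) for v in summary_vars}
        report.append((value, details, totals))
    return report


report = break_report(cases, "DEPT", ["GROSS", "NET"])
for dept, details, totals in report:
    print(dept)
    for c in details:
        print(f"    {c['NAME']:<10}{c['GROSS']:>8}{c['NET']:>8}")
    print(f"    {'Total':<10}{totals['GROSS']:>8}{totals['NET']:>8}")
```

Each time the value of DEPT changes, the detail lines for the previous department are closed off with a line of sums. If the cases were not sorted first, the same department could appear in several separate groups.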

7.6.2.4. SPSS-X REPORTING EXAMPLE

/* This program reads temperatures for cities on the various */

/* days of the month and prints out a report of these with */

/* the average monthly temperature.*/

/*==============================================*/

/* FILE DEFINITION SECTION */

FILE HANDLE INFILE/NAME = 'TEMP DATA A'

/*===============================================*/

/* DATA FORMAT SPECIFICATIONS */

DATA LIST FILE = INFILE / /* <- slash needed here */

CITY 1-20 (A) MONTH 21-24 (A)

DAY 25-26 TEMP 27-29

/*===============================================*/

/* TRANSFORMATION COMMANDS */

SORT CASES BY CITY MONTH

COMPUTE FAHR = TEMP*9/5 +32

FORMATS

MONTH (A4)

DAY (F2)

TEMP (F3)

FAHR (F3)

/*===============================================*/

/* PROCEDURE SECTION */

REPORT FORMAT = AUTOMATIC LIST

MARGINS (1, 80) / /* <- slash needed before "Variables" */

OUTFILE = TEMP / /* report -> TEMP LISTING */

VARIABLES =

MONTH (6 ) 'MONTH'

DAY (4) 'DAY'

TEMP (12) 'TEMPERATURE'

FAHR (11) 'FAHRENHEIT' /

BREAK = CITY (21) 'CITY' /

SUMMARY = MEAN(TEMP) MEAN(FAHR)/

TITLE = 'TEMPERATURE SUMMARY'

Output (TEMP LISTING):

                   TEMPERATURE SUMMARY

CITY                  MONTH  DAY  TEMPERATURE  FAHRENHEIT
_____________________ ______ ____ ____________ ___________

REGINA                MAR      19           13          55
                      MAR      20           22          72

Mean                                        18          63

SASKATOON             MAR      19           22          72
                      MAR      20           14          57

Mean                                        18          64
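The figures in the listing can be checked by hand or with a short Python sketch such as the one below. It is an analogy, not SPSS-X code; the fahr function and the variable names are hypothetical, and the cases are copied from the listing.

```python
from collections import defaultdict
from statistics import mean

# Cases taken from the TEMP LISTING output: (CITY, MONTH, DAY, TEMP).
cases = [
    ("REGINA", "MAR", 19, 13),
    ("REGINA", "MAR", 20, 22),
    ("SASKATOON", "MAR", 19, 22),
    ("SASKATOON", "MAR", 20, 14),
]


def fahr(temp):
    """Same arithmetic as COMPUTE FAHR = TEMP*9/5 +32."""
    return temp * 9 / 5 + 32


# Group the (TEMP, FAHR) pairs by city, as BREAK = CITY does.
by_city = defaultdict(list)
for city, month, day, temp in cases:
    by_city[city].append((temp, fahr(temp)))

# One pair of means per city, as SUMMARY = MEAN(TEMP) MEAN(FAHR) requests.
means = {
    city: (mean(t for t, _ in rows), mean(f for _, f in rows))
    for city, rows in by_city.items()
}

for city, (mean_temp, mean_fahr) in means.items():
    print(f"{city:<10} mean TEMP {mean_temp:.1f}  mean FAHR {mean_fahr:.1f}")
```

The unrounded Fahrenheit values are 55.4, 71.6, and 57.2, which print as 55, 72, and 57 under the F3 format. Regina's mean Fahrenheit value is about 63.5, right on a rounding boundary, which may be why the listing shows 63 for Regina but 64 for Saskatoon (about 64.4).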