7.5. Cobol

7.5.1. INTRODUCTION TO COBOL

7.5.1.1. OVERVIEW

COBOL is an acronym for COmmon Business Oriented Language. The design philosophy is that Cobol programs should be easy to create/modify. Statements such as MOVE and ADD are designed to make the program self-documenting. The language as a whole is very ''English-like". There are: Divisions, Sections, Paragraphs, and Sentences with clauses. Periods are used for punctuation.

Every COBOL program consists of four separate divisions, each with a specific function:

IDENTIFICATION DIVISION Identifies the program to the computer.

It also provides some documentation

about the program.

ENVIRONMENT DIVISION Describes the specific computer equipment

that will be used by the program.

DATA DIVISION Describes the input and output formats

to be used by the program. It also defines

any constants and work areas

necessary for the processing of data.

PROCEDURE DIVISION Contains the instructions necessary for

reading input, processing it, and creating

output.

Structured Cobol Programming by Stern & Stern

Cobol programmers are aware of this basic structure but most simply use an existing program as a template.

7.5.1.2. FORMAT

There are a few formatting restrictions to remember.

Column 7 An asterisk in column 7 indicates a comment on that line.

Column 8 This is referred to as "AREA A". Division, Section, and Paragraph names must begin here. All these names must be terminated with a period. Certain items in the Data Division must also begin in Area A.

Column 12 This is referred to as "AREA B". All executable statements in the Procedure Division must start here.Statements must end with a period. Also starting in Area B are file description components such as RECORD CONTAINS, and elementary data items.

Example:

7.5.1.3. VARIABLE DECLARATIONS

Variables (defined in the Data Division) are described as being either GROUP or ELEMENTARY data items. Level numbers are assigned to everything.

e.g.

01 ADDRESS

05 STREET-NUMBER

05 STREET

"01" is used for a group item (which must begin in Area A). Elementary items begin in Area B; the level numbers are selected by the user - usually in increments of 5 to allow room for further subdivision if it becomes necessary.

Variables for files are defined in the Data Division in either the File Section or the Working-Storage Section. However it's considered poor form to place a detailed variable description in the File Section. Since this is actually just a buffer area, the contents are not included in a dump if the program crashes.

Elementary data items are defined in a PICture clause as being alphabetic (A), alphanumeric (X), or numeric (9). Plain alphabetic is rarely used.

e.g.

01 ADDRESS.

05 STREET-NUMBER PIC 9(04).

05 STREET PIC X(15).

The digit (9) or letter (X) can be repeated to indicate the size of an item

e.g. PIC XX or PIC 999

or a size can be specified in parentheses.

e.g. PIC X(02) or PIC 9(03)

The latter is preferable because it is easier to add up the sizes - if a record contains 80 characters then the sum of elementary items should be 80.

Initialization can occur at declaration time by adding a VALUE clause.

e.g. 05 STREET PIC X(15) VALUE SPACES.

"SPACES" is a reserved word - there are several in Cobol; refer to any text for a list.

Data items can also be described by editing characters for output.

e.g. 05 SOME-VALUE PIC ZZ9.

indicating a 3 digit numeric field with leading zeroes suppressed.

Floating point numbers are designated by placing a decimal point in the PICture clause.

e.g. 05 MONEY-VALUE PIC 9(03).99.

7.5.1.4. EXECUTABLE STATEMENTS

Executable code is placed in the Procedure Division. Statements are grouped logically into paragraphs. Typically, statements in the main paragraph would be of the form

PERFORM para-name.

The PERFORM verb is analogous to a GOSUB in Basic. Adding "n TIMES" would cause the repeated execution of the paragraph.

e.g. PERFORM ADD-PARA 10 TIMES.

Refer to any Cobol text for a list of COBOL verbs.

e.g. ADD INPUT-VALUE TO SUM-VALUE.

If you are mathematically inclined you can use the COMPUTE verb.

e.g. COMPUTE expression

The STOP RUN statement indicates program termination. It can be placed anywhere in the program.

7.5.1.5. INPUT / OUTPUT

Under IBM CMS you must use FIledef's at the system level to perform I/O. This type of thing is NOT required when running under VAX VMS.

Terminal I/O

Use the ACCEPT and DISPLAY verbs. Refer to the sample program attached.

File I/O

A SELECT statement in the Environment Division is used to associate the Cobol File Description with a logical filename. Select statements are generally implementation specific

- otherwise little modification is required to port a program from one system to another.

In the Procedure Division, you must OPEN a file for INPUT or OUTPUT, and CLOSE all files at the end of the program.

READ and WRITE statements are used to access records. The saying is that you "READ file and WRITE A record". Refer to the attached sample program.

7.5.1.6. COMPILING AND RUNNING

Many compilation errors are caused by AREA A violations. i.e. something is there that shouldn't be - so move it to AREA B. Often, one minor syntactical error can produce a long list of other 'errors'. e.g. Try omitting the hyphen from "Working-Storage"! Clean up the first obvious errors and generally the rest will disappear.

There are two common errors that cause a run-time error usually defined as:

ILLEGAL DECIMAL DATA

These errors are division by zero or a subscript out of range.

Other common errors are caused by inaccuracies in file descriptions.

- the logical name in the Cobol Select must match that in the CMS FIledef

- if the Cobol FD states that the RECORD CONTAINS 80 characters then the related description must total 80

- all files must be OPEN'd for either INPUT or OUTPUT


7.5.1.7. EXAMPLES

IDENTIFICATION DIVISION.

PROGRAM-ID. TEST1.

DATE-WRITTEN. SEPT. 1989.

DATE-COMPILED. 12/12/12.

*

*****************************************************

*

* PURPOSE: TO COMPUTE THE AVERAGE OF 10 NUMBERS

* INPUTS: TEN INTEGERS ACCEPTED FROM TERMINAL

* OUTPUTS: AVERAGE DISPLAYED ON THE TERMINAL

* PROCESSING: 1) REPEAT INPUT AND ADD PARAGRAPH

* 10 TIMES

* 2) COMPUTE THE AVERAGE

* 3) DISPLAY THE AVERAGE

*****************************************************

*

ENVIRONMENT DIVISION.

*

CONFIGURATION SECTION.

SOURCE-COMPUTER. IBM-4381.

OBJECT-COMPUTER. IBM-4381.

DATA DIVISION.

*

WORKING-STORAGE SECTION.

*

01 INPUT-FIELD.

05 INPUT-VALUE PIC 99 VALUE ZERO.

01 CALCULATION-FIELD.

05 SUM-VALUE PIC 9(03) VALUE ZERO.

05 AVERAGE-VALUE PIC 9(03)V99 VALUE ZERO.

01 OUTPUT-FIELD.

05 EDIT-FIELD PIC ZZ9.99 VALUE ZERO.

PROCEDURE DIVISION.

*

1000-MAIN.

PERFORM 2000-INPUT-ADD 10 TIMES.

DIVIDE 10 INTO SUM-VALUE GIVING AVERAGE-VALUE.

PERFORM 3000-OUTPUT-AVERAGE.

STOP RUN.

*

2000-INPUT-ADD.

DISPLAY "TYPE IN AN INTEGER (2 DIGITS)".

ACCEPT INPUT-VALUE FROM SYSIN.

ADD INPUT-VALUE TO SUM-VALUE.

3000-OUTPUT-AVERAGE.

MOVE AVERAGE-VALUE TO EDIT-FIELD.

DISPLAY "THE AVERAGE IS ", EDIT-FIELD.

IDENTIFICATION DIVISION.

PROGRAM-ID. DUMMY.

ENVIRONMENT DIVISION.

CONFIGURATION SECTION.

INPUT-OUTPUT SECTION.

FILE-CONTROL.

SELECT FD-IN-NAME ASSIGN TO SYS010-FILIN.

SELECT FD-OUT-NAME ASSIGN TO SYS011-FILOUT.

DATA DIVISION.

FILE SECTION.

FD FD-IN-NAME LABEL RECORD STANDARD

RECORDING MODE F.

01 IN-NAME-REC PIC X(80).

FD FD-OUT-NAME LABEL RECORD STANDARD

RECORDING MODE F.

01 OUT-NAME-REC.

05 PRINT-CONTROL PIC X(01).

05 OUT-NAME-CHARS PIC X(79).

WORKING-STORAGE SECTION.

01 FLAGS.

05 EOF-FLAG PIC X(03).

01 HEADER PIC X(80) VALUE

" CS271 TEST OUTPUT FILE LISTING ".

PROCEDURE DIVISION.

MAIN-PARA.

MOVE "OFF" TO EOF-FLAG.

OPEN INPUT FD-IN-NAME OUTPUT FD-OUT-NAME.

READ FD-IN-NAME AT END MOVE "SET" TO EOF-FLAG.

WRITE OUT-NAME-REC FROM HEADER.

PERFORM READ-PARA UNTIL EOF-FLAG = "SET".

CLOSE-PARA.

CLOSE FD-IN-NAME FD-OUT-NAME.

DISPLAY "PROGRAM COMPLETED".

STOP RUN.

READ-PARA.

MOVE IN-NAME-REC TO OUT-NAME-CHARS.

WRITE OUT-NAME-REC.

READ FD-IN-NAME AT END MOVE "SET" TO EOF-FLAG.

7.5.2. COBOL STRING HANDLING

A programmer must be able to satisfy two goals:

1) - to store data in an efficient manner,

2) - to provide an easy-to-read format for the user.

For example, a telephone number is usually displayed with a hyphen after the area code and the prefix. However the number does not need to be stored with hyphens. In Cobol, a programmer can use the STRING and UNSTRING statements to satisfy both goals: efficient data storage, and easily read output formats.

7.5.2.1. COBOL STRING STATEMENT

The STRING statement allows you to put together information from different fields. The general form is:

STRING dataname1 (or literal) DELIMITED BY dataname2 (or literal) (or SIZE)

INTO dataname3 [ON OVERFLOW statement].

The "ON OVERFLOW" clause is optional, it specifies what should happen if the receiving field (dataname3) is too small.

e.g. 1

FLD-A PIC X(08) VALUE "ABCD ".

FLD-B PIC X(10) VALUE "123 45678 ".

FLD-C PIC X(10) VALUE SPACES.

STRING FLD-A FLD-B DELIMITED BY SPACE INTO FLD-C.

After execution of the string statement:

FLD-C contains "ABCD123 "

e.g. 2

I-1 PIC X(05) VALUE "ABC ".

I-2 PIC X(05) VALUE "12-34".

I-3 PIC X(08) VALUE SPACES.

STRING I-1 DELIMITED BY SIZE

I-2 DELIMITED BY "-"

INTO I-3.

After execution of the string statement:

I-3 contains "ABC 12 "

Delimited by SIZE indicates that the whole field should be taken.

Note that the literal "-" does not become part of the new string.

7.5.2.2. COBOL UNSTRING STATEMENT

The UNSTRING statement separates the contents of one field. The general form is:

UNSTRING dataname1

DELIMITED BY [ALL] dataname2 (or literal)

INTO dataname3 ...

[ON OVERFLOW statement]

Delimited by ALL means that more than one delimiter is treated as only one. So three spaces are treated as one delimiter.

e.g. 1

FLD-1 PIC X(11) VALUE "123-45-6789".

A PIC 9(03).

B PIC 9(02).

C PIC 9(04).

UNSTRING FLD-1 DELIMITED BY "-" INTO A B C.

After execution of the unstring statement:

A contains 123

B contains 45

C contains 6789

e.g. 2

X PIC X(10) VALUE "123,,45 ".

P PIC 9(03).

Q PIC 9(03).

R PIC 9(03).

UNSTRING X DELIMITED BY "," INTO P Q R.

After execution of the unstring statement:

P contains 123

Q contains 000

R contains 45

Note the zero fill in the second destination field.

e.g. 3

NAME PIC X(17) VALUE "JOHN DOE ".

FIRST PIC X(20) VALUE SPACES.

LAST PIC X(20) VALUE SPACES.

UNSTRING NAME DELIMITED BY ALL SPACES INTO FIRST LAST.

After execution of the unstring statement:

FIRST contains "JOHN "

LAST contains "DOE "

7.5.2.3. CONCLUSION

The Cobol string and unstring statements may be used to replace many Move statements. They also provide additional flexibility in that you can move only part of a field up to a specified delimiter.

7.5.3. SUBSCRIPTING VS INDEXING IN COBOL

A program is often written to process information stored in arrays. For very large arrays, an inefficient method of referencing arrays can be quite costly.

Tables (arrays) in Cobol are referenced either by "subscripts" or "indices". Subscripts are easier to manipulate than indices but they are only offsets into tables and must be converted to addresses each time an array element is referenced, this creates extra overhead in execution time.

7.5.3.1. SUBSCRIPTS

* They are numeric values; offsets for tables

* The address of an element is computed during execution

* Any numeric variable can be used for any table i.e. the same subscript can be used for more than one array

* Syntax:

05 ROW-A OCCURS 3 TIMES.

10 A PIC 9 OCCURS 3 TIMES.

05 I PIC 9.

05 J PIC 9.

PERFORM INIT-A

VARYING I FROM 1 BY 1 UNTIL I > 3

AFTER J FROM 1 BY 1 UNTIL J > 3.

INIT-A.

MOVE ZERO TO A(I, J).

Note that a space is required after the comma in the table reference. i.e. A(I, J). If the space is not present, a compilation error results.

7.5.3.2. INDICES

* They hold the address of array elements and are therefore faster

* Tables are declared as being INDEXED BY index-variable, there is no need for a separate declaration of "index-variable" i.e it does NOT have its own PIC

* Can only be changed by using perform/varying/until or set verbs

* Can only be used to index the table they were associated with

* Syntax:

                 05 ROW-A OCCURS 3 TIMES INDEXED BY I.

10 A PIC 9 OCCURS 3 TIMES INDEXED BY J.

* 05 I PIC 9. these are no longer

* 05 J PIC 9. necessary

SET I TO 1.

SET J TO 1.

PERFORM INIT-A UNTIL I > 3.

INIT-A.

MOVE ZERO TO A(I, J).

* indices must be manually incremented

SET J UP BY 1.

IF J > 3 THEN

SET I UP BY 1

SET J TO 1.

7.5.3.3. SUBSCRIPT/RANGE CHECKING

If you compile with the SSRANGE option then the Cobol compiler will tell you if you try to access a subscript index that is outside the range of the table. For example if you have a 10 by 10 table and you try to access row 11 you will get an error.

7.5.4. Table Searching in cobol

Data retrieval is an important facet of information processing. Often valid values are stored in tables. These tables then have to be accessed to see if a particular item is in the table (and therefore is valid). COBOL supplies two forms of a "search" verb that will examine the contents of a table looking for a particular value. One form of this verb can be faster (therefore less costly) than the other.

7.5.4.1. SEARCH VERB

The SEARCH statement searches for a table element that satisfies a particular condition. It sets the value of the index to point to that particular table element.

The 2 forms of the SEARCH verb are:.

SEARCH - which performs a sequential search through the file (starts at one end and compares every element to the one being looked for until it encounters the end of the table).

SEARCH ALL - which performs a binary search of a file (starts in the middle, and keeps dividing the file in half until it finds the correct element or it cannot divide any further.) Note that the file must be sorted prior to the application of "search all".

The SEARCH ALL is usually faster than the SEARCH.

7.5.4.2. SEARCH (SEQUENTIAL)

Syntax:

SET index to 1.

SEARCH table [VARYING pointer]

[AT END statement1]

WHEN condition

statement2.

- 'table' is the name of a table.

- 'pointer' is an index-name associated with the table, or a data item USAGE INDEX, or an elementary numeric data item with no positions to the right of the assumed decimal point.

- 'statement1' is a COBOL statement that is executed if the end of the table is reached without the 'condition' being satisfied.

- 'condition' is any conditional expression that terminates the searching of the table - 'statement2' is any valid COBOL statement that is performed when the condition defined in the "WHEN" clause is satisfied.

7.5.4.3. SEARCH ALL (BINARY)

- Syntax:

SEARCH ALL table [AT END statement1]

WHEN condition

statement2.

- 'table' is the name of the table.

- 'statement1' is a COBOL statement that is executed if the end of the search is reached without 'condition' being satisfied.

- 'condition' any conditional expression that terminates the searching of the table.

- 'statement2' is any valid COBOL statement that is performed when 'condition' is satisfied.

- SEARCH ALL requires the table to be sorted according to the search key. (either ascending or descending)

- the SEARCH ALL statement ignores the current setting of the index and varies it as required for the search.

- if the 'condition' is satisfied, (the element is found) then the index points to the element that satisfied the 'condition'.

- if the 'condition' is not satisfied, (the element is not found) then the value of the index is unpredictable.

7.5.4.4. EXAMPLE

01 MEMBER-TABLE.

05 MEMBERS

OCCURS 25 TIMES

ASCENDING KEY IS MEM-NUM

INDEXED BY I.

10 MEM-NAME PIC X(20).

10 MEM-NUM PIC 9(15).

PROCEDURE DIVISION.

* Sequential Search

SET I TO 1.

SEARCH MEMBERS VARYING I

AT END DISPLAY "MEMBER NOT FOUND"

WHEN MEM-NUM(I) = user-query

DISPLAY "NAME IS ", MEM-NAME(I).

* Binary Search

SEARCH ALL MEMBERS

AT END DISPLAY "MEMBER NOT FOUND"

WHEN MEM-NUM(I) = user-query

DISPLAY "NAME IS ", MEM-NAME(I).