The Minimum You Need to Know Before Reading COBOL Source Code
· Go Komura · COBOL, Legacy Technology, Business Systems, Maintenance, Mainframe
Handovers, incident response, maintaining a vendor package. In situations like these, one day a pile of COBOL source code suddenly lands on your desk.
- The file names end in `.cbl` or `.cpy`
- The variable names are all uppercase
- Rows of `01`, `05`, `77`, `88`
- Things like `PIC S9(7)V99 COMP-3` appear, somewhere between an incantation and accounting software
- And it is full of `COPY`, so the file you opened does not even show you the whole picture
Around this point your brain turns slightly to powder.
But the map you need to read it is not that big. COBOL varies between compilers and products, yet the skeleton you should grasp first when reading an existing business system is largely the same everywhere. With IBM-style and typical business COBOL in mind, this article lays out the minimum set for people who suddenly have to read COBOL source code.
1. The Conclusion First (In One Breath)
Putting it rather crudely up front, but in a way that actually helps in practice:
- Before being a language of logic, COBOL is very much a language of record definitions
- Reading only the `PROCEDURE DIVISION` gives you half the story. Look at the `DATA DIVISION` first
- `PIC` is the shape of an item; `USAGE` is how it is represented
- `COMP-3` is packed decimal. It shows up constantly in the world of amounts and counts
- `88` is not a separate variable; it is a condition-name attached to values of the item it is defined under
- `REDEFINES` is a mechanism for viewing the same memory in a different shape. It is not a copy
- If there is a `COPY`, the source you are looking at is not yet complete. You cannot see the whole until you open the copybooks
- If you can follow `PERFORM`, `IF`, `EVALUATE`, `READ`, `WRITE`, and `CALL`, you can grasp most of the flow
- Old source is fixed format, where column positions carry meaning. The whitespace you see is not just decoration[^1]
In short: DIVISION, PIC, USAGE, COMP-3, REDEFINES, OCCURS, 88, COPY, PERFORM. Once you can read these, your odds of getting lost drop considerably.
2. Think of COBOL First as a Language About the Shape of Data
If you read it with C# or Java instincts, you will first want to chase the ifs, fors, and function calls.
But with COBOL, before going there, it is faster to grasp “what records does this program receive, what records does it produce, and what buffers does it hold?”
A typical business COBOL program flows roughly like this:
- Read records from a file or DB
- Move them into `WORKING-STORAGE` items
- Branch on conditions
- Repack them into another record
- Write them out
In other words, layout tends to come before algorithm.
For example, here is a typical skeleton.
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SAMPLE01.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT SALES-FILE ASSIGN TO ...
       DATA DIVISION.
       FILE SECTION.
       FD  SALES-FILE.
       01  SALES-REC.
           05  SALE-ID             PIC 9(8).
           05  SALE-AMOUNT         PIC S9(7)V99 COMP-3.
       WORKING-STORAGE SECTION.
       01  WS-EOF                  PIC X VALUE 'N'.
           88  EOF                 VALUE 'Y'.
       PROCEDURE DIVISION.
           PERFORM UNTIL EOF
               READ SALES-FILE
                   AT END
                       SET EOF TO TRUE
                   NOT AT END
                       PERFORM PROCESS-SALE
               END-READ
           END-PERFORM
           STOP RUN.
When reading this code, the first things to look at are the type of SALE-AMOUNT and the meaning of EOF — before the PERFORM.
Read COBOL in that order and it suddenly goes quiet.
3. Look at the Four DIVISIONs First
COBOL source is first divided into four large DIVISIONs.
| DIVISION | What to look at first |
|---|---|
| `IDENTIFICATION DIVISION` | Program name, old comments, provenance |
| `ENVIRONMENT DIVISION` | Files, external resources, I/O assumptions |
| `DATA DIVISION` | Record definitions, working areas, parameters |
| `PROCEDURE DIVISION` | The actual processing steps |
The especially important parts are these.
- `FILE SECTION`: contains the record definitions for input/output files
- `WORKING-STORAGE SECTION`: contains everyday variables, flags, counters, work buffers
- `LOCAL-STORAGE SECTION`: may contain areas re-initialized on each invocation
- `LINKAGE SECTION`: may contain parameters passed in from outside, the receiving end of a subprogram
If you see a LINKAGE SECTION and PROCEDURE DIVISION USING ..., there is a strong chance the program is not self-contained — it runs on data received from outside.
4. Do Not Be Intimidated by the Look of Fixed Format
In old COBOL, the column positions themselves in a source line carry meaning. If you look at it without knowing this, you will never figure out “why is there this weird margin on the left?”[^1]
In fixed format, roughly:
- Columns 1 - 6: sequence number
- Column 7: indicator
- Columns 8 - 11: Area A
- Columns 12 - 72: Area B
Column 7 is especially important.
- `*` or `/`: comment line
- `-`: continuation line
- `D`: debugging line
- `*>`: a comment that can also appear mid-line. Unlike the others, this marker is not tied to column 7
To lower the visual pressure, a very rough sketch looks like this.
    1234567 8901 23456789012345678901234567890
          * comment
            IDENTIFICATION DIVISION.
            PROGRAM-ID. SAMPLE01.
The whitespace here is not “formatting” in the modern sense; parts of it are syntax. Converting tabs in an editor, shifting things left, or copy-pasting carelessly will simply break it. When looking at old source, first question whether the file is fixed format or free format. Run a modern formatter over fixed-format code and it blows up quite spectacularly.
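To make the column rules above concrete, here is a minimal fixed-format sketch. The program name `FIXEDFMT` and the item `WS-COUNT` are made up for illustration, and the `*>` inline comment assumes a compiler that accepts it (newer compilers generally do, older ones may not). The ruler comment on the first line puts its digits at columns 10, 20, 30, and so on.

```cobol
      *--1----+----2----+----3----+----4----+----5----+----6----+----7--
      * "*" in column 7: the whole line is a comment.
       IDENTIFICATION DIVISION.
       PROGRAM-ID. FIXEDFMT.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-COUNT                PIC 9(4) VALUE 0.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * Headers and paragraph names start in Area A (columns 8-11).
      * Ordinary statements start in Area B (columns 12-72).
           ADD 1 TO WS-COUNT       *> inline comment, newer compilers
      D    DISPLAY 'DEBUG LINE ACTIVE'
           DISPLAY 'COUNT=' WS-COUNT
           STOP RUN.
```

Without debugging mode switched on, the `D` line is treated like a comment, so the program should print only `COUNT=0001`. Shifting any of these lines left by even one column changes which area the text falls in, which is exactly why reformatting tools are dangerous here.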
5. The Bare Minimum of the DATA DIVISION
5.1 Level Numbers
COBOL data definitions build their hierarchy with level numbers, not indentation.[^2]
       01  WS-ORDER.
           05  WS-ORDER-ID         PIC 9(8).
           05  WS-AMOUNT           PIC S9(7)V99 COMP-3.
           05  WS-STATUS           PIC X.
               88  WS-OK           VALUE '0'.
               88  WS-ERROR        VALUE '9'.
       77  WS-COUNT                PIC 9(4).
At minimum, remembering just this much is enough.
- `01`: top-level record or group forming one unit
- `02`-`49`: the levels below it
- `77`: an independent elementary item
- `88`: condition-name. Attaches a name to a value of the item it is defined under[^3]
- `66`: for `RENAMES`. You will not run into it often, but it exists
What matters is not to think of 88 as a separate bool variable.
There is no separate storage area called WS-OK; rather, when WS-STATUS is '0', it can be read under the name WS-OK — that is the feel of it.
One more important thing: it is the level numbers, not the whitespace, that determine the hierarchy.
The visual indentation is a useful hint, but what you should ultimately trust is the 01 / 05 / 10 / 88.[^2]
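The point about `88` having no storage of its own can be checked with a small sketch like the one below. It reuses the `WS-ORDER` layout from above; the program name `LEVEL88` and the initial values are assumptions added for the demonstration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. LEVEL88.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-ORDER.
           05  WS-ORDER-ID         PIC 9(8)  VALUE 12345678.
           05  WS-STATUS           PIC X     VALUE '9'.
               88  WS-OK                     VALUE '0'.
               88  WS-ERROR                  VALUE '9'.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * IF WS-ERROR is shorthand for IF WS-STATUS = '9'.
           IF WS-ERROR
               DISPLAY 'BEFORE: ERROR, STATUS=' WS-STATUS
           END-IF
      * SET stores '0' into WS-STATUS; WS-OK has no storage itself.
           SET WS-OK TO TRUE
           DISPLAY 'AFTER: STATUS=' WS-STATUS
           STOP RUN.
```

This should print `BEFORE: ERROR, STATUS=9` and then `AFTER: STATUS=0`. When you meet `IF WS-OK` in real code, the thing to look up is therefore `WS-STATUS` and every place that moves a value into it.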
5.2 PICTURE
PIC expresses the shape of an item.
The ones you will see most often are these.
| Notation | Rough meaning |
|---|---|
| `X` | Character |
| `9` | Digit |
| `S` | Signed |
| `V` | Decimal point exists only logically |
| `X(10)` | 10 characters |
| `9(5)` | 5-digit number |
| `S9(7)V99` | Signed, 7 integer digits + 2 decimal digits |
For example:
- `PIC X(10)` → 10 characters
- `PIC 9(5)V99` → 5 integer digits + 2 decimal digits
- `PIC S9(7)V99` → signed, 7 integer digits + 2 decimal digits
The especially important one here is V.
V holds no actual . character.
PIC 9(5)V99 is treated as “a number with 2 decimal places,” but there is no dot character in the data.
So if you interpret a file or a dump as “the string you see,” you will almost always trip.
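A short sketch can show that `V` really stores no dot. The item names, the program name `IMPLDEC`, and the edited picture `ZZ,ZZ9.99` are all hypothetical; the `REDEFINES` is used only to look at the raw characters of the numeric item.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. IMPLDEC.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-PRICE                PIC 9(5)V99.
       01  WS-PRICE-BYTES REDEFINES WS-PRICE
                                   PIC X(7).
       01  WS-PRICE-EDITED         PIC ZZ,ZZ9.99.
       PROCEDURE DIVISION.
       MAIN-PARA.
           MOVE 123.45 TO WS-PRICE
      * The stored characters contain no dot: 0012345
           DISPLAY 'STORED: ' WS-PRICE-BYTES
      * An edited picture inserts a real "." (123.45, space padded).
           MOVE WS-PRICE TO WS-PRICE-EDITED
           DISPLAY 'EDITED: ' WS-PRICE-EDITED
           STOP RUN.
```

The seven stored characters are `0012345`; the position of the decimal point lives only in the `PIC` clause. A file record built from `WS-PRICE` carries those seven digits and nothing else.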
5.3 USAGE / DISPLAY / COMP / COMP-3
If PIC is the shape, USAGE is the representation in which the item is held.
At minimum, grasping just the following gets you a long way.[^4][^5]
| Notation | Rough meaning | Caution when reading |
|---|---|---|
| `DISPLAY` | External decimal, visible as characters | On a mainframe this may assume EBCDIC[^6] |
| `COMP` / `BINARY` | Binary | The visible digit count and the internal representation are different things |
| `COMP-3` / `PACKED-DECIMAL` | Packed decimal | Looks broken if you read it as characters |
For example:
       01  WS-AMOUNT-DISP          PIC S9(7)V99.
       01  WS-AMOUNT-BIN           PIC S9(7)    COMP.
       01  WS-AMOUNT-PACK          PIC S9(7)V99 COMP-3.
All three are “numbers,” but they hold their contents differently.
What pays off most in practice is your reflex the instant you see COMP-3.
- That is packed decimal
- Probably an amount, a tax figure, a count, or a rate
- It is supposed to look broken when viewed as text
- Eyeballing it in a CSV/UTF-8 frame of mind will cause an accident
Holding on to that understanding makes you much less likely to panic needlessly at how dumps and binary files appear.
One more small note: DISPLAY does not necessarily mean an ASCII string.
On z/OS systems EBCDIC is the assumption, so even when digits appear as characters, the byte values can differ from ASCII '0' - '9'.[^6]
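One way to see that the three items above hold their contents differently is to ask the compiler how many bytes each occupies. This is a sketch only: the program name `USAGELEN` and the helper item `WS-LEN` are made up, and the size of a `COMP` item in particular can depend on the compiler and its options.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. USAGELEN.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-AMOUNT-DISP          PIC S9(7)V99.
       01  WS-AMOUNT-BIN           PIC S9(7)    COMP.
       01  WS-AMOUNT-PACK          PIC S9(7)V99 COMP-3.
       01  WS-LEN                  PIC 9(2).
       PROCEDURE DIVISION.
       MAIN-PARA.
      * One byte per digit; the sign shares the last digit's byte.
           MOVE FUNCTION LENGTH(WS-AMOUNT-DISP) TO WS-LEN
           DISPLAY 'DISPLAY BYTES: ' WS-LEN
      * A binary integer; 5 to 9 digits typically take 4 bytes.
           MOVE FUNCTION LENGTH(WS-AMOUNT-BIN) TO WS-LEN
           DISPLAY 'COMP BYTES:    ' WS-LEN
      * Two digits per byte plus a sign nibble: (9 + 1) / 2 = 5.
           MOVE FUNCTION LENGTH(WS-AMOUNT-PACK) TO WS-LEN
           DISPLAY 'COMP-3 BYTES:  ' WS-LEN
           STOP RUN.
```

On a typical compiler this reports 9, 4, and 5 bytes. The same nine-digit `PIC` therefore maps to three different record lengths, which is why `USAGE` has to be read together with `PIC` when you work out offsets in a file or dump.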
5.4 REDEFINES / OCCURS / COPY / FILLER
These four are the places where readers get stuck.
REDEFINES
REDEFINES is a mechanism for viewing the same storage area in a different shape. It is not a copy.[^7]
       01  REC-BUF.
           05  REC-TYPE            PIC X.
           05  REC-DATA            PIC X(99).
       01  HEADER-REC REDEFINES REC-BUF.
           05  HDR-TYPE            PIC X.
           05  HDR-DATE            PIC 9(8).
           05  FILLER              PIC X(91).
This is close to the feel of a union in C-family languages.
It often appears in the style of “interpret one 100-byte area as different record types.”
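The shared-storage behavior can be demonstrated with the same layout. The program name `REDEFDMO` and the sample values are assumptions for illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. REDEFDMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  REC-BUF.
           05  REC-TYPE            PIC X.
           05  REC-DATA            PIC X(99).
       01  HEADER-REC REDEFINES REC-BUF.
           05  HDR-TYPE            PIC X.
           05  HDR-DATE            PIC 9(8).
           05  FILLER              PIC X(91).
       PROCEDURE DIVISION.
       MAIN-PARA.
      * Write through the generic view...
           MOVE 'H20240131' TO REC-BUF
      * ...and read the same bytes through the header view.
           DISPLAY 'TYPE=' HDR-TYPE ' DATE=' HDR-DATE
      * A change through one view is visible through the other.
           MOVE 20250101 TO HDR-DATE
           DISPLAY 'REC-DATA STARTS WITH ' REC-DATA(1:8)
           STOP RUN.
```

This should print `TYPE=H DATE=20240131` followed by `REC-DATA STARTS WITH 20250101`. Nothing was copied between `REC-BUF` and `HEADER-REC`; both names describe the same 100 bytes.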
OCCURS
OCCURS is an array. In COBOL it tends to be called a table.
           05  WS-ITEM OCCURS 12 TIMES.
               10  WS-PRICE        PIC 9(5).
If you further encounter OCCURS DEPENDING ON, it is a variable-length table.
In that case it can affect the positions of the items that follow, so following it with a fixed-length mindset will make you lose your footing.[^8]
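For the fixed-size case, a minimal sketch of how such a table is usually walked looks like this. The enclosing group `WS-TABLE`, the counters, and the program name `OCCURSDM` are hypothetical additions around the `WS-ITEM` definition above.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. OCCURSDM.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-TABLE.
           05  WS-ITEM OCCURS 12 TIMES.
               10  WS-PRICE        PIC 9(5).
       01  WS-IDX                  PIC 9(2).
       01  WS-TOTAL                PIC 9(7) VALUE 0.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * Subscripts start at 1, not 0.
           PERFORM VARYING WS-IDX FROM 1 BY 1 UNTIL WS-IDX > 12
               COMPUTE WS-PRICE(WS-IDX) = WS-IDX * 100
               ADD WS-PRICE(WS-IDX) TO WS-TOTAL
           END-PERFORM
           DISPLAY 'ITEM 3 PRICE=' WS-PRICE(3)
           DISPLAY 'TOTAL=' WS-TOTAL
           STOP RUN.
```

This should print `ITEM 3 PRICE=00300` and `TOTAL=0007800`. Note that the subscript goes on the elementary item (`WS-PRICE(3)`), even though `OCCURS` was written on the group above it.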
COPY
COPY is a compile-time include.
In other words, the source you have open may not be the finished form yet.[^9]
           COPY CUSTOMER-REC.
           COPY ERROR-MAP.
It is entirely normal for record definitions, shared flags, host variables for SQL, and external interfaces to be stuffed into copybooks.
When heavy use of COPY makes the source hard to read, it is faster to check whether you can get at the expanded source or a compiler listing. IBM Enterprise COBOL even has an option called MDECK for writing out the input source after library processing.[^10]
FILLER
FILLER is an item with no name.
But “unreferenced, therefore meaningless” is wrong.
It routinely serves as:
- Reserved space
- A compatibility gap for an old specification
- Padding to match a record length
- Slack for a `REDEFINES`
FILLER merely lacks a name — it still exists as bytes. Forget this and your mapping against an external file drifts out of alignment one byte at a time.
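A small sketch makes the "still exists as bytes" point measurable. The record layout, the value `'SAMPLE'`, and the program name `FILLERDM` are invented for this example.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. FILLERDM.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  OUT-REC.
           05  OUT-ID              PIC 9(8).
           05  FILLER              PIC X(2).
           05  OUT-NAME            PIC X(20).
           05  FILLER              PIC X(10).
       01  WS-LEN                  PIC 9(3).
       PROCEDURE DIVISION.
       MAIN-PARA.
           MOVE SPACES   TO OUT-REC
           MOVE 42       TO OUT-ID
           MOVE 'SAMPLE' TO OUT-NAME
      * 8 + 2 + 20 + 10 = 40: the unnamed bytes count too.
           MOVE FUNCTION LENGTH(OUT-REC) TO WS-LEN
           DISPLAY 'RECORD LENGTH=' WS-LEN
      * OUT-NAME starts at byte 11, not byte 9.
           DISPLAY 'BYTES 11-16=' OUT-REC(11:6)
           STOP RUN.
```

This should print `RECORD LENGTH=040` and `BYTES 11-16=SAMPLE`. If you skipped the first `FILLER` while counting offsets by hand, you would look for the name at byte 9 and find two blanks instead.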
6. The Bare Minimum of the PROCEDURE DIVISION
If the DATA DIVISION is the map, the PROCEDURE DIVISION is the route you travel.
6.1 PERFORM
PERFORM is COBOL’s basic control transfer.
Roughly speaking, it means call a piece of processing and come back.[^11]
The forms you will see most often are these.
           PERFORM INIT-PROC
           PERFORM UNTIL EOF
               PERFORM READ-PROC
               IF NOT EOF
                   PERFORM EDIT-PROC
                   PERFORM WRITE-PROC
               END-IF
           END-PERFORM
PERFORM comes in two broad flavors.
- Out-of-line `PERFORM`, which names a paragraph or section
- Inline `PERFORM ... END-PERFORM`, which writes a block in place
In older code you will also routinely see range forms like PERFORM A-100 THRU A-199.
Convenient, but adding a paragraph in the middle easily drags it into the range by accident, so when reading, check carefully where the range ends.
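The range hazard is easiest to see in a tiny program. The paragraph `A-150` and the program name `THRUDEMO` are hypothetical; the scenario assumes `A-150` was added some time after the `PERFORM ... THRU` was written.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. THRUDEMO.
       PROCEDURE DIVISION.
       MAIN-PARA.
           PERFORM A-100 THRU A-199
           DISPLAY 'BACK IN MAIN'
           STOP RUN.
       A-100.
           DISPLAY 'A-100: VALIDATE'.
      * Added later; it now runs on every PERFORM A-100 THRU A-199
      * simply because it sits between the two paragraph names.
       A-150.
           DISPLAY 'A-150: NEW LOGIC'.
       A-199.
           EXIT.
```

This should print the `A-100` line, then the `A-150` line, then `BACK IN MAIN`. The range is defined by physical position in the source, not by any naming relationship, so the order of paragraphs in the file is part of the logic.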
6.2 IF / EVALUATE / Scope
For conditional branching, IF is the basic tool.
Thinking of EVALUATE as roughly a switch/case is mostly correct.
What you need to watch is how scopes end.[^12]
Code with explicit terminators such as `END-IF`, `END-PERFORM`, and `END-READ` is still the readable kind.
The problem is old code. In COBOL, `.` acts as an implicit scope terminator and closes all the still-open statements at once.[^12]
That means a single period changes:
- How far the `IF` extends
- How far the `PERFORM` extends
- Where the next sentence begins
Furthermore, NEXT SENTENCE is not the same as CONTINUE.
NEXT SENTENCE jumps to the point after the next period, so its destination shifts depending on where the following . happens to be.[^12]
When reading old COBOL, “watch the periods, not the line endings” is about the right calibration.
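Here is a minimal sketch of how one period changes the reach of an `IF`. The item `WS-QTY`, the messages, and the program name `PERIODDM` are made up; the indentation of the first `IF` is deliberately misleading, as it often is in old code.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. PERIODDM.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-QTY                  PIC 9(3) VALUE 5.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * The period after the first DISPLAY ends the IF, so the
      * second DISPLAY runs even though WS-QTY is not above 10.
           IF WS-QTY > 10
               DISPLAY 'LARGE ORDER'.
               DISPLAY 'OLD STYLE: DISCOUNT APPLIED'.
      * With END-IF the scope is explicit and nothing is printed.
           IF WS-QTY > 10
               DISPLAY 'LARGE ORDER'
               DISPLAY 'NEW STYLE: DISCOUNT APPLIED'
           END-IF
           STOP RUN.
```

The only line printed should be `OLD STYLE: DISCOUNT APPLIED`. Both blocks look the same at a glance; the difference is a single `.` at the end of one line.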
6.3 READ / WRITE / CALL
The frequent fliers in business COBOL are these.
- `READ`
- `WRITE`
- `REWRITE`
- `START`
- `CALL`
READ ... AT END ... in particular is the classic pattern.
           READ IN-FILE
               AT END
                   SET EOF TO TRUE
               NOT AT END
                   PERFORM PROCESS-REC
           END-READ
If there is a CALL 'SUBPGM' USING ..., control jumps to another program.
In that case, look at the callee’s LINKAGE SECTION and PROCEDURE DIVISION USING — the shape of the handoff becomes quite visible.
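The handoff between caller and callee can be sketched as follows. To keep it in one compilable unit, `SUBPGM` is written here as a nested program inside a hypothetical `CALLER`; in a real system the callee is usually a separately compiled program. The item names and the divide-by-ten "tax" are invented for illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CALLER.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-PRICE                PIC 9(5)   VALUE 1000.
       01  WS-TAX                  PIC 9(5)   VALUE 0.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * Arguments are matched by position and passed BY REFERENCE
      * by default: the callee works on the caller's storage.
           CALL 'SUBPGM' USING WS-PRICE WS-TAX
           DISPLAY 'TAX=' WS-TAX
           STOP RUN.

       IDENTIFICATION DIVISION.
       PROGRAM-ID. SUBPGM.
       DATA DIVISION.
       LINKAGE SECTION.
       01  LK-PRICE                PIC 9(5).
       01  LK-TAX                  PIC 9(5).
       PROCEDURE DIVISION USING LK-PRICE LK-TAX.
       SUB-MAIN.
           COMPUTE LK-TAX = LK-PRICE / 10
           GOBACK.
       END PROGRAM SUBPGM.
       END PROGRAM CALLER.
```

This should print `TAX=00100`. The names on the two sides do not have to match, only the order and the layouts do, so a mismatch between the caller's `PIC` and the callee's `LINKAGE SECTION` is a classic source of silent corruption.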
7. What Lives Outside COBOL
Quite often, COBOL’s world is not self-contained in the source.
- File definitions
- The execution environment
- DB connections
- The transaction environment
- Job control
all live outside it.
At minimum, grasping the following makes reading much easier.
Files and FILE STATUS
Read the FILE-CONTROL in the ENVIRONMENT DIVISION together with the FILE SECTION / FD in the DATA DIVISION: they come as a pair.[^13]
           SELECT IN-FILE ASSIGN TO ...
               FILE STATUS IS WS-FS.
       FD  IN-FILE.
       01  IN-REC.
           05  ...
If there is a FILE STATUS, it receives the result code after each I/O.
When reading file-related failures or EOF handling, you cannot even begin without looking at this.[^14]
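A minimal sketch of the usual pattern, checking the status item after each I/O, might look like this. The file name `'in-file.dat'`, `ORGANIZATION IS LINE SEQUENTIAL`, the 80-byte record, and the program name `FSTATUS` are assumptions that suit a PC or Unix compiler; on a mainframe the `ASSIGN` clause would normally name a DD from the JCL instead. The status values `00` (success), `10` (end of file), and `35` (file not found on `OPEN`) are standard.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. FSTATUS.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT IN-FILE ASSIGN TO 'in-file.dat'
               ORGANIZATION IS LINE SEQUENTIAL
               FILE STATUS IS WS-FS.
       DATA DIVISION.
       FILE SECTION.
       FD  IN-FILE.
       01  IN-REC                  PIC X(80).
       WORKING-STORAGE SECTION.
       01  WS-FS                   PIC XX.
           88  FS-OK               VALUE '00'.
           88  FS-EOF              VALUE '10'.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * A missing input file typically shows up here as status 35.
           OPEN INPUT IN-FILE
           IF NOT FS-OK
               DISPLAY 'OPEN FAILED, STATUS=' WS-FS
               STOP RUN
           END-IF
      * Each READ refreshes WS-FS; the loop ends on the first
      * status other than 00, which is normally 10 (end of file).
           PERFORM UNTIL NOT FS-OK
               READ IN-FILE
                   NOT AT END
                       DISPLAY 'RECORD: ' IN-REC
               END-READ
           END-PERFORM
           IF NOT FS-EOF
               DISPLAY 'READ FAILED, STATUS=' WS-FS
           END-IF
           CLOSE IN-FILE
           STOP RUN.
```

When you read existing code, the equivalent of `WS-FS` and its `88` levels tells you which I/O outcomes the original author actually handled and which ones fall through unnoticed.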
EXEC SQL
If this appears, it is embedded SQL.
           EXEC SQL
               SELECT ...
           END-EXEC.
In this case the COBOL is a “vessel for host variables,” and the actual selection criteria and update targets are on the SQL side.
So the shortcut is to read the contents of EXEC SQL as ordinary SQL.
EXEC CICS
If this appears, you are in a CICS transaction context.[^15]
           EXEC CICS
               RECEIVE MAP(...)
           END-EXEC.
At that instant, this stops being a plain batch-reading exercise. You need to read it together with the external context: screens, transactions, response codes, the COMMAREA, and so on.
JCL and Execution Definitions
In mainframe batch, it is not unusual for which datasets actually get allocated and in what order the jobs flow to live outside the COBOL source. When you look at the source alone and cannot tell “where is this file?”, it routinely turns out that the code is not at fault — you just have not widened your view far enough yet.
8. The Minimum Reading Order
When you suddenly have to read COBOL, the following order is the safe one.
1. Sweep up every `COPY`. Open the copybooks if you can. If not, look for a listing or the expanded source
2. Pick out the `01`-level record definitions. List the top-level items in the `FILE SECTION`, `WORKING-STORAGE`, and `LINKAGE SECTION`
3. Read the `PIC`s and `USAGE`s. Identify amounts, dates, counts, codes, flags
4. Search for `READ` / `WRITE` / `REWRITE` / `CALL` / `EXEC SQL` / `EXEC CICS`. Grasp the I/O and the external boundaries first
5. Follow only the first main path. Trace the chain of `PERFORM`s from the top of the `PROCEDURE DIVISION`
6. Look at the `88`s and status items. The meanings of EOF, success/failure, and type codes become much easier to read
7. Mark every `REDEFINES` / `OCCURS DEPENDING ON` / `COMP-3`. They will matter later without fail, so flag them as hazardous material up front
8. For files, look at the `FILE STATUS`. This eliminates a lot of misreadings around I/O errors
In this order, you avoid having to close-read the whole thing from the start. With COBOL, rather than trying to understand 100% from the beginning, it is far easier to nail down the three points — records, external boundaries, main path — and then go into the details.
9. Common Stumbling Points
Finally, here are the places where beginners get caught with very high probability.
Thinking REDEFINES is “a different variable”
It is not. It is the same storage area read in a different shape. Modify one side, and the other side’s view changes too.[^7]
Thinking 88 is “an independent bool”
It is not.
It is just a name attached to a value of the item it is defined under. Behind the scenes, SET WS-OK TO TRUE stores the corresponding value into that underlying item.[^3]
Ignoring COPY and reading only the main body
That is walking into the mountains with half the map still folded. It is entirely normal for field definitions, shared flags, and host variables to live wholesale outside the file.[^9]
Thinking MOVE is plain assignment
MOVE is not just a memcpy.
Depending on the receiving item’s type, it can involve conversion, digit alignment, zero filling, truncation, and editing/de-editing.[^move]
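A few of these effects can be seen in a short sketch. All item names and the program name `MOVEDEMO` are hypothetical; the brackets in the output are only there to make padding visible.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. MOVEDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-TEXT-8               PIC X(8)  VALUE 'ABCDEFGH'.
       01  WS-TEXT-5               PIC X(5).
       01  WS-NUM-6                PIC 9(6)  VALUE 123456.
       01  WS-NUM-4                PIC 9(4).
       01  WS-SMALL                PIC 9(2)  VALUE 12.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * Alphanumeric: left-justified, cut on the right -> ABCDE
           MOVE WS-TEXT-8 TO WS-TEXT-5
           DISPLAY '[' WS-TEXT-5 ']'
      * Numeric: aligned on the decimal point, high-order digits
      * are dropped -> 3456
           MOVE WS-NUM-6 TO WS-NUM-4
           DISPLAY '[' WS-NUM-4 ']'
      * Numeric: a shorter source is zero-filled on the left -> 0012
           MOVE WS-SMALL TO WS-NUM-4
           DISPLAY '[' WS-NUM-4 ']'
           STOP RUN.
```

This should print `[ABCDE]`, `[3456]`, and `[0012]`. Text loses its tail while numbers lose their head, and neither case raises an error at run time, so the `PIC` of the receiving item is what decides the result.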
Underestimating the effect of .
COBOL’s . is heavier than you imagine.
In old code with no explicit terminators, misjudging how much this period closes means misreading the control flow.[^12]
Thinking packed decimal or EBCDIC is “mojibake”
It is not necessarily broken. Quite often it simply was never a string to begin with, or is just not ASCII.[^4][^6]
Assuming what follows OCCURS DEPENDING ON sits at a fixed position
The items after a variable-length table can move position depending on the value. Read it with a fixed-length mindset and all your offset calculations go wrong.[^8]
10. Quick Reference: What to Think First
| Word you found | First thing to think |
|---|---|
| `01` | Top level of a record or group. Grasp the big picture from here |
| `88` | Named meaning of a flag or status code. The key to reading branches |
| `PIC X(...)` | Character item |
| `PIC 9(...)` / `S9(...)V...` | Numeric item. Check digit count and decimal position |
| `COMP` | Binary |
| `COMP-3` | Packed decimal. Likely an amount or a count |
| `REDEFINES` | The same area being reinterpreted differently |
| `OCCURS` | Array / table |
| `OCCURS DEPENDING ON` | Variable length. Watch the positions that follow too |
| `FILLER` | No name, but it has length |
| `COPY` | You cannot see the finished form without the copybook |
| `PERFORM` | The skeleton of the main path |
| `READ` / `WRITE` / `REWRITE` | File I/O |
| `EXEC SQL` | DB processing |
| `EXEC CICS` | Transaction processing |
| `FILE STATUS` | I/O result code |
11. Summary
COBOL is not hard because it is old. It is just that data definitions, external files, and the execution context are tightly intertwined, which makes the initial entry point hard to see.
To restate the minimum set for reading it:
- Grasp the map via the `DIVISION`s
- Read the `DATA DIVISION` first
- Read the shape of each item via `PIC` and `USAGE`
- Mark every `COMP-3`, `REDEFINES`, `OCCURS`, `88`, and `COPY`
- Follow `PERFORM`, `READ`, `WRITE`, and `CALL`
- Nail down the external boundaries via `FILE STATUS`, `EXEC SQL`, and `EXEC CICS`
- Do not underestimate how `.` behaves
Once you can see all this, COBOL turns from “mysterious ancient magic” into “a record-processing language.” Legacy technology is not scary because the name is old; it is just that picking the wrong scale for your first look suddenly makes it hard to understand. Get the map scale right, and it reads surprisingly normally.
12. References
The main sources referenced in this article.
[^1]: IBM, “Reference format” / IBM, “Area A or Area B” / Micro Focus, “Fixed Format”
[^2]: IBM, “Level-numbers”
[^3]: IBM, “Format 2: condition-name value”
[^4]: IBM, “Examples: numeric data and internal representation”
[^5]: IBM, “PACKED-DECIMAL (COMP-3)”
[^6]: IBM, “The EBCDIC character set” / IBM, “Handling differences in ASCII SBCS and EBCDIC SBCS characters”
[^7]: IBM, “REDEFINES clause”
[^8]: IBM, “OCCURS DEPENDING ON clause”
[^9]: IBM, “COPY statement”
[^11]: IBM, “PERFORM statement” / IBM, “Procedure division structure”
[^12]: IBM, “Scope terminators” / IBM, “Coding a choice of actions”
[^14]: IBM, “FILE STATUS clause” / IBM, “Using file status keys”
Author Profile
Go Komura
Representative of KomuraSoft LLC
Focused on Windows software development, technical consulting, and investigations into failures that are difficult to reproduce.