You are here

Evolution of a Computer Application - The Forth Language -- a Brief Introduction

John J. Wavrik

Forth is a language designed to create languages. It is easy to learn by understanding how it works! Unlike most conventional languages, its syntax is based on semantics: The emphasis is on what is done rather than on how it is said. The quickest way to introduce you to the process of programming is to examine and discuss some code, and that is what I will do in the remainder of the article. I start with some background information.


  1. The Dictionary
  2. Interpreting
  3. Seeing What Is on the Stack
  4. Compiling
  5. Terminology and Naming Convention
  6. Control Structures
  7. Data

a. The Dictionary

The Forth language is a collection of commands, which we may think of as words in a dictionary. The name for a command (which is called a "word") can be any collection of printable characters delimited by spaces. Each command has an associated action. The dictionary consists of the names and some stored representation of the action.

The process of programming in Forth consists of adding new words (and actions) to the dictionary. A glossary of standard Forth words used in this article is found in Appendix A. Individual words can be accessed using the bookmarks in Adobe Reader.

Forth maintains its dictionary in chronological (rather than alphabetical) order. The dictionary looks like a linked list: When a new word is defined, it is added at the end of the dictionary.


b. Interpreting

The Forth system has two modes. The first – interpreting – occurs when the system reads a line of input. When a word is encountered in the input stream, the dictionary is searched, starting from the most recent entry. If the word is found, its action is performed.  If a word is not found, the system tries to interpret it as a number.

Since a word's action is performed when the word is encountered, any parameters it needs for its action must be already available. Thus the words (or numbers) that are used as parameters must occur before the word in the input stream. Some means must be provided for storing these parameters – a stack is used for this.  By convention, most words act by removing their arguments from the stack and placing their results on the stack. (While numbers are not stored in the dictionary, they can be thought of as words whose action is to put themselves on the stack.)


3 2 +

This sequence (reading left to right) puts 3 on the stack, and then puts 2 on the stack. + is a word that takes two numbers from the stack, adds them, and puts the result (5 in this case) on the stack. Here are some examples with two operations:

3 2 + 7 *   leaves 35 on the stack
3 2 7 * +   leaves 17 on the stack
3 2 7 + *   leaves 27 on the stack

There is no hierarchy of operations or use of parentheses to postpone operations. The sequence 3 2 7 puts these numbers on the stack (with 7 on top). The operations + and * are performed on the top two numbers on the stack.

NOTE:  If you try these examples you will not see the results. The numbers on the stack are not usually displayed.  Read on to find out how to see numbers on the stack.

The stack is a central feature of Forth. It (rather than named variables) is the means by which words communicate with one another. In most languages data items must be named before they can be used. A typical Forth word takes some unnamed parameters from the stack, performs some operation on them, and leaves the result(s) on the stack. The action of a word is usually documented by a "stack diagram" that shows what the word expects on the stack before it acts and what it leaves after. Often the programmer adds a further note of explanation, but sometimes this is obvious from the name chosen for the word. The stack diagram for + or * is

<p class="rteindent1>(n1 n2 – n3)

which indicates that each expects two integers on the stack and leaves one integer. The names + and * indicate quite clearly what these words are intended to do.

c. Seeing what is on the stack

If you type 3 2 + (and press enter) you will not see anything!  The result is on the stack, which is not visible. There are two operations that you must know about for seeing results:

  ( n -  ) This is a dot (period). It prints the number on top of the stack (and removes it).

  ( -- ) This (dot S) prints the stack without removing anything. The numbers are printed horizontally with the top of the stack on the right.



  3 2 7 + .S [2] 3 9 ..
We started with 3, 2, and 7 on the stack, but the top two numbers were added, leaving 3 and 9.  In Win32Forth the number of elements on the stack is shown (by .S) in square brackets followed by the stack elements themselves.  Win32Forth also prints dots at the end of each line showing how many numbers are on the stack. In this case two dots indicate two numbers on the stack.

  * .S [1] 27 .
After multiplication, .S shows that there is only 1 number on the stack and it is 27.  In both cases .S does not remove numbers.

  . 27
Finally we type . (dot) and the 27 is printed and removed from the stack.

There are several words in the Forth language whose action is to rearrange the items on the stack. Here are the basic ones:

To see the number on top of the stack without removing it, you can use DUP. (DUP followed by dot). This will duplicate the number before printing it.

d. Compiling

Programming in Forth is equivalent to adding new words to the dictionary. Here is the main way new words are added. The Forth word : (colon) is used to start a definition -- it is followed by the name of the word to be added.  The definition is terminated by the Forth word ; (semicolon).  The action of the : (colon) is to create a new dictionary header and switch the system to compiling mode. The action of ; (semicolon) is to compile a termination word and to return the system to the interpreting mode. The other words in the definition describe a sequence of actions to be taken when the new word executes.  At the time of definition, these words do not carry out their action -- the action is stored in the new definition. (Note 1)


  : 2* DUP + ;

We have chosen to name the new word 2*.  The action of this word is to perform the action of DUP followed by the action of  +.  This adds the number on the stack to itself.

Compiling the action of component words results in a major speed increase without sacrificing the interactive nature of the language.  The most time-consuming step is the search of the dictionary.  The new word AddTest:

  : AddTest 3 2 + DROP ;

executes 10,000,000 times in 1 second.  The sequence of component words:

  3 2 + DROP

takes 600 seconds to evaluate 10,000,000 times. This increase of 600x in execution speed occurs because, in interpreting mode, the dictionary must be searched each time the sequence is executed. In the compiled word, the dictionary is searched only when AddTest is compiled. Compilation is incremental: It occurs only when a new word is added and affects only the new definition. (It is neither desirable nor necessary to recompile an entire system to add a new command. Note 2)  Indeed, new definitions can be made and compiled while in the midst of an interactive session. Execution speed for (compiled) Forth words is comparable to the execution speed for conventional compiled languages.

e. Terminology and Naming Convention

The words : and ; function as if they were called something like "DEFINE" and "END-DEF". (Note 3) The definition of 2* might seem clearer if we were to write

Make a new dictionary header with name 2*
   DUP +
body -- describe the action of the new word
end the definition

Forth tends to use punctuation marks as names for frequently performed operations.  Typing speed is not the only reason:  These punctuation marks are often used as part of a naming convention to indicate what new words do.  The words : and ; are often used by programmers to suggest opening and closing something: Win32Forth has words COMMENT: and COMMENT; that are used to open and close multi-line comments.

The word . (dot), used to print the number on the top of the stack, is often used as part of the name of a word that prints something. A Forth programmer who sees a word with a dot in it will automatically assume that this word is intended to print.  Later in this paper, for example, we will define a word "Set." that (as the naming convention suggests) is a word that prints a set.

It is extremely helpful, when writing Forth programs, to choose names well. A good name should suggest its action.  The use of programming conventions involving the punctuation marks is also very helpful.

f. Control Structures

Forth provides control structures, such as IF .. THEN, which are like those found in most languages.  IF and THEN are Forth words that are in the dictionary and, like other Forth words, they have associated actions. As noted already, most Forth words do not act during compilation -- they are, instead, compiled into the current definition.  Control structures change the flow of execution and are implemented by words that carry out their action during compilation. The dictionary header for a word contains both the name of the word and some extra information about it. A word can be tagged (a so-called "immediate" word) so that its action is performed immediately, regardless of the state of the system. Because these words make changes in a dictionary entry, they are used only during compilation.

The placement of words such as IF and THEN in sentences can be best understood if one knows their action. Let A, B, and C be collections of words.  We stipulate that A must leave a number on the stack -- zero is interpreted as FALSE, any non-zero number as TRUE.  We do not want to execute A, B, C,  in sequence -- we wish to execute B only if the result of A is TRUE.  A flow diagram for this behavior is

Before B there must be a test of the number on the stack.  If this number is 0 (FALSE), then we want to jump past B to the point right before C.  Our source code must indicate where the test takes place and the target of the jump.  The word IF compiles the conditional branching instruction, and the word THEN indicates the target.

Thus, in the definition of a new word, we write


This will be compiled into the dictionary entry of the word to look like this:

  code for A     ?BRANCH     target_addr     code for B     code for C  

The target address is right before the code for C.  IF compiles the conditional branching instruction and leaves space for the target address (which it, of course, does not know at the moment), so it leaves on the stack the location of the missing address.  The word THEN occurs after B and THEN knows where it is!  It also knows (because IF left the information) where it must fill in the address of where it is. So THEN fills in the proper target address -- and we are left with compiled code that looks like the diagram above.

Notice that IF and THEN do not themselves occur in the dictionary entry of the word currently being defined (although we sometimes talk as if they do).  Their job is to act during compilation. Their action is to leave behind code that will, when it executes, provide the proper control flow. 

The conditional branching instruction follows the convention of other Forth words: It removes the number tested (its argument) from the stack.     


    : ABS 
        DUP 0< IF  NEGATE  THEN  ;

This word has the stack diagram ( n - n' ), where n' is the absolute value of n. The word 0< has the stack diagram ( n - t/f ), and NEGATE has the diagram ( n - n' ) with n' = -n.

ABS negates a negative number but not a non-negative number. The definition can be written up with the stack comment and additional comments:

    : ABS  ( n - n' )
         \  n' is the negative of n
          DUP 0< IF  NEGATE  THEN  ;

Besides IF .. THEN, there are other control structures:


Forth does not have a separate program called "the compiler" -- the functionality of a conventional compiler is delegated to the action of the "immediate" words.

g. Data

The last thing that must be discussed in this brief introduction is how data are handled. The word : (colon) is called a defining word because it is used to add new words to the dictionary. There are other defining words that come with a Forth system, and Forth provides a mechanism for the user to make others. The defining words for data items create new words (children) in the dictionary and allocate memory for the data associated with those words. They also specify the run-time action of their children. The child word usually will put either the address of the data item or the item itself on the stack. For integer data, we use three defining words: CONSTANT, VALUE, and VARIABLE. Here is how each of these creates a new word in the dictionary:

Example When word is created When word is used
Create a constant xxx initialized to 5 xxx puts 5 on the stack. A constant cannot be changed once it is defined.
5 VALUE yyy
Create a value yyy initialized to 5 yyy puts 5 on the stack but 7 TO yyy will change the stored value to 7.
Create a variable zzz (some implementations initialize to 0) zzz puts the address of the storage on the stack. Words @ (fetch) and ! (store) can access and store data.


John J. Wavrik, "Evolution of a Computer Application - The Forth Language -- a Brief Introduction," Convergence (December 2004)


Journal of Online Mathematics and its Applications