| |||||||||
ALGOL 68 was conceived as a successor to the ALGOL 60 programming language, designed with the goal of a much wider scope of application and a more rigorously defined syntax and semantics. Long under-recognized (probably due to its mainly European origin), the contributions of ALGOL 68 to the whole field of computer science are deep and wide ranging, although many of them were not publicly identified until they were passed, in one form or another, to many subsequently developed programming languages.
Critics of ALGOL 68 point out that it abandoned the simplicity of ALGOL 60 and became a vehicle for the various complex ideas of its designers. The language also did little to make the compiler writer's task easy, in contrast to its deliberately simple contemporaries (and competitors) C and Pascal.
Today, the different dialects of the ML programming language family, such as OCAML, bear some resemblance to ALGOL 68, while being more advanced. The ALGOL 68 heritage is also acknowledged by Scheme, whose "Revised" (with an exponent) standards echo the "Revised Report" on ALGOL.
The elemental unit of an ALGOL 68 program is the block (or closed clause):
where the keywords begin and end may be replaced by ( and ), respectively:
The specific notation for keywords is not specified in ALGOL 68, to allow for different tastes and portability (some common options being to use all uppercase letters BEGIN, underlines begin, dot prefixes .begin, and bold type begin).
All variables need to be declared, but unlike most modern languages, the declaration does not have to appear prior to the first use. The basic datatypes (called modes in ALGOL 68 parlance) are real, int, compl (complex number), bool and char. For example:
However, the declaration real pi; is just syntactic sugar for ref real pi = loc real;. That is, pi is really the constant name for a reference to a newly generated local real variable.
Furthermore, instead of defining both real and double, or int and long and short, etc., ALGOL 68 provided modifiers, so that the presently common double would be written as long real instead, for example. Type queries of the kind of max real and min long int were provided to adapt programs to different implementations.
ALGOL 68 being an expression-oriented programming language, the value returned by an assignment statement is a reference to the destination. Thus, the following is valid ALGOL 68 code:
This notion is present in C and Perl, among others. Note that twice pi is a single identifier, i.e., blanks were permitted within ALGOL 68 names (effectively avoiding the underscores versus camel case versus all lowercase issues at once).
As another example, to express the mathematical idea of a sum of f(i) from i=1 to N, the following ALGOL 68 integer expression suffices:
Note that, being an integer expression, the former block of code can be used in any context where an integer value can be used. A block of code returns the value of the last expression it evaluated; this idea is present in Perl, among other languages.
Compound statements are all terminated by distinctive (if somewhat irreverent) closing brackets:
This scheme not only avoids the dangling else problem but also avoids having to use begin and end in embedded statement sequences. Furthermore, the symbol pattern if...then...else...fi may be replaced by the more compact ...|...|... (see example below). The idea of terminating a compound statement with a backwards version of the keyword that started it is continued in the Bourne shell and its descendants.
ALGOL 68 supports arrays with any number of dimensions, and it allows for the slicing of whole or partial rows or columns, among many other choices. AlGOL 68 supports multiple field structures (structs). Union types (modes) are supported. Reference variables may point to both array slices and structure fields. For an example of all this, here is the traditional linked list declaration:
Procedure (proc) declarations require type specifications for both the parameters and the result (void if none):
or, using the compact form of the conditional statement:
The return value of a proc is the value of the last statement executed before the procedure returns. References to procedures (ref proc) are also permitted. Call-by-reference parameters are provided by specifying references (such as ref real) in the formal argument list. The following example defines a procedure that applies a function (specified as a parameter) to each element of an array:
This simplicity of code was unachievable not only in ALGOL 68's predecessor (ALGOL 60) but neither in many of its "successors" (Algol W, Euclid, Pascal, Modula, Modula-2, et al.) In fact, in several of these languages, the very intention of the previous piece of code is simply unexpressible. It is almost possible (no automatic determination of lower and upper array size bounds) in the C but at the expense of a more complex and cryptic syntax for function references, and without the type-safety of the ALGOL 68 example.
The programmer may define new operators and both those and the pre-defined ones may be overloaded. The following example defines operator max with both dyadic and monadic versions (scanning across the elements of an array).
The first declaration sets the priority of <code>max to correspond to that of the proc max of real above.
Transput is the term used to refer to ALGOL 68's input and output facilities. There are pre-defined procedures for unformatted, formatted and binary transput. Files and other transput devices are handled in a consistent and machine-independent manner. The following example prints out some unformatted output to the standard output device:
Note the predefined constants newpage and newline.
This sample program implements the Sieve of Eratosthenes to find all the prime numbers that are less than 100. The variable nil is the null pointer, as in Pascal, or a pointer with a value of zero in C. The notation x of y accesses a member x of a struct or union y.
Compared with other languages of the time, ALGOL 68 presents several difficulties to anyone who wishes to write a compiler for it. The first is the relatively large size of its grammar, which means that the parser and semantic analyser are likely to be large as well. The second is that there is no "declaration before use" rule: although all identifiers must be declared, the declaration does not have to precede the first use. This is particularly a problem with operator declarations, because ALGOL 68 allows the programmer to change the priority of operators as well as overloading them with new meanings. This means that a statement like:
might have its conventional meaning of:
or it might mean:
and the compiler would not know this for sure until it reached the end of the input program. Indeed, if the declarations of the variables had not already been seen, the compiler would not know what code would be required to implement the "*" or "+" operators, or whether those operators were even defined between the modes of b, c and d. It can be argued that such practices have little or no use outside an obfuscated code contest, but there is nothing in the language specification that would prevent them. Although some modern languages allow the programmer to overload operators, they generally do not permit the operators' priorities to be changed, precisely because of this kind of problem.
One solution to the problem of unknown operator priorities is to parse the program twice according to the same grammar. In the first pass, the compiler assumes that all operators have the same priority. This is almost certainly not true, but it allows the compiler to find all the declarations and determine the program's block structure, which establishes the declarations' scopes. The second pass uses the information gathered in the first pass to assign the correct structure to all the expressions.
Many ALGOL 68 compilers work around this problem by requiring declaration before use. Although it could be argued that such a restriction makes some programs less elegant, it does not change those programs' meanings. The exception is mutually recursive modes, where the definition of some mode a contains a ref to an instance of some other mode b, and the definition of b contains a ref to an instance of a. The solution is either to forbid mutually recursive modes (the compilers which do this often have other restrictions which, so the author(s) would have us believe, make such modes impractical anyway), or to allow a mode to be declared incompletely. That is, the programmer is allowed to write just
to tell the compiler that m exists, without saying anything more about it for the time being. The expectation is that the full definition of m will appear later, after definitions of other modes that contain ref m.