Operators

In the preamble to this chapter, it was mentioned that there are two kinds of routine: procedures and operators. See section 6.3 for details of procedures.

An operator has a mode and a value (its routine denotation) and, if dyadic, a priority. The parameters to routines which are defined as operators are called operands. A monadic operator takes one operand and yields a value of some mode. It has no priority, but behaves as though it had a priority greater than that of any dyadic operator.

Here is an identity declaration of the monadic operator B:

   OP(INT)INT B = (INT a)INT: a

There are several points to note.

  1. The mode of the operator is OP(INT)INT. That is, it takes a single operand of mode INT and yields a value of mode INT.
  2. The symbol for the operator looks like a mode indicant. It isn't a mode indicant, but obeys the same rules (starts with an uppercase letter and possibly continues with uppercase letters or digits or underscores, and no spaces are allowed inside the symbol).
  3. The right-hand side of the identity declaration is a routine denotation. A special identity declaration is used for operators: instead of the mode of the operator, the mode constructor OP is used followed by the operator symbol. The abbreviated declaration of the operator B is

            OP B = (INT a)INT: a
    

Chapter 2 described how operators are used in formulæ. A possible formula using B could be

   B 2

which would yield 2.
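
As a small sketch of how a monadic operator behaves inside a larger formula (the operator symbol SQ is invented here for illustration and is not part of the standard prelude):

   OP SQ = (INT a)INT: a*a;
   INT n = SQ 3 + 1      # SQ is applied first, so this is (SQ 3)+1 and n is 10 #

Because a monadic operator behaves as though its priority were higher than that of any dyadic operator, SQ is applied to 3 before the addition takes place. To square the sum instead, the operand would have to be enclosed in parentheses: SQ (3 + 1).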

Identification of operators

This section is more difficult than preceding sections and could be omitted on a first reading. You are unlikely to fall afoul of what is described here unless you are declaring many new operators.

One of the most useful properties of operators is that there can be more than one declaration of the same operator symbol, each taking an operand of a different mode. This is called “operator overloading”. How does the compiler know which version of the operator to use? Before answering this question, consider the following program fragment:

    1 BEGIN
    2    OP D = (INT a)INT:   a+2;
    3    OP D = (REAL a)REAL: a+2.0;
    4    REAL x:=1.5, a:=-2.0; INT i:=4;
    5
    6    x:=IF  OP D = (REF REAL a)REF REAL:
    7                      a+:=2.0;
    8           ENTIER(D a:=x) > i
    9       THEN D i
   10       ELSE D x
   11       FI;
   12
   13    OP D = (REF REAL a)REF REAL:  a+:=3.0;
   14    x:=D a
   15 END

The numbers on the left-hand side are not part of the program. As you can see, there are four declarations of D: one with an INT operand, one with a REAL operand and two with a REF REAL operand. If you try compiling this you will get the error

        ERROR (146) more than one version of D

for the last declaration. There are two points to be made here.

  1. Outwith the conditional clause, there are three declarations of D: on lines 2, 3 and 13. Now, an operator is used in a formula and the context of the operand of an operator is firm. Of the coercions we have met so far, only one, namely dereferencing, is allowed in a firm context. If you look at the assignment on line 14, you can see that the mode of the operand of D is REF REAL (from the declaration of a on line 4). Now a value of mode REF REAL is firmly coercible to REAL (by dereferencing). So there are two declarations of D which could be used: the declaration on line 3 and the declaration on line 13 (the range of the declaration on line 6 is confined to the conditional clause). According to the rules for the identification of operators (see below), the compiler would not be able to distinguish between the two declarations. Hence the error message.
  2. Why did the identical declaration of D on line 6 not cause a similar error message? Answer: because the declaration on line 6 is at the start of a new range: the enclosed clause starting on line 6 and extending to the FI on line 11. Since that is a new range, any operator declaration whose mode is firmly related to the mode of an operator declared in an outer range makes the declaration in the outer range inaccessible. Thus, the assignment on line 8 will use the version of D declared on line 6, the D on line 9 identifies the D declared on line 2, and the D on line 10 again uses the D declared on line 6.

Thus, in determining which operator to use, the compiler firstly finds a declaration whose mode can be obtained from the operands in question using any of the coercions allowed in a firm context (chapter 10 will state all the coercions allowed). Secondly, it will use the declaration in the smallest range enclosing the formula.
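
The two rules can be seen at work in the following sketch. The operator symbol KIND and the results 1, 2 and 3 are invented purely so that each version can be told apart; the fragment assumes the identification rules just described.

   BEGIN
      OP KIND = (INT a)INT:  1;
      OP KIND = (REAL a)INT: 2;
      INT i:=4; REAL x:=1.5;

      # Outer range: i and x are dereferenced, selecting versions 1 and 2 #
      print((KIND i, KIND x, newline));

      BEGIN
         # (REF REAL) is firmly related to (REAL), so this hides the REAL version #
         OP KIND = (REF REAL a)INT: 3;

         # i still finds the INT version (1); x now finds the REF REAL version (3) #
         print((KIND i, KIND x, newline))
      END
   END

In the inner range, the operand x needs no coercion at all to match the REF REAL version, and that declaration lies in the smallest range enclosing the formula, so it is chosen in preference to the outer REAL version.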

The declaration of an object is known as its defining occurrence. Where the object is used is called its applied occurrence. In practice, it is rare to find like operator declarations in nested ranges.


Exercises

6.10
This and the following exercise use the following program fragment:
    1 IF
    2    OP T = (INT a)INT:  a*a;
    3    OP T = (CHAR a)INT: ABS a * ABS a;
    4    INT p:=3, q:=4; CHAR c:=REPR 3;
    5    T p < T c
    6 THEN
    7    OP T = (REF INT a)REF INT: a*:=a;
    8    IF T 4 < T q
    9    THEN "Yes"
   10    ELSE T REPR 2
   11    FI
   12 ELSE T c > T q
   13 FI
There are 3 defining occurrences of the operator T, on lines 2, 3 and 7. There are 7 applied occurrences of the operator (on lines 5, 8, 10 and 12). Which defining occurrence is used for each applied occurrence?
6.11
What are the mode and value yielded by
(a) T q on line 8
(b) T q on line 12
(c) T c on line 12
(d) T REPR 2 on line 10

6.12
What is wrong with these two declarations occurring in the same range:
   OP TT = ([]INT a)[]INT:
           FOR i FROM LWB a TO UPB a
           DO print(a[i]*3) OD;
   OP TT = (REF[]INT a)[]INT:
           FOR i FROM LWB a TO UPB a
           DO print(a[i]*3) OD


Operator usage

Before we go on to dyadic operators, there is one more point to consider. Given the operator declaration

   OP PLUS2 = (REAL a)REAL:  a+2.0

what is the mechanism by which the formal parameter gets its value? Firstly, we must remember that a particular version of the operator is chosen on the basis of firmly related modes. In other words, only coercions allowed in a firm context can determine which declaration of the operator to use. Secondly, in elaborating the formula

   PLUS2 x

where x has the mode REF REAL, the compiler elaborates the identity declaration

   REAL a = x

where REAL a is the formal parameter. Since the context of the right-hand side of an identity declaration is strong, any of the strong coercions would normally be allowed (all coercions, including dereferencing). However, because the version of the operator was chosen on the basis of firmly related modes, the coercions available in a strong context which are not available in a firm context (that is, widening and rowing) are not available in the context of an operand. If an operand of mode INT is supplied to an operator requiring a REAL, the compiler will flag an error: widening would not occur. This is the only exception to the rule that the right-hand side of an identity declaration is a strong context.
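
A short sketch of the consequence, reusing the PLUS2 declaration above (the identifiers x, i, wide, r and s are illustrative only):

   OP PLUS2 = (REAL a)REAL: a+2.0;
   REAL x:=1.5; INT i:=4;
   REAL r = PLUS2 x;      # the REF REAL operand is dereferenced to REAL; r is 3.5 #
   # PLUS2 i would be rejected: i dereferences to INT, which is not widened for an operand #
   REAL wide = i;         # strong context: i is dereferenced and then widened to 4.0 #
   REAL s = PLUS2 wide    # s is 6.0 #

Performing the widening in an ordinary identity declaration first is one way round the restriction. Another is to overload PLUS2 with a second declaration taking an INT operand.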

It was pointed out in section 6.1.5 that a routine can yield a name. An operator does not usually yield a name because subsequent use of the name usually involves dereferencing and using the value the name refers to. However, here is an operator declaration which yields a name of a multiple which is used in a subsequent phrase:

   OP NAME = (INT a)REF[]INT:
               (HEAP[2]INT x:=(a,a);  x);
   REF[]INT a = NAME 3

After the elaboration of the identity declaration, the name could be accessed wherever necessary.


Exercises

6.13
Given the declarations
   OP M3 = (INT i)INT:  i-3;
   OP M3 = ([]INT i)[]INT:
           FORALL n IN i DO n-3 OD;
   INT i:=1,[3]INT j:=(1,2,3)
which operator declarations would be used for the following formulæ?
(a) M3 i
(b) M3 j[2]
(c) M3 j
(d) M3 j[:2]


Dyadic operators

The only differences between monadic and dyadic operators are that the latter have a priority and take two operands. Therefore the routine denotation used for a dyadic operator has two formal parameters. The priority of a dyadic operator is declared using the indicant PRIO:

   PRIO HMEAN = 7; PRIO WHMEAN = 6

The declaration of the priority of the operator uses an integer denotation in the range 1 to 9 on the right-hand side.

Consecutive priority declarations do not need to repeat the PRIO, but can be abbreviated in the usual way. The priority declaration relates to the operator symbol. Hence the same operator cannot have two different priorities in the same range, but there is no reason why an operator cannot have different priorities in different ranges. A priority declaration does not count as a declaration when determining the scope of a local name.

If an existing operator symbol is used in a new declaration, the priority of the new operator must be the same as the old if it is in the same range, so the priority declaration should be omitted.

The identification of dyadic operators proceeds exactly as for monadic operators except that the most recently declared priority in the same range is used to determine the order of elaboration of operators in a formula. Again, two operators using the same symbol cannot be declared in the same range if they have firmly related modes (see section 6.2.1).
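
As a minimal sketch of how a priority declaration and overloading fit together (the operator AVG is invented for illustration):

   PRIO AVG = 7;
   OP AVG = (REAL a,b)REAL: (a+b)/2.0;
   OP AVG = (INT a,b)REAL:  (a+b)/2;   # same symbol in the same range: no second PRIO #
   REAL r = 1.0 + 3.0 AVG 5.0;         # AVG (priority 7) is elaborated before + (priority 6) #
   REAL s = 2 AVG 4                    # INT operands select the second declaration #

Here r is 1.0 + 4.0, that is 5.0, and s is 3.0. The two declarations of AVG can share a range because the modes INT and REAL are not firmly related.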

These declarations apply to the remainder of this section:

   PRIO HMEAN = 7, WHMEAN = 6;
   OP HMEAN  = (REAL a,b)REAL:
                  2.0/(1.0/a+1.0/b);
   OP WHMEAN = (REAL a,b)REAL:
                  2.0/(1.0/a+2.0/b)

If HMEAN appears in the formula

   x HMEAN y

where x and y both have mode REF REAL, the compiler constructs the identity declarations

   REAL a = x, REAL b = y

Notice that the two identity declarations are elaborated collaterally (due to the comma separating them), which could be important (see below). If x refers to 2.5 and y refers to 3.5, the formula will yield

   2.0/(1.0/2.5 + 1.0/3.5)

which is 2.91666... (the 6 recurs). Likewise, the formula

   x WHMEAN y

would yield 2.058 823 529 411 76. Now consider the formula

   (x+:=1.0) WHMEAN (x+:=1.0)

which causes the value referred to by x to be incremented twice as a side-effect. The resulting identity declarations are

   REAL a = (x+:=1.0), REAL b = (x+:=1.0)

The definition of Algol 68 says that the operands of a dyadic operator should be elaborated collaterally, so the order of elaboration is unknown. Suppose x refers to 1.0 before the formula is elaborated. There are three cases:

  1. The identity declaration for a is elaborated first, giving a=2.0 and b=3.0. The formula will yield 1.714 285 714.
  2. The identity declaration for b is elaborated first, giving b=2.0 and a=3.0. The formula will yield 1.5.
  3. The identity declarations are elaborated in parallel. In this case, the result could be indeterminate.

If you compile a program using a68toc with the declaration for WHMEAN and try to compute the formula given above, you get the result +1.5000000000000000 which suggests that case 2 holds.

If x refers to 1.0, then the formula

   1.0/(x+:=1.0) + 1.0/(x+:=1.0)

yields +.83333333333333339e +0 which is correct provided that the two operands are elaborated sequentially. The moral of all this is: avoid side-effects like the plague.
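
One way to remove the ambiguity is to perform each assignment as a separate phrase, so that the semicolons fix the order of elaboration. The following is a sketch only, assuming the declarations of WHMEAN given earlier in this section and using the illustrative identifiers first, second and r:

   REAL x:=1.0;
   x+:=1.0; REAL first  = x;      # x refers to 2.0; first is 2.0 #
   x+:=1.0; REAL second = x;      # x refers to 3.0; second is 3.0 #
   REAL r = first WHMEAN second   # 2.0/(1.0/2.0+2.0/3.0) #

Since neither operand of WHMEAN now has a side-effect, the formula yields the value of case 1 above (approximately 1.714 285 714) whatever order the compiler chooses for elaborating the operands.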

What happens if the identifier of an actual parameter is the same as the identifier of the formal parameter? There is no clash. Consider the identity declaration

   INT a = a

where the a on the left-hand side is the formal parameter for a routine denotation, and the a on the right-hand side is an actual parameter declared in some surrounding range. The formal parameter occurs at the start of a new range. Within that range, the identifier a in the outer range becomes inaccessible, but at the moment that the identity declaration is being elaborated, the formal parameter is made to identify the value of the actual parameter which, of course, is not an identifier. So go ahead and use identical identifiers for formal parameters and actual parameters.
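
A brief sketch of the point, using the invented operator TWICE; the identifier a is deliberately used both for the formal parameter and for the operand:

   OP TWICE = (INT a)INT: 2*a;
   INT a = 21;
   INT b = TWICE a      # the formal parameter a identifies the value 21; b is 42 #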

Operator symbols

Most of the operators described in chapters 2 to 5 used symbols rather than upper-case letters. You may use any combination of the <=>*/: symbols (and any number of them) except :=, :=: and :/=: (the latter two are described in chapter 11). Each of the symbols +-?&% may appear only as the first character of a compound symbol, although each can also stand on its own as an operator. In chapter 11, you will find the << and >> operators described as well as more declarations for existing operators. Here are some declarations of operators using the above rules:

   OP *** = (INT a)INT: a*a*a;
   OP %< = (CHAR c)CHAR: (c<" "|" "|c);
   OP -:: = (CHAR c)INT: (ABS c-ABS"0")

We have now covered everything about operators in the language.


Exercises

6.14
Why are side-effects undesirable?
6.15
6.16
6.17
What is wrong with these operator symbols:
(a) M*
(b) %+/
(c) :=:

6.18
Declare an operator using the symbol PP which will add 1 to the value its REF INT operand refers to, and which will yield the name of its parameter.


Sian Mountbatten 2012-01-19