This document was made by
OCR from a scan of the technical report. It has not been edited or proofread
and is not meant for human consumption, but only for search engines. To see the
scanned original, replace OCR.htm with Abstract.htm or Abstract.html in the URL
that got you here.
Mesa Language Manual
by James
G. Mitchell William
Maybury Richard Sweet
Version
5.0 April 1979
CSL-79.3
The Mesa language is one component of a
programming system intended for developing and maintaining a
wide range of systems and applications programs. Mesa supports the development
of systems composed of separate modules with controlled sharing of information among
them. The language includes facilities for user-defined data types, strong
compile-time checking of both types and interfaces, procedure and coroutine
control mechanisms, and control structures for dealing with
concurrency and exceptional conditions.
XEROX
PALO ALTO RESEARCH CENTER SYSTEMS DEVELOPMENT DEPARTMENT
3333
Coyote Hill Road / Palo Alto / California 94304
© Copyright 1979 by Xerox Corporation
CONTENTS
CHAPTER 1. INTRODUCTION
1.1. Syntax notation
CHAPTER 2. BASIC DATA TYPES AND EXPRESSIONS
2.1.
A slice of Mesa code
2.1.1. Basic lexical structure 2.2.
Simple declarations
2.3. The fundamental operations. 4-, and #
2.4.
Basic types
2.4.1.
The numeric types INTEGER
and CARDINAL
2.4.1.1. Numeric literals 2.4.2. Type BOOLEAN
2.4.3.
Type CHARACTER
2.4.4.
The numeric types LONG
INTEGER and LONG CARDINAL 2.4.5. Type REAL
2.4.6. Relations among. basic types 2.5.
Expressions
2.5.1.
Numeric operators
2.5.1.1.
Domains of the numeric operators
2.5.1.2.
The operator LONG
2.5.1.3. CHARACTER operators 2.5.2. Relational operators
2.5.3. BOOLEAN operators 2.5.4. Assignment
expressions 2.5.5. Operator precedence
2.6. Initializing variables in declarations
2.6.1. Compile-time constants
2.7. More general declarations
CHAPTER 3. COMMON CONSTRUCTED DATA TYPES
3.1. The element types
3.1.1.
Enumerated types 3.1.2. Subrange types
3.1.2.1.
Subranges of numeric types
3.1.2.2. Range assertions 3.2. Arrays
3.2.1. Declaration of arrays 3.2.2.
Array constructors 3.3. Records
3.3.1.
Field lists
3.3.2.
Declaration of records 3.3.3. Qualified references 3.3.4. Record
Constructors 3.3.5. Default field values 3.3.6.
Extractors
3.4.
Pointers
3.4.1.
Constructing pointer types 3.4.2. Pointer operations 3.4.3.
Long Pointers
3.4.4. Automatic derefcrencing
3.5. Type determination
3.5.1.
Type conversion
3.5.2.
Balancing
3.5.3.
Free conformance
3.6. Determination of representation
CHAPTER 4. ORDINARY STATEMENTS
4.1. Assignment
statements
4.1.1.
Assignment expressions
4.2. IF statements
4.2.1. IF expressions 4.3. SELECT statements
4.3.1. Forms and options for SELECT
4.3.2.
The NULL statement
4.3.3. SELECT expressions
4.4. Blocks
4.4.1.
GOTO statements
4.4.2. OPEN clauses
4.5. Loop statements 4.5.1. Loop
control
4.5.2. GOTOS, LOOPS, EXITS, and loops
CHAPTER 5. PROCEDURES
5.1. Procedure
types
5.1.1.
Procedure values and compatibility
5.2.
Procedure calls
5.2.1. Arguments and parameters
5.2.2.
Termination and results
5.3.
Procedure bodies
5.3.1.
RETURN statements
5.4. A
package of procedures
5.4.1. The example
5.4.2.
Invoking procedures in other modules
5.5. Nested
procedures
5.5.1.
Scopes defined by procedures
5.6. Inline
procedures
CHAPTER 6. STRINGS, ARRAY DESCRIPTORS, RELATIVE POINTERS, AND VARIANT RECORDS
6.1. Strings
6.1.1.
Strine literals and string expressions
6.1.2.
Declaring strings
6.1.3.
Long strings
6.2. Array
descriptors
6.2.1.
Array descriptor types 6.2.2. Long descriptors
6.3.
Base and relative pointers
6.3.1.
Syntax for base and relative pointers
6.3.2.
A relative pointer example 6.3.3. Relative pointer types 6.3.4.
Relative array descriptors
6.4. Variant
records
6.4.1.
Declaring variant records 6.4.2. Bound variant types
6.4.3. Accessing entire variant parts. and '
ariant constructors 6.4.4. Accessing components of
variants
CHAPTER 7. MODULES, PROGRAMS, AND CONFIGURATIONS 7.1.
Interfaces
7.2.
The fundamentals of !lesa modules
7.1.1.
Including modules: the DIRECTORY
clause
7.2.1.1. Enumerating items from an included module: the USING clause
7.1.1. Accessing items from an included module
7.2.2.1.
Qualification
7.2.1.2.
OPEN clauses
7.2.3. Scopes for identifiers in a module
7.2.4.
Implications of recompiling included modules
7.3.
DEFINITIONS modules
7.3.1.
Interface variables
7.3.2.
Default fields in interfaces
7.3.3. lnline procedures in interfaces
7.3.4.
Usage hints for inline procedures in interfaces
7.4.
PROGRAM modules:
IMPORTS and
EXPORTS
7.4.1. IMPORTS, interface types, and
interface records
7.4.2.
Importing program modules
7.4.3.
Exporting interfaces and program modules
7.4.4.
IMPORTS in DEFINITIONS
modules
7.5. Controlling module interfaces: PUBLIC and
PRIVATE
7.5.1.
Access attributes in declarations
7.5.1.1.
Declared names
7.5.1.2. Names specified in field lists
7.5.1.3. Names for variant parts and for tags in variant
records 7.5.2. Access attributes in TYPE definitions
7.5.3. Default global access
7.5.4. Accessing the PRIVATE predefined symbols of other
modules 7.6. The Mesa configuration language. an introductory example 7.6.1.
Lexicon: a module implementing LexiconDefs
7.6.2.
LexiconClient: a client module
7.6.3. Binding, loading, and running a configuration: an
overview 7.6.4. A configuration description for running
LexiconClient 7.7. C/111esa: syntax and semantics
7.7.1.
IMPORTS, EXPORTS, and
DIRECTORY in
C/Mesa
7.7.2.
Explicit naming. IMPORTS,
and EXPORTS
7.7.3.
Default names for interfaces and instances
7.7.4. Multiple exported interfaces from a single
component 7.7.5. Multiple components implementing a single interface
7.7.6. Nested (local) configurations
7.8.
Loading modules and configurations: NEW and START
7.8.1.
The NEW operation
for making copies of modules 7.8.2. How the loader binds
interfaces
7.8.3.
STARTing, sToPping,
and RESTARTing module
instances 7.8.4. Loading
and starting configurations
CHAPTER 8. SIGNALLING AND SIGNAL DATA TYPES 8.1.
Declaring and generating SIGNALS and ERRORS 8.1.1. ERROR in expressions
8.2.
Control of generated signals
8.2.1. Preparing to catch signals: catch phrases
8.2.2. The scope of variables in catch phrases 8.2.3.
Catching signals
8.2.4. RETRY and CONTINUE in catch phrases 8.2.5.
Resuming from a catch phrase: RESUME 8.3. Signals within signals
CHAPTER
9. PORTS AND CONTROL STRUCTURES
9.1. Syntax and an example of PORTS
9.2. Creating and starting coroutines
9.2.1. The CONNECT statement
9.2.2. Low-level actions for a PORT call 9.2.3.
Control faults and linkage faults 9.2.4. Saving arguments
during faults
9.3.
RESPONDING PORTS
CHAPTER 10. PROCESSES
AND CONCURRENCY
10.1.
Concurrent execution. FORK
and JOIN
10.1.1. A process example
10.1.2.
Process language constructs
10.2.
Monitors
10.2.1.
An overview of monitors 10.2.2. Monitor locks
10.2.3.
Declaring monitor modules, ENTRY and INTERNAL procedures 10.2.4. Interfaces to
monitors
10.2.5. Interactions of processes and monitors
10.3.
Condition variables
10.3.1. Wait, notify, and broadcast
10.3.2.
Timeouts
10.4. 1Alore about monitors 10.4.1.
The LOCKS clause
10.4.2. Monitored records
10.4.3.
Monitors and module instances
10.4.4.
Multi-module monitors 10.4.5. Object monitors
10.4.6.
Explicit declaration of monitor locks
10.4.7. lnline ENTRY procedures 10.5. Signals
10.5.1. Signals and processes 10.5.2.
Signals and monitors 10.6. Initialization
APPENDICES
A. Pronouncing Mesa
B. Programming
Conventions
B.1.
Names
B.2.
Layout B.3. Spaces
C. Alto/Mesa Machine Dependencies
C.1. Numeric
limits
C.2. AltoDefs
C.3. ASCII
character set and ordering of character values
C.4. Alto/Mesa
STRING procedures
D. Binder Extensions
Di Code packing 1).1.1. Syntax 1).1.2.
Restrictions
112. External links 1/2.1. Syntax 112.2.
Restrictions
E. Mesa
Reserved Words F. Collected Grammar
INDEX
Preface
This
document describes the Mesa programming language. Its
approach is tutorial, and it is intended to be read
somewhat as a textbook. It is neither a user's guide nor a reference manual.
Elements of
Mesa Style, by James Morris. is a recommended supplement to
this manual. The style manual contains several examples of
well-constructed Mesa programs, with explanations of their development
and commentary on using the language properly. Its purpose is to
provide assistance in using the features of Mesa to write programs
that work reliably and are easily maintained.
Programmers. should also read the Mesa Users Handbook, which
provides an introduction to the use of the Mesa system and a
guide to other documentation. The Mesa
System Documentation and Mesa
Debugger Documentation describe facilities available in the Alto
implementation of the Mesa programming system. These include input and
output, which are done procedurally and are not built into the
language.
Suggestions, corrections and criticisms concerning the
style and content of this manual are encouraged and
should be sent to your support group.
Acknowledgements
The
Mesa language was designed and first implemented by the Computer Science
Laboratory of the Xerox Palo Alto Research Center. The principal
participants were Butler Lampson, James Mitchell,
Edwin Satterthwaite, Charles Geschke, and Richard
Sweet. Subsequent
development and
maintenance
of the language and compiler have been done by the Systems Development
Department of Xerox. In addition to the original participants, John Wick and
Richard Johnsson have made major contributions.
The original version of this manual was written by
William Maybury and edited by James Mitchell. To reflect changes
in the language, many sections have since been added or revised. James Mitchell
and Richard Sweet have served as editors, with additional
contributions by Edwin Satterthwaite, John Wick and Richard
Johnsson. David Redell wrote most of Chapter 10.
Vicki Parish, Gail Pilkington, Janet Farness, and Ode
Binkley helped greatly with manuscript preparation and formatting:
Gail Pilkington, James Sandman, Barbara Koalkin, and Bruce Malasky assisted
in proofreading and indexing.
CHAPTER
1.
INTRODUCTION
This manual concentrates on the Mesa programming
language. Mesa is really a programming system of which the
language is but one part. Other components of the system are documented
separately, as are the details of preparing, compiling, debugging and
running Mesa programs.
Each chapter of this manual discusses some aspect of the
language, using examples as well as descriptions of semantics
and syntax. The chapters emphasize different language features and provide
different levels of detail. The complete treatment of some features requires
more than one chapter. Generally, earlier chapters introduce
topics, and later ones supply additional detail. Titles of
chapters, sections and subsections indicate the language issues with which they
deal.
In each major section, information is presented
at three levels:
(1)
Ordinary usage (motivation, forms and
semantics), frequently with examples.
(2)
Syntax equations (when appropriate).
(3) Fine points (if
applicable): restrictions, special cases, references to later material, precise
semantics, etc.
Level (1) is intended to offer a basic understanding of
Mesa. Reading only first level material should be adequate to begin
programming in the language. Levels (2) and (3) supply more detail and
provide information about the full power of Mesa.
As a rule. these levels of discourse occur separately and
in the indicated order. A section with a heading followed by an
asterisk (*) deals with specialized material that can be skimmed or skipped entirely
on first reading. Occasionally, fine points or syntactic details are presented
within first-level material. The reader will be able to distinguish
between levels by their appearance. Fine points are written
in a small font. like this. Syntax
equations and syntactic categories appear in the following font: FontForSyntax.
Any italicized word or phrase is important. If
a Mesa technical term is being introduced, it will be in italics:
if a term is used before being defined, it will be italicized to warn the
reader that it should not be taken lightly and that it has a particular
meaning in Mesa. Occurrences of a technical term, once
defined, are not distinguished. Lastly, names appearing in programs are
italicized in both the program text itself and the explanations of that
text.
Programming examples are indented relative to the
surrounding text to distinguish them.
1.1. Syntax notation
Mesa's grammar is described by syntax equations
written using a variation of Backus-Naur Form (or BNI). For those unfamiliar with
BNF, an explanation follows. Reading and understanding that explanation
is imperative for full use of this manual; in a first reading. details of the
syntax equations can safely be skipped. Those familiar with BNF
should scan this section to discover the particular variation being used.
An individual syntax equation defines a portion
of the Mesa _grammar. It specifies a rule for forming
some class of phrases in the language. A phrase class has a name. e.g., Program. and is defined
by one or more syntax equations. Phrase names are always printed in the syntax
font when their use is meant to be technically accurate. For
example, an OctalDigit, which can be any of 0, 1,
2, . . 7, is defined by
the equation:
OctalDigit :: =
011121314151617
Each equation consists of a phrase name on the left,
followed by the operator :: (which should be pronounced is
defined to be"), in turn followed by a formation rule for that phrase class A formation
rule consists of one or more alternatives,
separted by the syntactic operator vertical bar, I (which
should be pronounced "or"). The ordering of alternatives is not
important. In the definition of OctalDigit,
"3" is an alternative.
Each alternative is a sequence of symbols, where a symbol is either a
phrase name (in the syntax font) or a syntactic literal. In a syntax
equation, a literal symbol stands for itself. The reserved words
of Mesa, such as BEGIN, appear as literals; they
are always written using upper-case characters
in the font shown. The digits 0, 1, 2, etc. and special
characters, such as =, + and also are used to form literal
symbols. Some composite symbols are formed from more than one special
character, e.g., =>. Spaces in syntax equations are used only to
separate the items in the rules and have no special
significance.
The phrase name empty is
often used as one of the alternatives in a formation rule. It means that
the rule permits an ''empty" phrase as one of its
alternatives (i.e., an actual phrase is optional; it
may or may not
occur in
the result of applying the formation rule).
Comments embedded in syntax
rules are preceded by a double dash. --, and appear to the right, e.g.,
Digit =
OctalDigit 1819 -- a decimal digit is
an OctalDigit
or
an
8 or a 9
Often, only part of the total definition of a phrase
class is given. To indicate that there are other ways of forming
phrases of that class, an ellipsis (...) is used as an alternative within the
rule. The definition of Statement is distributed throughout
much of the manual in this way. When a certain statement form,
such as the AssignmentStmt, is being discussed, the following partial
rule appears:
Statement = AssignmentStmt
I -- this
is just an example.
One can read this as, "A Statement is
defined to be an AssignmentStmt, among other things."
Within a single alternative, the order of symbols is
important. The alternative acts as a "template" for
forming an actual phrase; literal names and literal characters are copied, while substitutions are made
for the phrase names. Consider the following example:
ReturnStmt = RETURN I RETURN Constructor
("A
ReturnStmt is defined to be RETURN or RETURN followed by a Constructor.")
The second alternative means that RETURN and
some actual phrase defined by Constructor occur in exactly that
order.
Syntax equations can indicate recursive substitution: for example:
IdList =
identifier I identifier , IdList
In a Mesa program,
an identifier is basically a name. This equation defines an IdList to
be a list of one or more names, with commas separating them if
there is more than a single name in the list.
This result is explained as follows. The
formation rule for IdList consists of two alternative rules:
Rule 1: (First alternative) "An
IdList is defined to be an identifier". i.e.. any one name can replace an IdList
Rule
2: (Second alternative) "An IdList is defined to be an
identifier followed by a comma followed by another IdList". i.e.. name. IdList
can replace an IdList.
To
derive a single name. use Rule 1 as shown below. (Note: The substitutions are
emphasized by writing them in italics.)
IdList ::= name (by Rule 1)
To
derive two names separated by a comma:
IdList ::= name, IdList (by Rule 2)
name, name (by Rule 1)
To
derive three names separated by commas:
IdList ::= name, IdList (by Rule
name, name, IdList (by
Rule
name. name, name (by
Rule
To derive n names separated by commas. use Rule 2
2)
2) 1)
n-1 times and then use Rule 1.
The following syntax equation also relies on recursion:
StmtSeries =
empty I Statement I Statement ; StmtSeries
The
equation is read as, "A StmtSeries is defined to be empty, or a
single statement, or a series of statements separated by
semicolons: the last statement may be followed by a semicolon."
A
trailing semicolon is possible because:
1)
A StmtSeries may take
the form specified by the third alternative. "Statement : StmtSeries".
2) After some number of further
substitutions using the third alternative. the recursive reference to
StmtSeries may take
the "empty" form. i.e.. ".. Statement :
empty".
3) empty is replaced by nothing at all, i.e.,
"... Statement :".
Commas and semicolons are used as major separators for a
variety of constructs in Mesa. To distinguish between such
constructs, a convention is adopted that the suffix "List" on a
phrase name implies a sequence separated by commas, while.
"Series" implies a sequence separated by semicolons.
This convention is reflected by the phrase names IdList and StmtSeries
above.
CHAPTER
2.
BASIC DATA TYPES AND EXPRESSIONS
This chapter presents some of the fundamentals of Mesa. It
discusses how to declare, initialize and assign values to variables. It also
describes the basic types for numeric, character and Boolean data, as
well as the operators used to construct expressions having these types.
The Mesa language is strongly typed. The programmer is given a collection of
predefined types and the ability to construct new ones; he is
encouraged to choose or invent suitable types for each particular
application. Every variable used in a Mesa program must be declared to have one of these
types; every constant has a type; and every expression has a type derived from
its components and context. All types can be deduced by static
analysis of the program, and the language requires that
each value be used in a way consistent with its type according to rules
specified here and in chapter 3. The type of an object determines its
representation and structure as well as the set of applicable
operations. In addition, the type system can be used to partition the universe
of objects and avoid confusion, even among classes of objects that
are represented identically.
2.1. A slice of Mesa code
The example below is an excerpt from a Mesa program. It
assigns to gcd the
greatest common divisor (GCD) of a pair of integers. in and n (where in, n and gcd are integer variables in the
program from which this excerpt was taken; we assume their
values need not be preserved). The example uses the Euclidean
Algorithm for finding the GCD of two numbers and works as follows:
If both in and n
are zero, the GCD is zero (by convention).
Otherwise, repeat the following until ii is zero: find the remainder
of dividing in by
set in to
the value of n: then
set n to
the remainder. The final value of in
is the GCD of the original in and n except that it may be
negative; taking its absolute value gives the GCD.
Example. Slice of Mesa Code Using the
Euclidean Algorithm
-- Given are integers in and n, which can be altered. |
(1) |
IF 111=0 AND 11=0 THEN gcd 4-
0 -- by
convention |
(2) |
ELSE |
(3) |
BEGIN |
(4) |
r: INTEGER; |
(5) |
UNTIL n= 0 |
(6) |
DO |
(7) |
r in
MOD n: r gets remainder of in/ |
(8) |
in 4- 12: n r -- update variables |
(9) |
ENDLOOP ; |
(10) |
gcd -- in case one of in or n was negative -- ABS[nij: |
(11) |
END; |
(12) |
The example contains twelve lines of source code.
including comments. The numbers in parentheses at
the right side are for reference only and are not part of the source-code.
Comments begin with the symbol
"--" and terminate at line endings. They may also be completely
embedded within lines, in which case they both begin and end with
"--''.
Line (2) begins an IF statement that uses
the values of in and to
select between two alternatives. If
both values are zero, the assignment statement following THEN is
executed: it assigns the value 0 to
gcd (the
character ".-." is Mesa's assignment operator). If either is nonzero.
the assignment is skipped and the compound statement following ELSE (lines (4) through
(12) inclusive) is executed. (Distinguishing
the two cases is actually unnecessary, but doing so illustrates more features
of Mesa.)
The second alternative is a block, a series of
declarations followed by a series of statements, all bracketed
by "BEGIN" and
"END". Line
(5) declares a variable r of
type INTEGER for
use within that block. A semicolon separates the declaration from the
statements that follow it.
The iteration in the algorithm is performed by the loop (UNTIL n=0
DO...ENDLOOP). which
contains three embedded assignment statements. The loop repeats
until n is
equal to zero. If it is zero at the outset, the embedded
statements are not executed at all. Statements are separated by semicolons. A semicolon
at the end of a statement series that is embedded in another statement (such as
the series in the loop) is optional; it is permissible to write a
semicolon after every statement
in the series.
Within the loop, line (8) assigns to r the value of the expression "In MOD a", which
gives the remainder of dividing in by n.
Line (9) updates In to
contain the previous value of n and then updates n for the next iteration, if
any. Control transfers from the end of the loop, line (10), back to line
(6), where the new value of n is
tested. If it is not zero, the loop is repeated; otherwise, execution
continues with the first statement following the
loop, line (11).
When control reaches the assignment statement in line
(11), in either
has its original value (if 17 was
zero) or contains the value a had just before it became
zero. The expression "Aes[n]" has the form used
for calling a function and passing it one or more arguments; square
brackets enclose the
argument list. Normal parentheses, "("
and '')". are used
only for nested expressions, e.g.,
"a*(b+c/(d— e)*f)." The
assignment places the absolute value of i» into gcd; this is the correct
result. At this point, the reader is urged
to trace through the example with initial values for m and
n of
15 and 12, respectively; the result should be gcd=3.
2.1.1. Basic
lexical structure
The
names gcd. in, n and
r in
the example are called identifiers. The general form of an identifier
is given by the following (informal) syntax:
An identifier is a sequence consisting of' any
mixture of upper-case letters, lower-case letters or digits, the
first of which is a letter. Upper and lower case letters are different and do
distinguish identifiers.
The following, valid identifiers are
all distinct:
aBc Abc
DiskCommandWord display Vector mach 1 x32y40
Identifiers consisting entirely of capital letters are
reserved for use by the Mesa language. Some, / such as IF, are punctuation
symbols: others name built-in types. such as INTEGER, or functions, such as
ABS. All
such words that have special meaning and are not to be defined by
the programmer are
called reserved
words. It is legal for the programmer to use fully
capitalized identifiers, but he risks a clash with a reserved word
(possibly a new one in some future version of the language). To avoid this.
at least one digit or lower case letter should appear in any identifier.
Appendix E lists the current set of reserved words.
Mesa uses the blank
(or space) character to separate basic lexical units of the language (such as reserved
words and identifiers). Blanks are
significant separators of lexical units. They may not he embedded
in identifiers, composite symbols (such as >= ), or numeric literals (such
as 1000). Blanks are meaningful in STRING constants
(section 6.1.1). and there is a CHARACTER constant for space
(section 2.4.3). As a separator. any sequence of contiguous blanks
is equivalent to a single
blank. A TAB
character also behaves exactly as a blank when
used as a separator.
A carriage-return character behaves as
a blank for separating lexical units also, but it has one extra function:
if the last part of a line is a comment. the carriage return acts as
the terminator of that comment. Thus, multiline comments (those
containing carriage returns) must begin with "--" on each
new line. Line breaks have no significance as statement separators.
For example, the single loop statement in the example extends over a
number of lines, and a semicolon is used to separate two
statements in a series.
Semicolons are used
for separating declarations, for separating a series of declarations from
following statements, and for separating statements in a series from one
another. They cannot be used with abandon, however: care is necessary when writing IF statements (sec. 4.2.1) or SELECT statements (sec. 9.3.1). Multiple statements can be written on a single line, separated by
semicolons.
2.2.. Simple
declarations
The example (Euclidean Algorithm) contains the following
declaration:
r: INTEGER:
This declares r to be a variable of type INTEGER (sec.
2.4.1), one of Mesa's built-in types. More than one variable can be
declared at the
same time. For instance,
x, y, divisor: INTEGER;
declares identifiers x, y and
divisor as
variables of type INTEGER. These examples reflect the
two primary purposes of every declaration:
to designate one or more identifiers as
variables, and to specify their type. -
A declaration always begins with a single
identifier or a list of identifiers. Conventionally. "list" is used
to denote a single item as well as multiple items separated by commas. An identifier list (IdList)
is defined as follows:
IdList = identifier I
identifier, IdList
A declaration begins with an IdList followed by a colon. The
colon is followed by a type specification (INTEGER, for
instance, is a type specification).
2.3. The fundamental operations: assignment, equality and
inequality The example contains the following five assignment
statements:
gcd 0
r ni MOD n
m 4- n
n 4- r
gcd ABS[M]
An assignment statement has the following syntax:
AssignmentStmt =
LeftSide 4-
RightSide I
...
LeftSide =
identifier I ... -- plus
forms for array indexing. etc.
RightSide = Expression
The
RightSide may
be any expression (section 2.5) provided that its type conforms to that of the LeftSide.
"Conforms" is defined in section 2.4.6 and is discussed further in
section 3.5: for now. it can be taken to mean: "is the same as."
The LeftSide may be a simple variable or a component of an
aggregate variable (such as an element of an array). In
any event. a LeftSide denotes a variable, something capable of
receiving values. A LeftSide cannot, for example, be a constant, while a
RightSide can.
The assignment operation ( <-.), the equality
operation (= ) and the inequality operation (#) are called the fundamental operations. They
can be applied to values of most types (including, for instance,
entire arrays). The rules governing which pairs of operands may be
used in a fundamental operation are detailed in section 3.5.
2.4. Basic types
The
types of variables in a Mesa program fall into two broad classifications, built-in types and user-defined
types. Chapter 3 describes how a programmer can define new
data types using type constructors. This section
discusses the basic, built-in types. These include several numeric types
(INTEGER, LONG INTEGER,
CARDINAL, LONG CARDINAL and REAL). a type for logical values (BOOLEAN), and
a type for individual character values (CHARACTER). The built-in type STRING (for
sequences of characters) is described in chapter 6.
2.4.1. The numeric types INTEGER and CARDINAL
Mesa provides two standard numeric types, one with values
ranging over the signed integers; the other,
over the unsigned integers. Neither type completely mirrors the
corresponding mathematical abstraction (the integers Z or the natural numbers N,
respectively) because a finite representation is used for values of
each type. The range of the type INTEGER is (approximately) symmetric
about zero, and values of type INTEGER are represented as signed numbers. The range of the
type CARDINAL is
some finite interval of the natural numbers that includes zero, and values of
type CARDINAL are
represented as unsigned numbers.
"Signed" and "unsigned" are not types; rather,
they describe the machine
representation of a numeric value.
The programmer must choose an appropriate type for each
numeric variable. CARDINALS
offer a somewhat greater
positive range than INTEGERS,
and this is significant in a few applications, e.g., those
that manipulate addresses that might be the same size as the word size. More
importantly, declaring a variable to have type CARDINAL asserts
that its value is always nonnegative; the compiler can
use such assertions to perform more checking and to generate better code.
Programmers are encouraged to declare as much information about
each variable as possible: the ranges of numeric variables can be
further constrained by using subrange types (section 3.1.2).
The types INTEGER and CARDINAL are distinct and not
interchangable. They are. however, closely related. Mesa
allows most combinations of these types to occur within assignments and
arithmetic expressions (but not relational expressions). Care is necessary to
avoid ambiguity and failures of representation when values
with different representations are mixed. This is discussed further in sections 2.4.6 and 2.5.1.1.
2.4.1.1. .Vumeric
literals
A !runlet* literal is
an instance of the phrase class number, defined as follows:
A number is a sequence of digits. The digits may
optionally be followed by the letter B or B. which in turn may
optionally be followed by another sequence of digits denoting a
scale factor. No spaces arc allowed within numeric literals.
If I) is specified explicitly, or if neither B nor I)
appears. the number
is treated as decimal. The letter 13 means
the number is
octal (radix 8). A scale factor indicates the number of zeros to be appended
to the first sequence of digits: the
scale factor itself is always a decimal number. The
literals below all denote the same value:
6400 MOOD 641)2 14400B 144B2
A numeric literal always denotes a nonnegative
number (i.e., — 5 is considered to be an expression in which
the unary negation operator is applied to the literal 5 to produce an INTEGER value).
To be valid in a context requiring a CARDINAL, the
value of the literal must be a valid CARDINAL number. Similarly,
if an INTEGER is
required by context, the value must be a valid (positive) INTEGER. (See
section 2.4.4 for more details)
2.4.2. Type BOOLEAN
A BOOLEAN value can be either TRUE or
FALSE,
and these are the only literals of type BOOLEAN: i.e., BooleanLiteral :: = FALSE I TRUE
BOOLEAN expressions
are used in conditional statements (following IF) and in certain loop
constructs. For instance, the
following skeletal form describes the flow of control in Example 1:
IF
n?=0 AND n=0 THEN . . . ELSE
. . .
UNTIL 17= 0
DO
. . .
ENDLOOP;
The
expression "n=0" is
a BOOLEAN expression:
its value is TRUE if
the value of n is
zero and FALSE otherwise.
The expression "m=0 AND n=0" is also a BOOLEAN expression: its value is TRUE just
if both relations are. The relational and logical
operators discussed in sections 2.5.2 and 2.5.3 all yield BOOLEAN values.
Variables of type BOOLEAN can be assigned values
and appear as operands (although arithmetic operators) just
as any other Mesa variables. For example, the above program could validly be
replaced by the following: mIsZero. nlsZero: BOOLEAN; mIsZero (m=0): nlsZero n=0: -- compute whether in and n are zero IF ,nlsZero AND nlsZero THEN . . ELSE • • • |
not of outline |
|
UNTIL nIsZero=
TRUE
DO . . . nlsZero n=0; ENDLOOP; |
-- equivalent to just nlsZero by itself -- recompute whether n is zero just before testing |
|
2.4.3. Tpe CHARACTER
A value of type CHARACTER represents a single character of text. CHARACTER values are ordered I (according
to the order specified in appendix C) and can be compared using the normal
arithmetic relations. CHARACTER values are distinct from numbers. and they cannot
be assigned to variables with numeric types. Limited arithmetic is, however, allowed on
characters (section 2.5.1.2).
A characterLiteral is
written as an apostrophe C) immediately followed by a single character (which
can be a blank, carriage-return, semicolon. apostrophe. or any other character)
or as an octal number followed by C. For example:
lowerCaseA 'a: mark 4-
enalarker asciiCR 4-
15C:
2.4.4. The numeric types LONG INTEGER and LONG CARDINAL *
For some applications, the ranges of the numeric types
introduced in section 2.4.1 are too limited. Mesa provides both
a predefined type LONG
INTEGER, with signed representation, and a predefined type
LONG CARDINAL, with
unsigned representation, for such applications. These types offer greater ranges.
but their values occupy more storage and are generally more
time-consuming to manipulate than those of the
previously introduced numeric types.
In an implementation, values of types INTEGER and
CARDINAL are
expected to be represented by single machine words, while
values of types LONG
INTEGER and LONG CARDINAL are expected to occupy two words.
For this reason, INTEGER
and CARDINAL will be referred to as short numeric types: LONG INTEGER and
LONG CARDINAL, as
long numeric types. On
a machine using two's complement
arithmetic and a word length of N bits, the following table indicates the range
spanned by each numeric type (".." replaces the
mathematician's comma in this interval notation):
INTEGER [—
2N-1
CARDINAL [0
.. 2N)
LONG
INTEGER [_2 22N-1)
LONG
CARDINAL [0
..
The actual ranges for these types are given
in appendix C. the machine dependencies appendix.
Long numeric constants are denoted by numeric literals
defined by the phrase class number (section
2.4.1.1). The allowable type of any decimal or octal literal is determined by
its value, as summarized by the following table (using
the conventions introduced in the preceding paragraph):
Range [0 .. 2N-1) [2N-1 .. 2N) [2'..2211) p2N-1 22N) |
Allowable Types INTEGER,
CARDINAL, LONG INTEGER, LONG CARDINAL CARDINAL, LONG INTEGER, LONG CARDINAL LONG INTEGER. LONG CARDINAL LONG CARDINAL |
As in the case of short numeric types, the types LONG INTEGER and
LONG CARDINAL are
distinct but closely related. Mesa allows most combinations of
these types and the types INTEGER and CARDINAL to occur within assignments,
arithmetic expressions and relational expressions. but care is necessary when
this is done (see sections 2.4,6 and 2.5.1.1).
Type REAL (interim) *
The values of Mesa's type REAL are approximations
of mathematical real numbers. These approximations arc sometimes called floating-point numbers. For the
current version of Mesa. a standard representation for floating-point
values has not been chosen. The language nevertheless provides
some help with floating-point computation. It allows declaration and assignment
of REAL values, and REAL expressions constructed using the standard infix
operators are converted to
sequences of procedure applications by the compiler.
A REAL
value is assumed to occupy the same amount of storage
as a LONG INTEGER (i.e.,
two words). Beyond this, no assumptions are made about the
representation of REALS.
There are no literals with type REAL. Users
of real arithmetic must provide an appropriate set of procedures for performing
the arithmetic and relational operations.
Although Mesa provides no denotations of REAL literals,
it does provide automatic conversion from INTEGER, LONG INTEGER, CARDINAL or
LONG CARDINAL to REAL (section
2.4.6). Thus numbers (numeric literals) can
appear in REAL expressions
and provide denotations of certain REAL constants.
2.4.6. Relations among basic types *
If two types are completely interchangable,
they are said to be equivalent. A
value having a given type is acceptable in any context requiring a
value of any other type equivalent to it: there is no operational
difference between two equivalent types. None of the basic types discussed in
section 2.4 is equivalent to another basic type.
One type is said to conform to another if any value of
the first type can be assigned to a variable of the second type. A
type trivially conforms to itself or to any type equivalent to itself. In more interesting
cases, an automatic application of a conversion function may be required prior
to the assignment. Conformance
and its implications are discussed further in section 3.5.
There are nontrivial conformance relations involving the
types INTEGER, LONG
INTEGER, CARDINAL, LONG
CARDINAL and REAL. These relations allow certain combinations of
the numeric types to be mixed, not only in assignments but also in
arithmetic and relational operations (section 2.5). They also
permit these types to share denotations of constants (section 2.4.4). The
conformance relations can be summarized as follows:
INTEGER
and CARDINAL conform to INTEGER.
INTEGER
and CARDINAL conform to CARDINAL.
INTEGER,
LONG INTEGER, CARDINAL and LONG CARDINAL conform
to LONG
INTEGER. INTEGER, LONG INTEGER, CARDINAL and LONG CARDINAL conform
to LONG
CARDINAL. INTEGER, LONG INTEGER, CARDINAL, LONG CARDINAL and
REAL conform
to REAL.
Pairs of numeric types not on this list do not conform:
e.g., it is not possible to assign a LONG
INTEGER to an INTEGER or
a REAL to
a CARDINAL.
Particular care is required when numeric types with
different representations are intermixed. Mathematically, Z D N: however. it is not
necessarily true that INTEGER
D CARDINAL
or that LONG INTEGER D
LONG CARDINAL. For
instance, with the assumptions above, the intersection of INTEGER and
CARDINAL is
[0..2/v-1). Within this interval, the signed and unsigned representations
agree. and the interpretation of a short numeric value is
unambiguous. If a CARDINAL value lies in this range, it can validly be .assigned to an INTEGER variable, and vice-versa: outside this range, the
value represented
w
by a given
word depends upon whether it is viewed as a CARDINAL or as an INTEGER. Similar
considerations apply to LONG CARDINAL and LONG INTEGER.
Example:
With the assumptions above and A1=16, the
unsigned value 177777B and the signed value — 1 are encoded by the
same bit pattern.
Assignment of an unsigned value to an INTEGER variable.
or of a signed value to a CARDINAL variable. implicitly invokes a conversion function,
which is just an assertion that the value to be assigned is an
element of CARDINAL fl INTEGER. It is the
responsibility of the programmer to ensure that the conversion is valid. In many cases this is not
too difficult, but programmers are urged to avoid mixing signed
and unsigned representations when this is possible. It almost always is.
Mesa does guarantee that LONG T D T for
any type T and
that LONG INTEGER 3 CARDINAL;
thus it is always valid to assign a
short numeric value to a LONG INTEGER variable or a short unsigned value to
a LONG CARDINAL variable.
The properties of conversion to type REAL are not specified by the language.
Some
fine points:
A user supplied
procedure FLOAT is automatically applied to
convert a value from type LONG INTEGER to REAL. Short numeric values
are converted first to LONG INTEGER and then to REAL.
Conversion from a
short numeric value to a LONG INTEGER (and thus to a REAL) is substantially
more efficient when
the value has an unsigned representation.
The conversion of a constant to type
REAL occurs even time the containing expression is evaluated at run-time.
Neither BOOLEAN nor CHARACTER conforms
to any other basic type.
Examples: INTEGER: (valid) |
n: CARDINAL; LONG INTEGER; x: REAL; i 4 0; ii 0: x F n;
x ii; |
(invalid) i x:
11 4- TRUE;
2.5. Expressions
Expressions
are constructs describing rules of computation for evaluating variables and for
generating new values by the application of operators. The overall
syntactic rule for an expression is given by
Expression =
Disjunction AssignmentExpr IfExpr I SelectExpr ...
The Disjunction form includes all the numeric
operations, relational operations, and BOOLEAN (logical) operations and is
discussed in this section. An AssignmentExpr allows one to write multiple
assignments in a single statement and is discussed in section 2.5.4. The IfExpr
and SelectExpr forms are discussed in
chapter 4.
The basic unit from which expressions are built is called
a Primary. This syntactic class includes references to
variables, literals, function calls (chapter 5), and any arbitrary expressions
embedded in parentheses:
Primary =
Variable I Literal I( Expression ) FunctionCall I
Variable = LeftSide
Literal =
numbers BooleanLiteral characterLiteral
FunctionCall =
BuiltinCall I Call
-- defined in chapter 5
Recall that every expression has a well-defined type in
Mesa. The general rules for determining the type of an
expression from the types of its constituent parts are given in section 3.5. In
this section. the types of the basic expression forms (as functions
of the types of their operands) will be outlined. For
example, the type of a Primary is the type of the Variable or Literal
involved• or reduces to the type of the Expression
within parentheses, or is the type of the value returned by the BuiltinCall
(some of which are defined below) or the Call of a
user-defined procedure (section 5.1).
A Primary can be of almost any type; this
is not true of most of the expression forms built up using
Mesa's operators. Some operators are numeric and some are BOOLEAN. The
next sections discuss the numeric operations, the relational
operations, and the operations applicable only to BOOLEAN values.
Considered together, the operators form a single hierarchy with respect to
their precedence, which is
described with each
operator class and summarized in section 2.5.5.
2.5.1. Numeric operators
The
operations on numeric values are addition, subtraction, multiplication,
division, modulus, and arithmetic negation. The syntax for this group
of operations is
Factor =
Primary I — Primary -- negation
Product = Factor I Product MultiplyingOperator
Factor
MultiplyingOperator = *I / I MOD
Sum =
Product I Sum
AddingOperator Product
AddingOperator = + I —
These operators have their usual mathematical meanings.
The division operation on integers, /. always truncates toward
zero: thus —(i/ j)= — i/ j= / — j. The
MOD operator
yields the remainder of dividing one number by another (MOD is not applicable
to REAL operands).
MOD is
defined by the relation (i/j)*j+(i MOD
j) = i, and
the sign of the result of MOD is always the sign of the dividend. (This
is the reason that line 11 of Example 1 takes the absolute value of the
computed gcd: if
m= —12 and n=8 initially, the gcd would be —4 if its absolute
value were not taken.)
The built-in function MIN computes the minimum value in a list of
expressions; similarly, the MAX function, the maximum value. The
built-in function ABS computes
the absolute value of its
argument. The syntax for calls on the built-in functions is
BuiltinCall = MIN [ ExpressionList
MAX [ ExpressionList
ABS [Expression ]
. -- other built-in
functions later
ExpressionList = Expression I ExpressionList , Expression
For the arithmetic operators and built-in functions. the
order in which the operands are evaluated is undefined, but the
syntax implies a precedence ordering that controls the association of operators
with their operands. In that ordering. unary negation precedes the multiplying
operators, which in turn precede the adding operators. Sequences of operators of the same precedence
associate from left to right (with
the exception of the embedded assignment operator, section 2.5.4). Thus, an
expression such as a+b*—c does not
specify the order of evaluation of a. b and c but does require that the
operations be performed in the following order: negate c: then multiply the
result by b; finally,
add that result to the value of a.
Examples: 1. j. k: INTEGER; in, n: CARDINAL; Factors: ri 15 (i+j+ k) —15 ma+, j,
k, —15] Products: /en i/-15 15 n MOD 8 m/
n*10 —
k*(i+ 1)/2 moo 3 Sums: 1+1 —
i+j i n— n MOD 8 in — m/ n*n |
-- same as (m/ n)*10 because of left-associativity -- same as a( —
k)*(i+1))/2) MOD 3 -- same as n—(n MOD 8) because of precedence -- same as m MOD n |
2.5.1.1. Domains
of the numeric operators
In principle, each arithmetic operator designates the
corresponding mathematical function. Unfortunately, the hardware
underlying any implementation of Mesa does not provide this function but
only a set of related partial functions. For each operator, the compiler must
choose as appropriately as possible from this set. The choice is
made by considering the types of the operands.
Example:
With the usual assumptions. 177777B and —1 are represented
by the same bit pattern. The value of 177777B > 0 is TRUE, but that of —1 > 0 is FALSE.
Mesa provides the operators +, *, /, MIN. Max and ABS for all the numeric types. The operation moo
is defined for all numeric types except REAL; the operation of unary negation. for all but CARDINAL and
LONG CARDINAL. For
each of these operators. the type of the result is the same as the
type of the operands. In addition, the result of the
operation is considered to have signed representation if all the
operands have signed representation, and to have unsigned representation if all
the operands have unsigned representation. Thus, adding two INTEGER values
yields an INTEGER result,
and dividing one CARDINAL
by another yields a CARDINAL result.
Some
fine points:
Division and modulus operations on
short numeric values are substantialh more efficient if their operands are unsiened.
Addition. subtraction. and
comparison of lone numeric values are fast: multiplication and division are
done by software and
are relatively slow.
Operations upon REAL values are
implemented as calls on user-supplied procedures. These procedures must be assignable to variables declared
as follows {chapter 5):
FADD, FSUB. FM UL. FDIV: PROCEDURE [REAL, REAL] RETURNS [REAL]:
FCOMP: PROCEDURE [REAL, REAL] RETURNS [INTEGER]:
-- returns a value that is: 0 if equal.
negative if the first is less, positive otherwise
FLOAT: PROCEDURE [LONG
INTEGER] RETURNS [REAL]:
All
other REAL arithmetic operations are fabricated from these primitives.
Although the mathematical
integers (Z) and real numbers are closed under all these operations (except division by zero), the subranges defining the
types INTEGER.,
LONG INTEGER,
CARDINAL and LONG CARDINAL generally are
not. When the result of an operation falls outside the range of its assumed
type, a representational failure called overflow or underflon occurs. In the
current version of Mesa, ii is the programmer's
responsibility to guard against ovollow and undorflow
conditions.
The implications of
Mesa's conventions for subtraction are worth
emphasizing. If both operands have valid signed representations,
the result has a signed
representation. If both have only unsigned representations,
the result has an unsigned
representation and is considered to overflow if
the first operand is less than the
second.
Example:
INTEGER: m, n: CARDINAL;
I m— n: -- should be used
only if it is known that m >= n
i IF n7 > n THEN 277— n ELSE — (n — 1n); -- a safer form (section
3.6)
The arithmetic operations are defined for operands that all have the same
type, but it is possible to mix numeric types (and thus
representations) within an expression.
In this case, operands are convened as
necessary to the "smallest" type to which
all the operands conform, the operation for that
type is applied, and the result also has that type. The rule for expressions involving type REAL is
easy to state:
If any operand has type REAL, the REAL operation is used.
The rules governing
combination of numeric operands with differing representations involve some additional
concepts and are stated in section
3.6. Again, the programmer should try to
avoid such combinations when possible. (Recall that
literals in
INTEGER n CARDINAL have whatever representation
is required by context.)
2.5.1.2. The
operator LONG
The built-in function LONG converts any value with a short numeric type to a long numeric type. A value with an unsigned
representation is converted to LONG CARDINAL; one with a signed
representation, to LONG INTEGER, The
syntax is as follows:
BuiltinCall ,.. I LONG [ Expression ]
This operation is necessary
when the standard conversion rules do
not give the desired result. It can also be used to emphasize
the conversion.
Example:
LON G[ni* n] -- "short" multiplication, overflow lost
LONG[nd*LCNG[n] -- "long" multiplication
Some fine points:
Lenethenima
a sinw.le-precision expression
is substantially more efficient if
that expression has an unsigned
representation.
The Mesa implementation
provides standard procedures (not part of
the languaae) for performing certain multiplication and division operations
in which the operands and results do not all have the same length. These procedures provide less
expensive equivalents of, e.g LONG[nirLONG[rd.
2.5.13. CHARACTER
opera/ors *
Limited
CHARACTER arithmetic
is possible and is sometimes useful for manipulating the encodings of
CHARACTER values.
The following arithmetic operations arc defined for operands of type
CHARACTER:
A CHARACTER value plus or minus a short
numeric value yields a CHARACTER value. Subtracting
two CHARACTER
values yields an INTEGER value.
No other arithmetic
operations on characters arc allowed. Since the results of character arithmetic
depend upon details of the character encoding.
such arithmetic should be used with discretion.
Examples: c: CHARACTER: digit: INTEGER: digit 4- c — -0; c (- 'A + (c—'a) 2.5.2. Relational
operators |
-- assumes c is lower case |
The relational operators include = and #, <, <=
(less than or equal). >= (greater than or equal), >, and
their negatives (e.g., NOT<, —<. —>=, etc.). These
operators always yield BOOLEAN
results, depending on the truth or non-truth of the
relation expressed. The operators = and # apply to most
types: the others, to any ordered type
(i.e.. to any type whose values are considered to be ordered). Ordered types
include INTEGER, LONG
INTEGER, CARDINAL, LONG CARDINAL, REAL, BOOLEAN. CHARACTER (with the ordering given in
appendix C). enumerated types (section 3.1), and subranges
of ordered types (section 3.1).
The relational operators also include the composite
operator IN, which
takes a numeric value as its left operand and an interval as its right operand. Its
value is TRUE if
the left value lies in the interval and FALSE otherwise. The syntax for relational operators is
Sum
I Sum RelationTail
RelationalOperator Sum
Not RelationalOperator Sum I
IN SubRange
Not IN
Su bRange
(1<=
=
— NOT
SubRangeTC --
explained in chapter 3
Interval ... -- explained in chapter 3
Expression ..
Expression ) ( Expression .. Expression ) (
Expression .. Expression ] [ Expression .. Expression ]
The extra syntax for Su bRange and Su bRangeTC is placed
here to be consistent with later uses of the class Interval in
chapter 3. The syntax for intervals follows mathematical notation: a square bracket
indicates the inclusion of the respective end point in the interval, while a
parenthesis indicates its exclusion. For example, the following
intervals all designate the range from —1 to 5 inclusive:
[—I 5] 1-1
.. 6) (-2
.. 6) (-2
.. 5]
In the above examples. –1 is the lower bound of each interval:
the upper bound is
5. The hounds of an interval are its end points, reeardless of
whether the interval is written as a closed. half-open or open one. The hounds
are not required to he constants. An interval with an upper bound less than its
lower is said to be empty: no
values lie in such an interval. For example, the following are all empty
intervals:
–2] [-1 ..
–I) (-2 .. –1) (-2 .. –2]
Examples: Relations: |
n = 15 # n I <= .1 (i < j)
= (1 < k) n IN [1 .. 5) i NOT IN 1— 1 .. 5] |
-- or m –= 11 = with two BOOLEAN operands >= 1 and n < 5 -- only legal if i is signed (because –1 is) |
A fine point:
The relational
operators, like the arithmetic operators, denote families of hardware
operations when they have numeric operands. Anain, there is one operation for each numeric type.
If there is a unique "smallest" type to which all the operands conform,
they are convened to that type as necessary and then the comparison is performed. There is no unambiguous
choice of such a type for numeric operands with different representations:
an attempt to compare two such values is an error. The precise
rules appear in section 3.5.
2.5.3. BOOLEAN operators
The operators NOT (logical
negation), AND and OR apply only to BOOLEAN values. The syntax
is
Negation = Relation I Not Relation
Conjunction = Negation I Conjunction
AND Negation
Disjunction = Conjunction I Disjunction OR Conjunction
NOT ?legates the logical value of a BOOLEAN expression.
p AND q has the value TRUE if
and only if both p and q are TRUE. p OR q is TRUE if
at least one of p or
q is TRUE.
When evaluating a Boolean expression. evaluation
of primaries is guaranteed to take place from left to
right. In the operation AND or OR, the second operand is evaluated only if the first
operand's value does not determine the value
of the expression.
A
fine point:
"x AND y" is equivalent to the IfExpr
"IF .v THEN v ELSE FALSE": i.e., when x is FALSE. y is not o aluated.
"x OR y" is equivalent to the
IfExpr ''IF x THEN TRUE ELSE y'': i.e.. when x is TRUE. y is not evaluated.
It is therefore safe
to have expressions of the form "x AND y". where y is defined only
when x is TRUE. e.g.. "x#0 AND c/ x > 2". or "p=NIL OR p1=0".
Examples:
Negations: NOT 1=15 -- same as
NOT(i=15)
–q q must be of type BOOLEAN
–(p AND q)
Conjunctions: 1 <= j AND j < k
p AND — g
i=5 AND j NOT IN [— 1..1]
Disjunctions: in>n OR in =15
–p OR –q
2.5.4. Assignment
expressions
The assignment operation can be embedded in other
expression forms. When it is. the result of the operation has the
type of the LeftSide and
the value received by the LeftSide in
the assignment. The "+-" operator has the lowest precedence
of any operator. Its syntax is the same as that of the
AssignmentStmt:
AssignmentExpr = LeftSide (- RightSide
If this form is used to perform multiple
assignments. it is important to note that "4-"
is right-associative. Thus. the
assignment expression a<--b4-b+1 first
assigns the value of b+1 to
b and
then assigns b's new value to a.
Examples: Assignment |
Expressions: m4-15 m4-n*-15 tw-n4-n+1 --
same
as m4-(n4-(n+1)) Hj+-(j+1) MOD n)*2 -- all these parentheses are necessary |
Rules governing assignments of numeric values when the
types are not identical are summarized in section 2.4.6.
Fine
point:
Because the order of
evaluation of the primaries is not defined, expressons such as "(i< j) + (p- kF have unpredictable values
and should not be used.
2.5.5. Operator
precedence
The following table summarizes the precedences of the
unary and binary operators introduced in this section. The order
is from highest precedence (tightest binding of operands) to lowest; operators
on the same line have the same precedence.
-- unary negation
*, MOD
+, — --
subtraction
=, #, <=, >,
>=, IN
—, NOT
AND
OR
4-
Parentheses can be used to explicitly control the
association of operands with operators.
2.6. Initializing
variables in declarations
A
variable may be given an initial value in a declaration. For example, the
Boolean variable delimited could
be set initially FALSE
by using the declaration:
delimited: BOOLEAN 4-
FALSE:
Variables (of the same type) can be initialized
collectively:
11. nO: INTEGER 4- —7:
This declares two separate integer variables n and 110 and initializes each to —7.
Any
expression that could be used as the Rig htSide of an assignment can be
used to initialize a variable:
1: INTEGER 4- ABS[n]; -- this will set i
to 7
i.S'quared: INTEGER 4-
14 -- Squared is initialized to 49
j: INTEGER 4- i.S'quared— i+1: j
is initialized to 49— 7+ 1 = 43
All initializations shown so far have taken
"assignment" (or "4-") form. There is
another form. the "fixed" (or "=")
initialization. For
example.
octalRadix: INTEGER = 8;
This means that octalRadix is to have a fixed value. It is never valid as the LeftSide
of an assignment. We call octalRadix a constant because its value can never
change after it is initialized (recall that the number 8 is
called a literal). Normally.
the term "constant" will include the term "literal";
if the distinction is important, then "literal" will be used.
Initial values for fixed
initialization can be arbitrary expressions. Paraphrasing the earlier example:
i0: INTEGER = ABS[— octalRadix]: iOSquared: INTEGER = 10* i0: j0:
INTEGER = iOSquared— i0+1;
The initializing expression can use values that
are not known at compile time. In this example, if octalRadix did not have fixed
initialization, the values of la
iOSquared, and j0
would be computed and assigned at run-time.
Variables are initialized in the order of appearance in a declaration, and later
declarations can use variables initialized earlier, as shown by the example.
2.6.1. Compile-time constants
Wherever possible, the Mesa compiler evaluates expressions
containing only constants. If a variable is initialized
using the fixed form and the expression can be evaluated at compile time, then
that variable has a known value. Since it can never appear as
the LeftSide of an assignment operator, it too becomes a
compile-time constant (the variables i0, i0Squared. and JO
in the previous section are all
compile-time constants).
Example:
beta: INTEGER = 3;
alpha: INTEGER = beta— 1:
In this case, alpha
is a compile-time constant (with the value 2), since the
expression beta-1 involves
only compile-time constants. Compile-time constants need not occupy memory at
run-time; the compiler can replace references to compile-time constants, such
as alpha and
beta, by
their known values.
Some
fine points:
Knowledge of compile-time constant
values can also be exploited when analyzing expressions. processing other declarations. or generating
object code.
One side effect of this propagation of constants is that
the representation of a numeric constant is known at compile-time. For
instance. alpha above is declared to be an INTEGER. but
because its value is 2. it may also be used as a CARDINAL. However, declaring
the type of alpha determines what kind of arithmetic
(signed or unsiened) will be used to compute its value. whether at compile-time or
run-time (section 2.5.1).
In certain contexts. an expression is required
to yield a compile-time constant value. A (sub)expression denotes such a constant
if all the operands are compile-time constants and the operation is not one of
those listed below (current restrictions):
Conversion of
a numeric value to type REAL.
Any
arithmetic or relational operation with operands of type LONG INTEGER. LONG
CARDINAL or REAL.
Application of an function (chapter 5) other than a built-in
function.
The @ operation (section 3.4).
The SELECT operation
(section 4.3.31.
2.7. More general declarations
Preceding sections have introduced all the syntactic
components of a declaration. The general form is defined as follows:
Declaration :: = IdList :
TypeSpecification Initialization
For the moment, TypeSpecification is defined as one
of the built-in types: chapter 3 describes other forms of TypeSpecification.
TypeSpecification ::
= PredefinedType I .
PredefinedType =
INTEGER
I CARDINAL I
BOOLEAN I CHARACTER
LONG
INTEGER I LONG CARDINAL I REAL I
STRING I --
see chapter 6
WORD I -- see fine point below
UNSPECIFIED --
see fine point below
An Initialization is formally defined as follows:
Initialization :: = empty
<-
Expression
=
Expression
--other forms are given later
Fine
points:
The
predefined type WORD is provided to describe values on which bit-by-bit loaical
operations are to be performed. Currently. it is a synonym for CARDINAL.
The
predefined type UNSPECIFIED is a device for by-passing most type
checking. An UNSPECIFIED value is a single machine word. and it matches the type of any
object that occupies at most a single machine word. including INTEGER.
CARDINAL. CHARACTER. BOOLEAN. UNSPECIFIED, STRING. and any user-defined type (chapter 3) that fits in a single
machine word.
For numeric operations.
its representation is similarly fluid. If a CARDINAL and an UNSPECIFIED value
are the operands of
some arithmetic operation. then the UNSPECIFIED value is considered to be unsigned.
If an UNSPECIFIED is combined with a
signed value. it is treated as if it were signed too. If an UNSPECIFIED is
combined with an UNSPECIFIED. they are both treated as siened.
Less type checking is sacrificed by
using LOOPHOLE (section 3.5.1) than by declaring variables with type
UNSPECIFIED.
CHAPTER 3.
COMMON CONSTRUCTED DATA TYPES
Mesa encourages the programmer to augment the
collection of predefined types by constructing
new types. Types can be defined to describe objects
that are structured collections of related values (e.g., a
vector of Booleans, a table, or a complex number consisting of real and
imaginary components). Mesa's type system has other, perhaps less
obvious applications. These include expressing some of the
programmer's knowledge about a class of variables (e.g.,
that all take on values restricted to some known interval),
separating variables with different semantics into different classes so that
they cannot be confused (e.g., to avoid "comparing apples and
oranges"), and hiding implementation details of abstractions
(e.g., to prevent the user of a table-lookup package from depending upon the internal
organization of the table).
Programmer-created types have the same status as Mesa's
built-in types. They can be used to declare variables and to
construct further new types. In addition, values of most constructed types can
be operands of the fundamental operations (4-, =, #).
A new type identifier is declared using the
following syntax:
TypeDecla ration :: = idList
: TYPE = TypeSpecification
Each identifier in the idList is thereby declared
to name the type denoted by the TypeSpecification. If
this declaration form is compared to a normal declaration, i.e.,
Declaration = IdList : TypeSpecification Initialization
:
it can be seen that "TYPE.'
fills the role of a TypeSpecification, and "= TypeSpecification" plays
the role of Initialization.
In fact, the newly declared identifier has type "TYPE" and
a value (which must be constant, hence the "=") that is
a TypeSpecification.
Any
predefined Mesa type (section 2.7) is a valid TypeSpecification: thus the following
are valid type declarations:
SignedNumber: TYPE = INTEGER;
UnsignedNumber: TYPE = CARDINAL;
Truth Value: TYPE =
BOOLEAN:
Char: TYPE = CHARACTER;
These type identifiers are now
valid type specifications
and can be used to
declare variables:
j: SignedNumber, n: UnsignedNumber,
b:
Truth Value:
c:
Char;
After
this series of declarations, i and
j have
type SignedNumber, which
is equivalent to INTEGER:
n has
type UnsignedNumber, which
is equivalent to CARDINAL;
etc. This is a trivial way of defining
new types. A more interesting way uses a type constructor as
the TypeSpecification and generates a truly new type. not just an
additional name for an existing one. A TypeSpecification can
be defined as
TypeSpecification :: = PredefinedType
Typeldentifier
TypeConstructor
(TYPE
itself is not
a TypeSpecification: it can be used only to declare
types.)
There
is an important point worth emphasizing here. A TypeSpecification that is a PredefinedType
or a Typeldentifier denotes an existing type and yields
the same type every time it is used. A declaration such as the one of
SignedNumber introduces
a synonym for the name of an existing type. Synonyms can be more
descriptive and thus improve readability, but they do not partition the set of values.
The types SignedNumber and
INTEGER are
fully equivalent, and values with these types can be used
interchangably. On the other hand, a TypeConstructor constructs
a new type. The rules for equivalence and conformance of constructed types
depend upon the forms of their constructors and are discussed as
the constructors are introduced. In some cases, each appearance of a
constructor generates a unique type, i.e.. writing the same sequence of symbols
twice generates two distinct, incompatible types. For this reason, programmers
usually should name such a type, using a TypeDeclaration, and
thereafter use the type's identifier. Of course, introducing an
identifier for a constructed type can make a program easier to read and modify
in any case.
described in chapter 2 (except for STRING in
chapter 6 and process related The simplest form of a Typeldentifier
is given by
= identifier I -- which is a declared type
• • • -- other forms
given in chapters 6 and 7
The
rest of this chapter discusses the attributes and uses of some common
constructed types: enumerations,
subranges, arrays, records, and
pointers. The
syntax for TypeConstructor is
TypeConstructor
= EnumerationTC
Sub
rangeTC A rrayTC
RecordTC PointerTC LongTC
Procedu reTC
A rrayDescriptorTC RelativeTC SignaITC
PortTC
ProcessTC
--
for enumerations -- for subranges
-- for arrays
-- for records
-- for pointers
--
for long
pointers, etc
-- see chapter 5 -- see
chapter 6 -- see chapter 6 -- see chapter 8 -- see
chapter 9 -- see chapter 10
(The suffix "TC" is to be understood as an
abbreviation for "TypeConstructor''.)
Enumerations
define a set of values by giving a list of identifiers members of an ordered set. |
. These identifiers can be viewed as |
Subranges define types with values drawn from those of a
larger, encompassing type but restricted to lie in a specified
interval. The subrange takes on the characteristics of the enclosing type: for example.
a subrange of INTEGER can
be used to declare
variables that behave as INTEGERS but are constrained to take values
within some interval.
Arrays arc sequences of components that arc homogeneous
with respect to type and are accessed by computed indices
("subscripting"). Records are sequences of components that have
potentially different types
and are accessed using fixed component names ("selection"). Records
and arrays are Mesa's aggregate data types.
Pointers arc scalar values used to access data objects
indirectly. A pointer value is represented by an address. Pointers
can be used to build linked lists, tree structures, etc. Long pointers are
pointers capable of spanning a larger address space than ordinary
pointers.
Chapter 3 concludes with a discussion of type determination. the
process by which Mesa decides whether an expression has an
acceptable type for a given operation. This is closely related to questions
of the equivalence and conformance of types.
3.1. The element types
This
section describes a class of types called element types. Their common properties are the following:
(1) They
are ordered types; values of an element type can be operands of all the
relational
operators (section 2.5.2).
(2) They
are scalar types;
a value with an element type does not have any visible or directly
accessible internal structure insofar as the language is
concerned.
(3) They
can be used to declare subrange types (section 3.1.2).
(4)
They are the only types allowed as index types
of arrays (section 3.2). The element types are INTEGER, CARDINAL, CHARACTER. BOOLEAN. the
types generated by
EnumerationTC, and
the types generated by SubrangeTC. Because of (3) above, this definition is
recursive; subranges of subranges are allowed. The definition of the class ElementType is
ElementType = INTEGER I CARDINAL I CHARACTER I BOOLEAN
EnumerationTC Sub rangeTC
A fine
point:
Note that LONG INTEGER and LONG
CARDINAL. althoueh ordered scalar types, are not element types. It is not possible to declare subranges
of these apes or to use lone numeric values as array indices.
3.1.1 Enumerated types
Consider the following declarations and a typical
assignment:
channel.S'tate: INTEGER;
disconnected:
INTEGER =
0;
busy: INTEGER = 1;
available:
INTEGER = 2;
channelS tale f-
busy:
Suppose channelState
is a variable that is intended to range over a set of
three "states" named disconnected,
busy, and available,
which are represented by values 0, 1, and 2. These values
have no real significance; 5, 6, and 7 would serve equally well.
Enumerated types are well suited to such an application (where
the underlying values are unimportant). The above declarations could be replaced
by a single declaration of a variable with an enumerated range:
channelState: {disconnected, busy, available}: channelState - busy:
The effect is the same as before: channelState is a variable with
values ranging over the same "states". and
similar assignment statements
can be used.
The enumeration has some advantages over the original
declarations:
It
is more convenient: the programmer does not have to provide values for disconnected, busy, and available.
It
allows more type checking. In the INTEGER case, one could assign any
short numeric value to channelState.
It helps documentation; an enumeration
shows all of its possible values.
An enumerated type is constructed by specifying a
list of identifiers between braces, "{...}". These identifiers
are not variables, but constants of that enumeration called identifier constants. They
represent nothing more than their own names.
The type constructor EnumerationTC
is defined
as follows: EnumerationTC =
{ IdList }
The IdList supplies all the identifier constants
for the enumeration, and duplication of identifiers is illegal.
Separately specified enumerations
are distinct. Every appearance of an EnumerationTC generates
a new type that is not equivalent to, and does not conform to. any other
enumeration. Thus the declarations
foreground: {red, orange, yellow,
green, blue, violet background: {red,
orange, yellow, green, blue,
violet
specify two different enumerations. It is illegal to assign background to foreground, despite the fact that
the same identifier list appears in each declaration. Occasionally, the
inability to declare any further variables with the same type can be used
to advantage by the programmer. Otherwise, the best way to avoid
such problems is first to declare a type and then to declare variables using
the identifier of that type; for example:
Color: TYPE = {red, orange, yellow, green. blue. violet foreground:
Color,
background: Color:
This
allows the assignment of background
to foreground as
well as the declaration of further variables with the same type
(perhaps initialized differently).
The identifier constants in two different enumerated
types have no association whatsoever and do not need to be
distinct from one another. To identify unambiguously the enumeration from which
a constant is taken, one can, and sometimes must, qualify' the
identifier constant by the name of the enumerated type. For example, eiven the
additional declaration
Fruit: TYPE = {orange. lemon};
Color[orange] denotes
a color and Fruit[orange] denotes
a fruit. More generally, the syntax used for this form of
qualification is
Primary = I Typeldentifier
[ identifier ]
(This
adds a new case to the syntactic definition of Primary,
which already allows an identifier constant.)
Often qualification is not necessary: for instance, the following is permitted:
hue:
Color;
hue orange: -- the
type of hue implies
Color[orange]
In the following situations, an identifier constant need not be qualified.
because the intended
enumerated type is established by the context:
as the RightSide of an assignment
as an initializing Expression
a component in an array or record constructor (sections
3.2.2 and 3.3.4)
--_-_ I as an argument of a procedure (chapter 5)
as an array index (section 3.2)
as
the right operand of a Relation, including that part of a Relation used
to label an arm in a discrimination (section 4.3)
as the bounds in a Su b rangeTC (section 3.1.2)
The values of an enumeration are ordered. The
ordering is given by the order of appearance in the IdList used to construct
the enumerated type. The leftmost identifier has the smallest value, and values
increase from left to right. The following relations all have the value TRUE:
Color[red] < Color[orange]
Color[red] < violet
hue IN [red .. -- assuming
hue = orange
There are two additional built-in functions that are applicable to
enumerations:
FIRST [TypeSpecification]
yields the smallest value of the specified enumeration;
e.g.,
FIRST [Coloij= red. Similarly, LAST [TypeSpecification] produces
the greatest value in an
enumeration;
e.g., LAST [Color],
violet. It
is also possible to iterate over all values of an enumeration
(section 4.5).
The predefined type BOOLEAN is really an enumerated
type, and its definition is
BOOLEAN: TYPE =
{FALSE, TRUE}:
Thus,
FALSE<TRUE, FIRST
[BOOLEAN] = FALSE, and LAST [BOOLEAN] = TRUE. Note, however, that
the BOOLEAN constants
TRUE and
FALSE may
always be used without qualification.
3.1.2. Subrange types
In many cases. the values of a variable are inherently
range-limited. For instance, a value for day (of the month) lies in the range [1..31]. In
other cases, the range is limited by design. For instance, a
value for year might
be limited to the range [1900..1999]. Mesa permits the user to declare such variables
in the following way:
day: CARDINAL [1 .. 31];
year: CARDINAL [1900 .. 1999];
Since these intervals cover a subrange of CARDINAL, the
variables day and
year are called
subrange variables. It is useful to think of day and year as having type CARDINAL with
the additional constraint that values are restricted to the specified intervals.
Subrange types have a
number of advantages and uses. Subrange declarations unambiguously document
the range of values intended for a variable and thus aid software maintenance.
The compiler is able to optimize storage allocation when
dealing with range-restricted variables (for example. in
arranging the fields of a record. section 3.3) and can take advantage of
subrange declarations to generate more efficient object code.
The
general form of a Su brangeTC is
Su brangeTC = Typeldentifier Interval I
Interval
The Typeldentifier must evaluate to an ElementType. Thus,
one can declare types that are
subranges
of INTEGER,
CARDINAL. CHARACTER, BOOLEAN, enumerated types, and other
subrange types. For example,
SymmetricRange: TYPE = INTEGER [—
PositiveInteger• TYPE =
CARDINAL [1..LAST [INTEGER]];
UpperCaseLetter: TYPE =
CHARACTER ['A..'Z];
DegenerciteTjpe: TYPE =
BOOLEAN [TRUE..TRUE]:
CoolC olor: TYPE = C oloityellow..LAsT
[Color]]: -- excludes
red orange, yellow
AthroughM: TYPE = UpperCaseLetter['
A..'M]; subrange
of a subrange
The
base type for
a subrange is that type of which it is a subrange and which is not itself a subrange;
e.g., the base type for both UpperCaseLetter
and Athroughill
is CHARACTER.
The Expressions
that define the end points of an interval must have types
that conform to the type denoted by the Typeldentifier (or yield short numeric
values if the identifier is omitted). Also, for the purpose of defining a subrange type, the
end points must be compile-time constants.
A fine point:
It is permissable for
the interval defining a subrange type to be empty. It is not legal
to use a variable of such a type. but an empty subrange is sometimes useful for
specifying the bounds of an array in a record declaration (section 3.2).
A subrange type conforms to its base type, and a base
type conforms to any of its subrange types. By extension, any two subrange
types with the same base types are mutually conforming (even if they
do not overlap in any way). A more revealing point of view is that the value of
a subrange variable has the base type as its type. and an assignment
of a value to a subrange variable makes an associated
assertion that the value is in the appropriate interval. A violation of such an
assertion is called a range error. It is the programmer's responsibility to guard against
range errors. As implied by this viewpoint,
appropriate literals of the base type serve as literals of the subrange type,
and any operations defined on the base type automatically extend to the
subrange type (but usually without closure).
Examples:
n: CARDINAL [0..10]; in: INTEGER
[— 5-5]:
in 0; n <- 0; -- inherited literals
n F n+1; --
not valid if n = 10
17 4"
in; -- only valid if m IN [0..5]
The preceding discussion implies that subrange
restrictions can be ignored in answering many type-related
questions; in this sense. subrange types are "weak." Two subrange
types are equivalent if their base types are equivalent and if the
corresponding bounds are equal. For these types, equivalence is much
stronger than conformance. Equivalence becomes important when subrange types
are used in the construction of other types.
FIRST
and LAST are applicable to all subrange types and yield the
corresponding bound. For example. FIRST [CoolColod=
green and LAST [Athroughill]='M. It
is also possible to iterate over all values in a subrange (section 4.5).
A
fine point:
The operators FIRST and
LAST are applicable to all element apes. including INTEGER. CARDINAL and
CHARACTER_ as well as LONG INTEGER and LONG CARDINAL. When applied to the
numeric apes. they supply
information about the range of ialues supported by a particular implementation.
3.1.2.1. Subranges of numeric types
*
The description above applies to subranges of both
enumerated and numeric types. Numeric subranges introduce one
further complication, which is the question of representation. Omission of the
initial Typeldentifier in a SubrangeTC
is permissable if and only if each bound in the Interval specifies a short numeric
value. In that case, INTEGER
or CARDINAL is the base type, and the
choice depends upon the representations of the bounds.
A numeric subrange type has
a signed representation if both bounds are elements of INTEGER and
at least one is not an element of INTEGER n
CARDINAL. Similarly, it has an unsigned representation if both
bounds are elements of CARDINAL and at least one is not an
element of INTEGER n CARDINAL. If
both bounds are elements of INTEGER n CARDINAL, values of that
subrange type are considered to have both representations. Any other combination of bounds is
illegal.
Examples:
sl : [— 10..10]; -- signed representation
s2: [100..33000]; -- unsigned representation (if 33000
> LAST [INTEGER])
s3:
[0..10); -- both representations
With respect to the choice of signed or unsigned versions
of arithmetic. and relational operators, a quantity with both representations
is treated flexibly. When combined with an unsigned value. it is considered
to be unsigned; the unsigned operation and result are chosen. When it is
combined with a signed value, the operation and result are
signed. The rules governing combinations of values with both
representations depend upon the context in which the result is used; the
default is to choose signed representation and INTEGER operations.
The precise rules are discussed in Section 3.6.
Examples:
i: INTEGER; n: CARDINAL; -- plus
the declarations above
(signed) (unsigned) |
sl + 1 sl + s3 s3— i s2
+ 1 s2
+ s3 s3
* n |
A
fine point:
The representation assumed for a
literal also depends upon context. In fact, any short numeric constant c is treated as if its type were [c..d.
Range assertions *
Assignment to a subrange variable implies an assertion
about the range of the expression being assigned. The programmer
may make such an assertion explicitly, for any expression, by using a range assertion. If S is an identifier of a
subrange type and e is
an expression with a type T conforming
to 5, the
Primary S[e] has the same value as e and is additionally an
assertion that e
IN [FIRST rsn LAsT[sn 7]] is TRUE. In
addition to user defined types, the basic types INTEGER
and CARDINAL may
be used in range
assertions.
.4 program that violates one of its range assertions is in error. In
addition to providing
documentation and (optional) run-time checking, a subrange
assertion affects the attributes attached
to an expression. For example, an assertion of an INTEGER range
(or a signed subrange) forces the result to be treated as a
value with signed representation. This is useful for controlling the choice of an
operation when the intended one cannot correctly be inferred from the operands
(section 3.6).
Examples:
1.- INTEGER; n: CARDINAL; S: TYPE = [0..10];
CARDINAL
[i] i is asserted to be
nonnegative
Sin] --
asserts n IN [0..10]
3.2. Arrays
Arrays are indexable collections of homogeneous
components. In other words, the components of a given array all
have the same type, and each corresponds to one index value in a range of
indices associated with that array. The range of indices (which is actually a
type called the index type) and
the component
type determine the array type. For example:
earningsPerQuarter: ARRAY [1..4]0F
INTEGER;
declares a variable with a constructed array type having
an index type of [1..4] and a component type of INTEGER. Thus,
earningsPerQuarter is
an array of four integer elements: earningsPerQuarter[1],
earningsPerQuarter[2], ,
earningsPerQuarter[4]. earningsPerQuarter by itself refers to
the entire array variable. (Aggregate variables and components of
aggregates are generally called "variables". If
a distinction is needed, the term component
is used and always means an item contained within an
aggregate.)
An index type must be an element type (other than INTEGER or
CARDINAL). A
one-to-one correspondence exists between the components of an array
and the values of the index type. This allows array elements to be
accessed via "indexed references". An indexed reference selects
and accesses the component corresponding to a particular index
value. In its simplest form, it consists of the name of an
array followed by a bracketed Expression
with a type conforming to the array's index
type.
An index type can be specified using a
type identifier:
Quarter: TYPE = [1..4];
profit,
loss, earnings: ARRAY Quarter OF INTEGER: thisQuarter: Quarter:
earnings [thisQuarter] 4- profit [thisQuarter] —
loss [thisQuarter];
The arrays profit,
loss, and earnings have
Quarter as
their index types, and thisQuarter is
a subrange variable with type Quarter.
Index types may
also be enumerations or subrangcs thereof. For example,
CallType: TYPE = {longDistance.
tiel.ine. toll, local, inPlant}: nearbyCalls: ARRAY CallType[toll..inPland OF CARDINAL:
nearbt•Calls[locall.-
nearbyCalls[local+ 1:
Components
may be of any desired type. In particular. the component type may itself be an
array type. This allows an approximation of multidimensional
arrays, which are otherwise absent in Mesa. For example, a
two-dimensional data structure can he declared and used as follows:
Matrix3by4; TYPE = ARRAY [1..3] OF ARRAY [1..41
OF
INTEGER:
mxy:
nog [3]
[4] 4- 0; -- clear last component.
In
the assignment statement, tnxy is
an expression of array type (with index type [1..3] and component
type ARRAY [1..4] OF
INTEGER). wxy[3] is
an indexed reference to the third component of mxy. This in turn yields an
expression of array type (with index type [1..4] and component type INTEGER). Thus,
nay [3] [4] is an indexed reference to the fourth component of that subarray. Because of
left-associativity, mxy [3] [4] is the same as (mxy [3]) [4].
An array
constructor consists of an optional type identifier followed
by a list of values (syntactically, Expressions)
enclosed in brackets. The list specifies values for
components of an array in index order. The declaration below
uses an array constructor to initialize an array that can be used as a translation
table: i.e., octalChar[n] holds the
character denoting octal digit n:
octalChar: ARRAY [0..7] OF
CHARACTER = ['0, '1, '2, '3, '4, '5, '6, '7];
Note that the number of values in the list (eight) matches
the number of indices in the index type. This is required for array constructors. A
special form using the replicator ALL is
available for abbreviating array constructors in which all
components have the same value. For example, the following two
declarations are equivalent:
dashes: ARRAY [0..7] OF CHARACTER 4- •-,
dashes: ARRAY [0..7] OF CHARACTER 4- ALL [.-]:
Array variables may also be
initialized using other array values. Consider the following example:
freshVector. ARRAY [0..3) OF CARDINAL = ALL [0]: current Vector: ARRAY [0..3) OF CARDINAL 4- fresh
Vector,
In this case, current
Vector is initialized with fresh Vector's value, i.e., all three of current Vectors elements are
initially set
to zero.
When the operands of any of the fundamental operations (4-,=.1t) are
arrays, the operation is applied on a component-by-component basis. The
initialization of current Vector above
uses assignment in this way. Similarly, the expression "current Vector = fresh Vector" yields
the result TRUE if
and only if all three components of each array are equal (as they are in the
above example). Because the declaration of fresh Vector uses fixed
initialization, assignment either to the entire array or
to one of its elements is illegal.
3.2.1. Dechwatiou of arrays
Arrays are declared using the array type constructor. A rrayTC:
A rrayTC PackingOption
IndexType ComponentType
PackingOption ARRAY IndexType OF ComponentType
empty I -- elements
word aligned
PACKED --
elements potentially packed within words
ElementType Typeldentifier
TypeSpecification
Two array types are equivalent if both their index types
and their component types are equivalent and if they are both packed
or both unpacked (see below). An array type conforms to another if the two
types are equivalent. Thus it is possible to assign or compare array variables
with separately constructed types if those types are
structurally identical (see the assignment to currentVector
above).
A fine
point:
In additon. one array type freely
conforms to another if the component type of the first freely conforms to that of the second, the index types are
equivalent, and they are both packed or both unpacked (see section 3.5).
Declarations of initialized array variables take the form
IdList : ArrayTC
Initialization
The initializing expression must have an array type conforming to the one
being declared.
The previous section describes indexed references to array
components. A formal definition follows:
IndexedReference Variable
I Expression I I
( Expression ) I Expression
I
LeftSide IndexedReference
The Variable or
the parenthesized Expression must be of some array type, and the bracketed Expression must
conform to the index type for that array type. An IndexedReference is
itself part of the definition of a Leftside (and
therefore of a Variable, section 2.5).
Some
fine points:
Unless an array is
packed, each component is "aligned". i.e.. begins on a word boundary.
Currently, a byte is the smallest unit into which the elements are packed. Thus
a packed array of CHARACTER wastes no space. but a packed array of BOOLEAN has considerable
overhead.
Since packed array
elements are not necessarily word aligned. one cannot use the C
operator (section 3.4) to generate the address of
an clement.
The length
of an array is
the number of its elements. For variables with an array type. the length
is fixed and known at
compile-time. (Dynamic arrays are possible in Mesa through the use of array
descriptors, discussed in section 6.2.1.)
The IndexType of an array may legally be an empty
interval. In this case, no storm is allocated for the array. This is useful when the
array appears as the last component of a MACHINE DEPENDENT RECORD (section 3.3) and the user will be
obtaining storage for each record plus some number of array elements
from a free storage manager. Note that [0..0) is not equivalent to [1..1).
since the intervals specify different initial indices for the array.
Three function-like operators are
relevant to arrays (and more relevant to array descriptors): LENGTH. BASE, and DESCRIPTOR.
These are discussed in section 6.2. but a brief summary is provided below. For
this summary
. mg denotes an expression with
some array type.
LENGTH [arg] -- yields the number of array
elements.
BASE [arg] yields a pointer value for locating
the first array elemenL
DESCRIPTOR [arg] yields
argis array descriptor value (consisting of base
and length).
constructors are used only for initialization.
Actually, constructors RightSide context. An array
constructor is defined as follows:
Constructor I ...
OptionalTypeld
[ ComponentList ] I ALL [ Component ] Typeldentifier I empty
PositionalComponentList
-- other forms for record constructors
=
Component
PositionalComponentList
, Component
=
empty I --
elided component
Expression I
NULL
The empty components in a constructor are said to be elided, and NULL components are said
to be voided. The
values of both elided and voided components are undefined. In the first form of
array constructor, the number of Expressions plus elided
or voided components must match the length implied by the
array type. The type of each Expression must conform to the array's
component type. The expressions (and elided or voided components)
are taken in order to form a sequence that is the constructed
array value.
Consider the following example:
Triple: TYPE = ARRAY [1..3]
OF CARDINAL;
triplet: Triple 4- Triple [11,
12, 13];
The declaration assigns 11 to triplet [1], 12 to triplet [2] and 13 to triplet [3].
When
the array type is implied by context, the Typeldentifier may be omitted
(see the discussion of record constructors, section 3.3.4). Thus
the declaration above could be written as
triplet: Triple 4- [11, 12, 13]:
Taken
out of context, the constructor [11, 12. 13] is ambiguous: it could be assigned
to any array
of three numeric elements: for example:
trio: ARRAY {Patt}, Laverne. Maxine} OF LONG INTEGER +- [11, 12, 13];
The second form of constructor, using ALL, is
only valid
when the array type is implied by context. The type of the Expression
must conform to the array's component type. The value of the constructor
is an array in which the specified value is replicated a number of times equal
to the length of the array. The expression is evaluated just
once. In the case of an array of arrays, the structure must be
mirrored by nesting in the constructor, as shown by the following example:
allOnes: Alatrix3by4 4-
ALL [ALL [1]]:
Some
fine points:
The value of an elided or voided
component of an array constructor is not defined, but it will have some value.
In
particular. if the statement
triplet +- [1. , 3]:
is
executed after the previous assignment to triplet. the value of triplet [2] is undefined.
Any array constructor
in which all components are compile-time constants is a compile-time constant_
Also, selection from
an array that is a compile-time constant using a constant index yields a
compile-time constant.
3.3. Records
A record
is an aggregate that allows a group of related data items
of different types to be packaged together. In the definition
of a record type, the type of each individual component must be supplied.
as in the following example:
MilitaryTime: TYPE = RECORD [hrs: [0..24). [0..60)1:
oldTime. newTime:
Here„IlditwyTime is a newly defined type, and oldTime and neit•ime are
record variables of that type. IlilitaryTime is a two-component record type, where the first
record component is named hrs and
the second rains. Every AlilitaryTime
record contains both components, but different record objects
have their own values for these components.
A constructor of a record type contains a field list after the word RECORD. Each
element in the list specifies one (or more) components of the record.
For Militaryrime, the field list is [hrs:
[0..24), mins: [0..604
The component names, hrs and
rains, are
called field names. They
are used to refer to components in any Illilitar•Time record.
For instance, the first component of oldTime may be selected using the qualified reference, "oldTime.hrs".
One can construct an entire record value using a record constructor. For
instance, the constructors below yield MilitafryTime values with hrs components that have the value 13 and mins components that
have the value of the expression "y+1":
AfilitaryTime[13, y+ 1] AlilitagTime[hrs: 13,
mins: y+
1]
The
second constructor is an example of a keyword constructor, since it specifies the name
of the component (e.g., as "hrs:") with which a value
is to be associated.
A default value can be specified for any field in the
definition of a record type. The default is used in constructing records of
that type when no value is specified in the constructor. Defaults are useful
for suppressing detail and ensuring initialization of fields. In the following
example, the two constructors have the same value:
Datum: TYPE = RECORD
value: INTEGER, nReads: CARDINAL nWrites: CARDINAL 1:
Datum [x]
Datum [value:
x, nReads: 0, nWrites: 1]
The basic operations on (non-variant) record
values include the fundamental operations (=, #, 4-), as
well as qualification and
extraction for accessing the
record's components.
3.3.1. Field
lists
There are two kinds of field lists, depending on whether
the fields are "named" or "unnamed". (Field
lists used to construct multi-component
record types are almost always named).
Syntax equations:
FieldList = [
UnnamedFieldList [ NamedFieldList
UnnamedFieldList = TypeSpecification
TypeSpecification UnnamedFieldList
NamedFieldList
FieldDescription
DefaultOption
Examples:
[i:
INTEGER, b: BOOLEAN, c: CHARACTER] [INTEGER. BOOLEAN,
CHARACTER]
[17:
CHARACTER, f2.
f3: INTEGER]
[fl: CHARACTER, f2: INTEGER. fl:
INTEGER]
a named field
list
a similar, but unnamed field list components listed and
declared together equivalent to the above
Note that if one field is named, all must be named.
Also, field names must be unique within a given field list. (The same
identifiers may be used as field names in other field lists, however, or as names
of declared variables.)
Field
descriptions in a named field list contain a type specification, indicating the
type of the field. Any type may be specified, including an array
type or (some other) record type.
Some
fine points:
A field's type specification must not
imply an infinite nesting of records. For instance, the following is illegal:
A:
TYPE = RECORD
[b: B]:
B:
TYPE = RECORD
[a: A]:
Field lists occur in
constructors of types other than records, such as PROCEDUREs (chapter 5).
SIGNALs (chapter 8).
and in variant record
specifications (chapter 6).
Unnamed field lists are
normally used when component names would be ignored if they were present. This
is common for
functions that return single-component results. "Unnamed field lists are
sometimes used in specifying
the input parameters for procedure variables that are to be set to one of
several actual procedures. (However, an unnamed field list does not allow Calls using such a
procedure variable to refer to the parameters by name.)
3.3.2. Declaration of
records
The type constructor RecordTC is defined as
follows:
RecordTC = RECORD FieldList
-- plus variant records
(chapter 6)
where FieldList is
defined in the previous section. Separately declared record types are unique, even
if they look the same. Every
appearance of a record constructor creates a new type that is not
equivalent to, and does not conform to, any other record type. In
the example:
ReCT-Jpei:
TYPE RECORD [a,b: INTEGER];
reel: RecTypel;
RecType2: TYPE RECORD [a,b: INTEGER]:
rec2:
RecType2;
rec3:
RECORD [a,b: INTEGER];
rec4: RECORD [a,b: INTEGER];
the record variables reel, rec2, rec3, and
rec4 all have different, non-conforming types. None of these
can be assigned to any of the others (despite the similarity of their components).
It is, of course, legal to
assign to a component any value with a conforming type. For example:
recl.a rec2.b rec3.a 5:
rec4.a recl.a; rec4.b recl.b;
Any single-component record type conforms to the
type of its single component, but not vice versa. The
automatic conversion
in this case requires no computation.
Example:
Bundle: TYPE = RECORD [value:
INTEGER]:
recEar: Bundle;
iniVar: INTEGER;
intVar 4-
recVar; --
means intVar 4-- recVar.value
intVar recVar+1; --
operand conversion
recVar F Bundle[intVat]; --
a constructor
recVar.value intVar,
This
conversion simplifies dealing with functions that return single-component
records (chapter 5). It also provides a way of partitioning a set of
variables that can be checked by the type system. In the
example above, a direct assignment of intVar to recVar
is invalid. Furthermore, no other single-component
record type, such as
Prime: TYPE = RECORD [value:
INTEGER];
can be confused with Bundle; assignment of a Bundle value to a Prime, or a Prime to a Bundle, is illegal.
Either a Bundle or
a Prime can,
however, appear as a numeric operand. Defining Bundle and Prime as synonyms for INTEGER would
not provide this additional checking.
Because of the uniqueness of constructed record
types, record variables are typically declared in two steps: first the record type, then the record variables. The general form is:
identifier : TYPE = RecordTC ; --
define record type.
IdList : identifier Initialization ; -- same identifier
as just defined
Record variables can also be declared directly:
IdList : RecordTC Initialization ;
This form is not
very useful because the (unnamable) record type is not available for purposes
such as declaring other records of the same type or writing
constructors.
The Initialization shown in these general forms
applies to the entire record variable, not to individual
components. Any Initialization must have the
proper record type. Initialization
of
record variables
is shown in the next example.
noon: AfilitatyTime = [hrs:12,
mins:0];
midnight: AfilitatyTime = [hrs:0,
mins:0];
time: AlilitaryTime 4- midnight; --
start time at
midnight.
Some
fine points:
The Mesa compiler
packs record components into machine words The components may be arranged in an
order that differs
from the left-to-right order of the fields in the type constructor. All records
of the same type have
the same component arrangement.
Norman.. the user
is unconcerned with the actual arrangement of record components. When component
arrangement is
important, the user may specify
"MACHINE DEPENDENT" records. An
example is:
InterruptWord: TYPE =
MACHINE DEPENDENT RECORD
device:
DeviceNumber,
channel: [0..7].
stopCode: {finishedOk. errorStop. powerOffl, command:
ChannelCommand
In this case. the user
takes full responsibility for component arrangement. Components are positioned
exactly as green.
from left to right in machine words. In general. "fill" components
are needed to ensure that no field crosses a word boundary (unless it starts on one).
Components (such as ChannelCommand) may themselves be aggregates occupying more than one
word
It
is also the user's responsibility to "fill out the record to a
full word if the record crosses a word boundar3. (Interrupt Word might be correct for a 16-bit
machine. but not for a machine having a larger word length).
Except
in MACHINE DEPENDENT records. components are packed for storaee efficiency.
Some fields ma} be aligned
(to the beginning of a word boundary) and some may not. Components occupying a
full word or more are
always aligned: arrays. INTEGERs
and pointers. for example. Subrecords may or may not be
aliened.
depending on their size. Packed arrays are always aliened.
even if there would have been space in the
preceding word for a b3te-sized
element.
The
function-like operator SIZE is often used to find the number of machine words
occupied by a record of some type. The general form is: SIZE [TypeSpecification]. The result is a
CARDINAL value. the number of words required by an object with the type specified by the argument.
SIZE may be used to find the number of words required for any type
of object.
3.3.3. Qualified references
Qualification is used to refer unambiguously to a named
component of some record. The general form (which extends the definition of a LeftSide) is
QualifiedReference = Variable . identifier I
( Expression ). identifier
LeftSide = I
QualifiedReference
The field name is said co be ''qualified by" the
record value (the Variable or
Expression) to the left
of the dot. The operator associates from left-to-right in the case of multiple
qualification. For example:
Latitude: TYPE = RECORD [degs:
[0..360), mins, secs: [0..60)];
Longitude: TYPE = RECORD [degs: I— 90..90], mins,
secs: [O..60)]:
Position: TYPE = RECORD [latitude: Latitude, longitude:
Longitude]:
somePosition: Position;
Some of the possible qualified references to components
of somePosition are
listed below:
Qualified Reference Refers
To
somePosition.latitude 1st
sub-record
somePosition.longitude 2nd
sub-record
sornePosition.latitude.degs 1st component of
1st sub-record
somePosition.longitude.secs 3rd
component of 2nd sub-record
The
association order for qualification means that names must occur in the proper
sequence: e.g.. somePositiolunins.longitude
is incorrect. Also, a qualified reference must be
complete, i.e., names may not be skipped (as in somePosition.secs, which
would be ambiguous in any event).
Qualified references and indexed references have the same
precedence (the highest possible) and may be intermixed. For
example:
recordOfArrays: RECORD [a.b: ARRAY
[0..100) OF CARDINAL];
arravOfRecords: ARRAY [1..5] OF RECORD CARDINAL];
• • •
array0fRecords[5].i3 record0fArrays.a[0]; -- ("last"
gets "first")
A fine point: Qualification briefly opens up a
given "name scope". For instance. in the record qualification, qualified name. x. must name a
field of rec and selects that field_ Scope is
treated more fully in |
rec.x. the chapter 7. |
3.3.4. Record coirstnictors
A record constructor assembles a record value
from a set of component values. In the following example.
a constructor is used
as a RightSide of an assignment.
Month:Vow: TYPE = {Jan, Feb. Mar, Apr. .11 ay. Jun..lul. Aug, .Sep. Oct. Nov. Dec}: Date: TYPE = RECORD
day: [1 .. 31].
month:
MonthName, year: [1900 .. 2000)
birth
Day: Date:
dd: [1..31]: mm: MonthName: Sy:
[1900..2000); now: [1900..2000)
1976: birthDay r Date[25,
Apr, now— 33];
This constructor yields a record value with type Date. The record assigned to birthDay contains the following
component values:
Component Value
day 25
month Apr
yearnow-33 (which
is 1943)
A Constructor is a Primary and may not be
used as the LeftSide of an assignment.
Record constructors are
of two kinds: keyword constructors and
positional constructors. Within
both kinds, the component value for a particular field may
either be supplied or be omitted. If it is omitted, the value
of the field is determined by the DefaultOption appearing in the declaration of
the
field (section 3.3.5).
... I Constructor
OptionalTypeld [ ConnponentList Typeldentifier I empty
KeywordComponentList I
PositionalComponentList
KeywordComponentList = KeywordComponent
KeywordComponentList
KeywordComponent
PositionalComponentList = Component
I
PositionalComponentList , Component
KeywordComponent = identifier : Component
Component empty
I -- elided component
Expression I -- explicit component
NULL -- voided component
The
initial Typeldentifier, if
present, must name the type of the record being constructed.
In keyword
constructors, the correspondence between constructor
components and record components is strictly "by name".
Keyword names may not be repeated
in a constructor, but the order is irrelevant. For example. the following keyword
constructors are equivalent:
Date[day: 25, month: Apr, year now— 33] Date[month: Apr,
day: 25,
year: now— 33]
All of these keyword constructors specify values for all
the components. In the following example. the first keyword
constructor elides the
month component
(the place for the component value is specified, but no value is
given): the second voids the
117011111 component
(by specifying NULL instead
of a value):
Date[da•:
25. month:,
rear: now— 33]
25, month: NULL. }•ear: now— 331
The
distinction between an elided and a voided a field arises in the treatment of
defaults (section 3.3.5). Since the declaration of Date specifies no default value
fir month, both
of these examples construct records with a second component having
an undefined value.
In a positional
constructor, the correspondence between constructor
components and record components is strictly "by position".
The first constructor component corresponds to the first record component,
the second value to the second component. etc. Positional constructors may be
used for both records and arrays (section 3.2.2). It does not
matter whether or not fields are named in the
definition of the record type. The following three constructors are
equivalent:
Date[day: 25, month: „rear: now— 33] --
value of month is
undefined
Date[25„ now-33] --
value of 2nd component is undefined (elided)
Date[25, NULL. now— 33] --
value of 2nd component is undefined (voided)
Positional
constructors may elide or void components as shown above. but components not
supplied at the end of the list must be elided by supplying a
sufficient number of trailing commas.
Keyword and positional 1101111i011.5 may not be mixed in a single constructor. The
order of evaluation of components is not specified for either kind
of constructor.
The initial Typeldentifier in a constructor
may be omitted when the constructor is used as: the RightSide of an assignment
(unless the LeftSide
is an extractor, section 3.3.5) an
expression in an Initialization
a component of an enclosing record or array constructor
an areument of a procedure
the right operand of a Relation.
In
other cases, an initial Typeldentifier must
appear. It is never incorrect to supply the identifier, and
sometimes doing so improves readability.
A
fine point:
Any record constructor
in which all components are compile-time constants is a compile-time constant.
Also, a field selected
from a record that is a compile-time
constant is itself a compile-time constant
3.3.5. Default field values
The definition of a record type may specify a default
value for each field. These default specifications are optional;
if present, they are used in constructing records of that type when no values
for the corresponding fields are specified in constructors. An elided field, as
discussed in the preceding section, supplies no value. Specifying
some default treatment of a field also allows complete omission of that field in a
constructor. In a keyword constructor, a field is omitted by omitting
the keyword entirely: in a positional constructor. trailing fields (only) can
be omitted by omitting the final commas. The positional
constructor "[ ]" is considered to omit, not elide, its first component
(if any).
In the following example, all constructors
have Interval: TYPE = RECORD range:
INTEGER. origin: INTEGER *- 0. direction: {up. down} r up Interval[range:
10. origin: 0. direction: up] Interval[range: 10. origin:. direction: ]
Interval[range: 10] nterval[10] |
the same nalue. all fields specified origin. direction elided origin. direction omitted origin. direction omitted (positional form) |
The
syntax for specifying defaults in a NamedFieldList follows: DefaultOption = empty I DefaultSpecification
DefaultSpecification = empty I
Expression
I
NULL
Expression I NULL
Note: In the final line, the vertical bar denotes itself
and is embedded within an alternative.
Only
named fields may have default values. In a DefaultSpecification, the Expression must
have a type that conforms to the type
of the corresponding field.
Suppose that R is a record type with a
field v of type T. The
above syntax allows five forms for the DefaultOption in the
declaration of v. No matter which form is used, a constructor of an R may
explicitly specify a value for the field v. The various
options control whether the existence of the field must be made
evident in the constructor, whether an explicit value must be supplied and. if not_
what action is taken. The options are interpreted as follows:
(1)
v: T
In a constructor, the value of v can be left
undefined, but that must be indicated explicitly, by eliding or voiding the field. This Wile also applies to
an unnamed field.
(2)
v: T
Every constructor must supply
an explicit value (not
NULL) for
v.
(3)
v: T t e
If
a constructor elides or omits v. the value of the expression e is
used: voiding the field is not permitted.
(4)
T <- NULL
As
in (1) above. except that the constructor may omit v entirely. If the field is
omitted, elided or voided, its value is undefined.
(5) v: T F e I NULL
As in (4) above, except that a constructor may
explicitly void v. If the field is omitted or elided. the value
of e is
used: if it is voided. its value is undefined.
If the first or second form is used, the field cannot be
omitted from a constructor: these forms are useful when such omission is likely
to indicate a programming
error. Omission is permitted by the other forms, which differ in the default
action for an omitted or elided field. These forms are appropriate
when a field has some common and meaningful
default value (the third and fifth cases) or. alternatively, is
relevant only in unusual circumstances (the fourth case). The last three forms
are particularly suitable for extending the definition of a
record type: constructors in existing programs need not be
modified.
I-a,:3-
Chapter 3: Common
Constructed Data Types
Fine
points:
The
second form of field declaration guarantees that the field r has a well-defined
value in am record created 1)i a constructor (but not
otherwise).
if
ihe Expression form of a default specification is used. that expression is
evaluated at the time of construction but in the context of the declaration of the record. i.e..
the expression is treated as a parameterless procedure invoked e\
aluation of the constructor (see chapter 5).
The
default value of a field cannot be specified in terms of other fields in the
same record. Default values for fields of record types defined
in DEFINITIONS modules (section 7.3.2) must be compile-time constants.
Examples: R: TYPE = RECORD I:
CARDINAL, v2: CARDINAL -, v3: CARDINAL 4- 3, v4: CARDINAL 4- NULL, CARDINAL 4- 5 I NULL -- the following, are valid R[v1:
1. v2:
2] R[v1: . v2: 2, v5:] R[vl: 1, v2: 2, v5: NULL] -- the following are not valid R[] R[v1: 1, v2: NULL,
v3: NULL] |
-- means R[v/: 1, v2: 2, v3: 3, thi: NULL. v.5: 5] -- means R[v , v2: 2, v3: 3, v4: NULL, v5: 5] -- means R[v1:
1, v2: 2, v3: 3, v4: NULL, v5: NULL] -- neither vl
nor v2 may be omitted -- neither v2 nor v3
may be voided |
3.3.6. Extractors
Extractors are used to "explode" record objects
and assign their components to individual variables in
a single statement. For
example, the extractor below assigns the components of birthDay (defined
in the previous
section) to the variables
dd, mm, and
yy, in that order:
[dd, mm, yy] birthDay:
This
has the same effect as the following three separate assignments, except that birthDay is evaluated only once:
dd birthDay.day:
mm birthDayanotah; yy birthDay.year
An extractor resembles a constructor in form. but
there are some
important differences:
An extractor may only be used as a LeftSide, never as an Expression.
The "components" of an extractor specify LeftSides,
not Expressions. Extractors always begin
with a left bracket, never with a Typeldentifier.
The type of the record value assigned to an extractor must
be known to the compiler. This means that the following (rather
useless) statement is invalid because the constructor's type cannot be determined:
[dd, mm, yy] [25, Apr, 1943]: -- invalid
The statement
should specify the type of the constructed value: [dd, mm, YY] 4- Date[25,
Apr, 194.3]: --
valid
Extractors,
like constructors. may use keywords. This allows an extractor to he written
without regard to the record's component order. For instance, the following
statements are equivalent to the
first
one in this section:
[day: dd, month: mm. year: yv] birthDay: [month: mm, day: dd. year: n'] birthDay:
Extractors
may elide or omit any item.
in vnhich
case the corresponding record component is not assigned. The extractors shown
below arc equivalent:
[day: dd. month: , year: jy]
F birthDay: [day: dd, year: yy]
4- birthDay; [dd. • yy] 4- birthDay: |
-- month elided -- month omitted -- 2nd component elided |
A
positional extractor may omit trailing components without supplying trailing, commas. The year component of birthDay is omitted below.
[dd, mm] birth Day:
An extraction operation (unlike an ordinary
assignment) yields no value. This
means that an extractor may not be
embedded within an expression. For example, the first
statement following is illegal: the
second is a valid
alternative:
r [x. y,
4 44. s; --
invalid
[x, y, 4 r s: --
valid
Syntax equations:
AssignmentStmt = I Extractor 4- RightSide
Extractor = KeywordExtractList I
[ PositionalExtractList
KeywordExtractList = KeywordExtract
KeywordExtract, KeywordExtractList
KeywordExtract •• - identifier : Extractltem
PositionalExtractList = ExtractItem I
Extractitem PositionalExtractList
Extractitem empty I -- component is ignored
LeftSide -- component is assigned
to LeftSide
The identifiers in a KeywordExtractList must be field
names for the record type. Note that an extraction list can be
empty, in which case the effect is to discard a record value. Extractors cannot be nested.
3.4. Pointers
Pointers allow efficient indirect access to objects. A
pointer may refer to only one specific type of item. For instance, the following pointer provides access only to
objects of type INTEGER:
intPtr: POINTER TO
INTEGER:
Another pointer might be specified to point only to BOOLEAN objects:
boolPtr: POINTER TO BOOLEAN:
These are different types of pointers since they
have different reference types, INTEGER and
BOOLEAN. Furthermore,
since INTEGER and
BOOLEAN are
incompatible types. these pointer types are also incompatible; i.e., assignment
of boolPtr to
intPtr, or
vice versa, is disallowed.
A pointer value-is represented by the
address of some data object. This object is called the pointer's referent. The postfix
operator T may be applied to a pointer value of any type to yield that value's referent. The process of "following" a pointer to its referent is
called dereferencing.
A dereferenced pointer designates a variable.
When the pointer is declared as above, the variable can be used as a LeftSide or as a Primary. Thus iniPtrr and boolPtrt are variables of type INTEGER and BOOLEAN respecti% ely. 'the statement
60011'1,1 (inthrt
= 0):
is executed by following iniPir to obtain a INTEGER value,
testing that value, and assigning the result to the BOOLEAN variable referenced by boolPir.
Sometimes a pointer is created simply to
identify an object or to allow indirect access to a value that is
not to be modified. Mesa provides readonly
pointers for such applications. A value with a readonly
pointer type cannot be used to update its referent. For
example, the declaration
ROiniPtr: POINTER TO READONLY INTEGER:
declares a readonly pointer. ROintPirt is a Primary with type INTEGER but
not a valid LeftSide.
Any type specification is permitted as the reference type
of a pointer type. The pointers declared below reference a named
record type.
Person: TYPE = RECORD
age: [0..200],
sex: {male, female},
party:
{Democratic, Republican}
1:
candidate!, candidate2:
Person:
winner, loser:
POINTER TO Person;
Pointers
to record objects may be used to qualify field names. if record candidate] is the referent of winner, then qualifications such as
wirnter:age winner.
sex winnerparty
select the corresponding components of candidate'. However, if candidate2 were the referent,
these same qualifications would select components of candidate2. When applied to a
pointer, the operation of selection implies dereferencing. For
example, winner.age specifies
dereferencing winner to
obtain a record variable of type Person
and then performing normal field selection on that record.
Thus winner.age
is an abbreviation of winnert.age.
It is common to define a record type containing components that are pointers referencing objects