This manual is a detailed description of the MIT Scheme runtime system. It is intended to be a reference document for programmers. It does not describe how to run Scheme or how to interact with it -- that is the subject of the MIT Scheme User's Manual.
This chapter summarizes the semantics of Scheme, briefly describes the MIT Scheme programming environment, and explains the syntactic and lexical conventions of the language. Subsequent chapters describe special forms, numerous data abstractions, and facilities for input and output.
Throughout this manual, we will make frequent references to standard Scheme, which is the language defined by the document Revised^4 Report on the Algorithmic Language Scheme, by William Clinger, Jonathan Rees, et al., or by IEEE Std. 1178-1990, IEEE Standard for the Scheme Programming Language (in fact, several parts of this document are copied from the Revised Report). MIT Scheme is an extension of standard Scheme.
These are the significant semantic characteristics of the Scheme language:
Scheme uses a parenthesized-list Polish notation to describe programs
and (other) data. The syntax of Scheme, like that of most Lisp
dialects, provides for great expressive power, largely due to its
simplicity. An important consequence of this simplicity is the
susceptibility of Scheme programs and data to uniform treatment by other
Scheme programs. As with other Lisp dialects, the read
primitive
parses its input; that is, it performs syntactic as well as lexical
decomposition of what it reads.
This section details the notational conventions used throughout the rest of this document.
When this manual uses the phrase "an error will be signalled," it
means that Scheme will call error
, which normally halts execution
of the program and prints an error message.
When this manual uses the phrase "it is an error," it means that the specified action is not valid in Scheme, but the system may or may not signal the error. When this manual says that something "must be," it means that violating the requirement is an error.
This manual gives many examples showing the evaluation of expressions. The examples have a common format that shows the expression being evaluated on the left hand side, an "arrow" in the middle, and the value of the expression written on the right. For example:
(+ 1 2) => 3
Sometimes the arrow and value will be moved under the expression, due to lack of space. Occasionally we will not care what the value is, in which case both the arrow and the value are omitted.
If an example shows an evaluation that results in an error, an error message is shown, prefaced by `error-->':
(+ 1 'foo) error--> Illegal datum
An example that shows printed output marks it with `-|':
(begin (write 'foo) 'bar) -| foo => bar
When this manual indicates that the value returned by some expression is unspecified, it means that the expression will evaluate to some object without signalling an error, but that programs should not depend on the value in any way.
Each description of an MIT Scheme variable, special form, or procedure begins with one or more header lines in this format:
@deffnexample category template
where category specifies the kind of item ("variable", "special form", or "procedure"), and how the item conforms to standard Scheme, as follows:
The form of template is interpreted depending on category.
else
keyword in
the cond
special form, are set in a fixed width font in the
printed manual; in the Info file they are not distinguished.
Parentheses indicate themselves.
A horizontal ellipsis (...) is describes repeated components.
Specifically,
thing ...indicates zero or more occurrences of thing, while
thing thing ...indicates one or more occurrences of thing. Brackets,
[ ]
, enclose optional components.
Several special forms (e.g. lambda
) have an internal component
consisting of a series of expressions; usually these expressions are
evaluated sequentially under conditions that are specified in the
description of the special form. This sequence of expressions is commonly
referred to as the body of the special form.
cdr
takes one argument,
which must be a pair.
Many procedures signal an error when an argument is of the wrong type;
usually this error is a condition of type
condition-type:wrong-type-argument
.
In addition to the standard data-type names (pair, list,
boolean, string, etc.), the following names as arguments
also imply type restrictions:
Some examples:
@deffnexample procedure list object ...
indicates that the standard Scheme procedure list
takes zero or
more arguments, each of which may be any Scheme object.
@deffnexample procedure write-char char [output-port]
indicates that the standard Scheme procedure write-char
must be
called with a character, char, and may also be called with a
character and an output port.
Any identifier that is not a syntactic keyword may be used as a variable (see section Identifiers). A variable may name a location where a value can be stored. A variable that does so is said to be bound to the location. The value stored in the location to which a variable is bound is called the variable's value. (The variable is sometimes said to name the value or to be bound to the value.)
A variable may be bound but still not have a value; such a variable is
said to be unassigned. Referencing an unassigned variable is an
error. When this error is signalled, it is a condition of type
condition-type:unassigned-variable
; sometimes the compiler does
not generate code to signal the error. Unassigned variables are useful
only in combination with side effects (see section Assignments).
An environment is a set of variable bindings. If an environment
has no binding for a variable, that variable is said to be unbound
in that environment. Referencing an unbound variable signals a
condition of type condition-type:unbound-variable
.
A new environment can be created by extending an existing
environment with a set of new bindings. Note that "extending an
environment" does not modify the environment; rather, it
creates a new environment that contains the new bindings and the old
ones. The new bindings shadow the old ones; that is, if an
environment that contains a binding for x
is extended with a new
binding for x
, then only the new binding is seen when x
is
looked up in the extended environment. Sometimes we say that the
original environment is the parent of the new one, or that the new
environment is a child of the old one, or that the new environment
inherits the bindings in the old one.
Procedure calls extend an environment, as do let
, let*
,
letrec
, and do
expressions. Internal definitions
(see section Internal Definitions) also extend an environment. (Actually,
all the constructs that extend environments can be expressed in terms of
procedure calls, so there is really just one fundamental mechanism for
environment extension.)
A top-level definition (see section Top-Level Definitions) may add a binding to an existing environment.
MIT Scheme provides an initial environment that contains all of the variable bindings described in this manual. Most environments are ultimately extensions of this initial environment. In Scheme, the environment in which your programs execute is actually a child (extension) of the environment containing the system's bindings. Thus, system names are visible to your programs, but your names do not interfere with system programs.
The environment in effect at some point in a program is called the
current environment at that point. In particular, every REP
loop has a current environment. (REP stands for
"read-eval-print"; the REP loop is the Scheme program that reads
your input, evaluates it, and prints the result.) The environment of
the top-level REP loop (the one you are in when Scheme starts up)
starts as user-initial-environment
, although it can be changed by
the ge
procedure. When a new REP loop is created, its
environment is determined by the program that creates it.
Scheme is a statically scoped language with block structure. In this respect, it is like Algol and Pascal, and unlike most other dialects of Lisp except for Common Lisp.
The fact that Scheme is statically scoped (rather than dynamically bound) means that the environment that is extended (and becomes current) when a procedure is called is the environment in which the procedure was created (i.e. in which the procedure's defining lambda expression was evaluated), not the environment in which the procedure is called. Because all the other Scheme binding expressions can be expressed in terms of procedures, this determines how all bindings behave.
Consider the following definitions, made at the top-level REP loop (in the initial environment):
(define x 1) (define (f x) (g 2)) (define (g y) (+ x y)) (f 5) => 3 ; not 7
Here f
and g
are bound to procedures created in the
initial environment. Because Scheme is statically scoped, the call to
g
from f
extends the initial environment (the one in which
g
was created) with a binding of y
to 2
. In this
extended environment, y
is 2
and x
is 1
.
(In a dynamically bound Lisp, the call to g
would extend the
environment in effect during the call to f
, in which x
is
bound to 5
by the call to f
, and the answer would be
7
.)
Note that with static scoping, you can tell what binding a variable
reference refers to just from looking at the text of the program; the
referenced binding cannot depend on how the program is used. That is,
the nesting of environments (their parent-child relationship)
corresponds to the nesting of binding expressions in program text.
(Because of this connection to the text of the program, static scoping
is also called lexical scoping.) For each place where a variable
is bound in a program there is a corresponding region of the
program text within which the binding is effective. For example, the
region of a binding established by a lambda
expression is the
entire body of the lambda
expression. The documentation of each
binding expression explains what the region of the bindings it makes is.
A use of a variable (that is, a reference to or assignment of a
variable) refers to the innermost binding of that variable whose region
contains the variable use. If there is no such region, the use refers
to the binding of the variable in the global environment (which is an
ancestor of all other environments, and can be thought of as a region in
which all your programs are contained).
In Scheme, the boolean values true and false are denoted by #t
and #f
. However, any Scheme value can be treated as a boolean
for the purpose of a conditional test. This manual uses the word
true to refer to any Scheme value that counts as true, and the
word false to refer to any Scheme value that counts as false. In
conditional tests, all values count as true except for #f
, which
counts as false (see section Conditionals).
Implementation note: In MIT Scheme, #f
and the empty list
are the same object, and the printed representation of #f
is
always `()'. As this contradicts the Scheme standard, MIT
Scheme will soon be changed to make #f
and the empty list
different objects.
An important concept in Scheme is that of the external representation of an object as a sequence of characters. For example, an external representation of the integer 28 is the sequence of characters `28', and an external representation of a list consisting of the integers 8 and 13 is the sequence of characters `(8 13)'.
The external representation of an object is not necessarily unique. The integer 28 also has representations `#e28.000' and `#x1c', and the list in the previous paragraph also has the representations `( 08 13 )' and `(8 . (13 . ( )))'.
Many objects have standard external representations, but some, such as procedures and circular data structures, do not have standard representations (although particular implementations may define representations for them).
An external representation may be written in a program to obtain the corresponding object (see section Quoting).
External representations can also be used for input and output. The
procedure read
parses external representations, and the procedure
write
generates them. Together, they provide an elegant and
powerful input/output facility.
Note that the sequence of characters `(+ 2 6)' is not an
external representation of the integer 8, even though it is an
expression that evaluates to the integer 8; rather, it is an external
representation of a three-element list, the elements of which are the
symbol +
and the integers 2
and 6
. Scheme's syntax
has the property that any sequence of characters that is an expression
is also the external representation of some object. This can lead to
confusion, since it may not be obvious out of context whether a given
sequence of characters is intended to denote data or program, but it is
also a source of power, since it facilitates writing programs such as
interpreters and compilers that treat programs as data or data as
programs.
Every object satisfies at most one of the following predicates (but see section True and False, for an exception):
bit-string? environment? pathname? string? boolean? null? port? symbol? cell? number? procedure? vector? char? pair? promise? weak-pair? condition?
This section describes a model that can be used to understand Scheme's use of storage.
Variables and objects such as pairs, vectors, and strings implicitly
denote locations or sequences of locations. A string, for example,
denotes as many locations as there are characters in the string. (These
locations need not correspond to a full machine word.) A new value may
be stored into one of these locations using the string-set!
procedure, but the string continues to denote the same locations as
before.
An object fetched from a location, by a variable reference or by a
procedure such as car
, vector-ref
, or string-ref
,
is equivalent in the sense of eqv?
to the object last stored in
the location before the fetch.
Every location is marked to show whether it is in use. No variable or object ever refers to a location that is not in use. Whenever this document speaks of storage being allocated for a variable or object, what is meant is that an appropriate number of locations are chosen from the set of locations that are not in use, and the chosen locations are marked to indicate that they are now in use before the variable or object is made to denote them.
In many systems it is desirable for constants (i.e. the values of
literal expressions) to reside in read-only memory. To express this, it
is convenient to imagine that every object that denotes locations is
associated with a flag telling whether that object is mutable or
immutable. The constants and the strings returned by
symbol->string
are then the immutable objects, while all objects
created by other procedures are mutable. It is an error to attempt to
store a new value into a location that is denoted by an immutable
object. Note that the MIT Scheme compiler takes advantage of this
property to share constants, but that these constants are not immutable.
Instead, two constants that are equal?
may be eq?
in
compiled code.
This section describes Scheme's lexical conventions.
Whitespace characters are spaces, newlines, tabs, and page breaks. Whitespace is used to improve the readability of your programs and to separate tokens from each other, when necessary. (A token is an indivisible lexical unit such as an identifier or number.) Whitespace is otherwise insignificant. Whitespace may occur between any two tokens, but not within a token. Whitespace may also occur inside a string, where it is significant.
All whitespace characters are delimiters. In addition, the following characters act as delimiters:
( ) ; " ' ` |
Finally, these next characters act as delimiters, despite the fact that Scheme does not define any special meaning for them:
[ ] { }
For example, if the value of the variable name
is
"max"
:
(list"Hi"name(+ 1 2)) => ("Hi" "max" 3)
An identifier is a sequence of one or more non-delimiter characters. Identifiers are used in several ways in Scheme programs:
Scheme accepts most of the identifiers that other programming languages allow. MIT Scheme allows all of the identifiers that standard Scheme does, plus many more.
MIT Scheme defines a potential identifier to be a sequence of non-delimiter characters that does not begin with either of the characters `#' or `,'. Any such sequence of characters that is not a syntactically valid number (see section Numbers) is considered to be a valid identifier. Note that, although it is legal for `#' and `,' to appear in an identifier (other than in the first character position), it is poor programming practice.
Here are some examples of identifiers:
lambda q list->vector soup + V17a <=? a34kTMNs the-word-recursion-has-many-meanings
Scheme doesn't distinguish uppercase and lowercase forms of a letter except within character and string constants; in other words, Scheme is case-insensitive. For example, `Foo' is the same identifier as `FOO', and `#x1AB' is the same number as `#X1ab'. But `#\a' and `#\A' are different characters.
A predicate is a procedure that always returns a boolean value
(#t
or #f
). By convention, predicates usually have names
that end in `?'.
A mutation procedure is a procedure that alters a data structure. By convention, mutation procedures usually have names that end in `!'.
The beginning of a comment is indicated with a semicolon (;
).
Scheme ignores everything on a line in which a semicolon appears, from
the semicolon until the end of the line. The entire comment, including
the newline character that terminates it, is treated as
whitespace.
An alternative form of comment (sometimes called an extended comment) begins with the characters `#|' and ends with the characters `|#'. This alternative form is an MIT Scheme extension. As with ordinary comments, all of the characters in an extended comment, including the leading `#|' and trailing `|#', are treated as whitespace. Comments of this form may extend over multiple lines, and additionally may be nested (unlike the comments of the programming language C, which have a similar syntax).
;;; This is a comment about the FACT procedure. Scheme ;;; ignores all of this comment. The FACT procedure computes ;;; the factorial of a non-negative integer. #| This is an extended comment. Such comments are useful for commenting out code fragments. |# (define fact (lambda (n) (if (= n 0) ;This is another comment: 1 ;Base case: return 1 (* n (fact (- n 1))))))
The following list describes additional notations used in Scheme. See section Numbers for a description of the notations used for numbers.
+ - .
( )
"
\
;
'
`
,
,@
#
#t #f
#\
#(
#e #i #b #o #d #x
#|
#!
#!optional
and
#!rest
, both of which are used in the lambda
special form
to mark certain parameters as being "optional" or "rest" parameters.
This notation is an MIT Scheme extension.
#*
A Scheme expression is a construct that returns a value. An expression may be a literal, a variable reference, a special form, or a procedure call.
Literal constants may be written by using an external representation of the data. In general, the external representation must be quoted (see section Quoting); but some external representations can be used without quotation.
"abc" => "abc" 145932 => 145932 #t => #t #\a => #\a
The external representation of numeric constants, string constants, character constants, and boolean constants evaluate to the constants themselves. Symbols, pairs, lists, and vectors require quoting.
An expression consisting of an identifier (see section Identifiers) is a variable reference; the identifier is the name of the variable being referenced. The value of the variable reference is the value stored in the location to which the variable is bound. An error is signalled if the referenced variable is unbound or unassigned.
(define x 28) x => 28
(keyword component ...)
A parenthesized expression that starts with a syntactic keyword is a special form. Each special form has its own syntax, which is described later in the manual. The following list contains all of the syntactic keywords that are defined when MIT Scheme is initialized:
access define-syntax macro and delay make-environment begin do named-lambda bkpt fluid-let or case if quasiquote cond in-package quote cons-stream lambda scode-quote declare let sequence default-object? let* set! define let-syntax the-environment define-integrable letrec unassigned? define-macro local-declare using-syntax define-structure
(operator operand ...)
A procedure call is written by simply enclosing in parentheses expressions for the procedure to be called (the operator) and the arguments to be passed to it (the operands). The operator and operand expressions are evaluated and the resulting procedure is passed the resulting arguments. See section Lambda Expressions, for a more complete description of this.
Another name for the procedure call expression is combination. This word is more specific in that it always refers to the expression; "procedure call" sometimes refers to the process of calling a procedure.
Unlike some other dialects of Lisp, Scheme always evaluates the operator expression and the operand expressions with the same evaluation rules, and the order of evaluation is unspecified.
(+ 3 4) => 7 ((if #f = *) 3 4) => 12
A number of procedures are available as the values of variables in the
initial environment; for example, the addition and multiplication
procedures in the above examples are the values of the variables
+
and *
. New procedures are created by evaluating
lambda
expressions.
If the operator is a syntactic keyword, then the expression is not
treated as a procedure call: it is a special form. Thus you should not
use syntactic keywords as procedure names. If you were to bind one of
these keywords to a procedure, you would have to use apply
to
call the procedure. MIT Scheme signals an error when such a
binding is attempted.