
File-System Interface

The Scheme standard provides a simple mechanism for reading and writing files: file ports. MIT Scheme provides additional tools for dealing with other aspects of the file system: pathnames, the working directory, file manipulation, and directory reading.

Pathnames

MIT Scheme programs need to use names to designate files. The main difficulty in dealing with names of files is that different file systems have different naming formats for files. For example, here is a table of several file systems (actually, operating systems that provide file systems) and what equivalent file names might look like for each one:

System          File Name
------          ---------
TOPS-20         <LISPIO>FORMAT.FASL.13
TOPS-10         FORMAT.FAS[1,4]
ITS             LISPIO;FORMAT FASL
MULTICS         >udd>LispIO>format.fasl
TENEX           <LISPIO>FORMAT.FASL;13
VAX/VMS         [LISPIO]FORMAT.FAS;13
UNIX            /usr/lispio/format.fasl
MS-DOS          C:\USR\LISPIO\FORMAT.FAS

It would be impossible for each program that deals with file names to know about each different file name format that exists; a new operating system to which Scheme was ported might use a format different from any of its predecessors. Therefore, MIT Scheme provides two ways to represent file names: filenames (also called namestrings), which are strings in the implementation-dependent form customary for the file system, and pathnames, which are special abstract data objects that represent file names in an implementation-independent way. Procedures are provided to convert between these two representations, and all manipulations of files can be expressed in machine-independent terms by using pathnames.

In order to allow MIT Scheme programs to operate in a network environment that may have more than one kind of file system, the pathname facility allows a file name to specify which file system is to be used. In this context, each file system is called a host, in keeping with the usual networking terminology.

Note that the examples given in this section are specific to unix pathnames. Pathnames for other operating systems have different external representations.

Filenames and Pathnames

Pathname objects are usually created by parsing filenames (character strings) into component parts. MIT Scheme provides operations that convert filenames into pathnames and vice versa.

procedure+: ->pathname object
Returns a pathname that is the equivalent of object. Object must be a pathname or a string. If object is a pathname, it is returned. If object is a string, this procedure returns the pathname that corresponds to the string; in this case it is equivalent to (parse-namestring object #f #f).

(->pathname "foo")              =>  #[pathname 65 "foo"]
(->pathname "/usr/morris")      =>  #[pathname 66 "/usr/morris"]

procedure+: parse-namestring thing [host [defaults]]
This turns thing into a pathname. Thing must be a pathname or a string. If thing is a pathname, it is returned. If thing is a string, this procedure returns the pathname that corresponds to the string, parsed according to the syntax of the file system specified by host.

This procedure does not do defaulting of pathname components.

The optional arguments are used to determine what syntax should be used for parsing the string. In general this is only really useful if your implementation of MIT Scheme supports more than one file system; otherwise you would use ->pathname. If given, host must be a host object or #f, and defaults must be a pathname. Host specifies the syntax used to parse the string. If host is not given or #f, the host component from defaults is used instead; if defaults is not given, the host component from *default-pathname-defaults* is used.
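As a brief illustration of the description above, the following sketch parses the same string with and without the optional host argument. It assumes a unix host; the file names are hypothetical, and the host object is simply taken from another pathname.

```scheme
;; A minimal sketch, assuming a unix host; the file names are made up.
(define p1 (parse-namestring "/usr/morris/foo.scm"))

;; Passing the host explicitly; here it is borrowed from another pathname.
(define p2
  (parse-namestring "/usr/morris/foo.scm"
                    (pathname-host (->pathname "/usr/morris/"))))

;; Both calls parse the string with the same syntax.
(pathname=? p1 p2)               ; =>  #t

;; A pathname argument is returned as is.
(eq? p1 (parse-namestring p1))   ; =>  #t
```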

procedure+: ->namestring pathname
->namestring returns a newly allocated string that is the filename corresponding to pathname.

(->namestring (->pathname "/usr/morris/minor.van"))
     =>  "/usr/morris/minor.van"

procedure+: pathname-simplify pathname
Returns a pathname that locates the same file or directory as pathname, but is in some sense simpler. Note that pathname-simplify might not always be able to simplify the pathname, e.g. on unix with symbolic links the directory `/usr/morris/../' need not be the same as `/usr/'. In cases of uncertainty the behavior is conservative, returning the original or a partly simplified pathname.

(pathname-simplify "/usr/morris/../morris/dance")
     =>  #[pathname "/usr/morris/dance"]

Components of Pathnames

A pathname object always has six components, described below. These components are the common interface that allows programs to work the same way with different file systems; the mapping of the pathname components into the concepts peculiar to each file system is taken care of by the Scheme implementation.

host
The name of the file system on which the file resides.
device
Corresponds to the "device" or "file structure" concept in many host file systems: the name of a (logical or physical) device containing files.
directory
Corresponds to the "directory" concept in many host file systems: the name of a group of related files (typically those belonging to a single user or project).
name
The name of a group of files that can be thought of as conceptually the "same" file.
type
Corresponds to the "filetype" or "extension" concept in many host file systems. This says what kind of file this is. Files with the same name but different type are usually related in some specific way, such as one being a source file, another the compiled form of that source, and a third the listing of error messages from the compiler.
version
Corresponds to the "version number" concept in many host file systems. Typically this is a number that is incremented every time the file is modified.

Note that a pathname is not necessarily the name of a specific file. Rather, it is a specification (possibly only a partial specification) of how to access a file. A pathname need not correspond to any file that actually exists, and more than one pathname can refer to the same file. For example, the pathname with a version of newest may refer to the same file as a pathname with the same components except a certain number as the version. Indeed, a pathname with version newest may refer to different files as time passes, because the meaning of such a pathname depends on the state of the file system. In file systems with such facilities as "links", multiple file names, logical devices, and so on, two pathnames that look quite different may turn out to address the same file. To access a file given a pathname, one must do a file-system operation such as open-input-file.

Two important operations involving pathnames are parsing and merging. Parsing is the conversion of a filename (which might be something supplied interactively by the user when asked to supply the name of a file) into a pathname object. This operation is implementation-dependent, because the format of filenames is implementation-dependent. Merging takes a pathname with missing components and supplies values for those components from a source of default values.

Not all of the components of a pathname need to be specified. If a component of a pathname is missing, its value is #f. Before the file system interface can do anything interesting with a file, such as opening the file, all the missing components of a pathname must be filled in. Pathnames with missing components are used internally for various purposes; in particular, parsing a namestring that does not specify certain components will result in a pathname with missing components.

Any component of a pathname may be the symbol unspecific, meaning that the component simply does not exist, for file systems in which such a value makes no sense. For example, unix and MS-DOS file systems usually do not support version numbers, so the version component for a unix or MS-DOS host might be unspecific.

Each component in a pathname is typically one of the following (with some exceptions that will be described below):

a string
This is a literal component. It is considered to be fully specified.
#f
This is a missing component. It is considered to be unspecified.
wild
This is a wildcard component. It is useful only when the pathname is being used with the directory reader, where it means that the pathname component matches anything.
unspecific
This is an unspecifiable component. It is treated the same as a missing component except that it is not considered to be missing for purposes of merging or defaulting components.
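The difference between literal and missing components can be seen by parsing filenames that leave something out. This is a minimal sketch, assuming a unix host; the file names are hypothetical.

```scheme
;; A minimal sketch, assuming a unix host; the file names are made up.
(define no-type (->pathname "/usr/morris/foo"))
(define no-directory (->pathname "foo.scm"))

(pathname-name no-type)             ; =>  "foo"  (a literal component)
(pathname-type no-type)             ; =>  #f     (no type was given)
(pathname-type no-directory)        ; =>  "scm"
(pathname-directory no-directory)   ; =>  #f     (no directory was given)
```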

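The host, directory, and version pathname components are exceptions to these rules in that they may never be strings, although the values #f, wild, and unspecific are allowed with their usual meanings. The other values allowed for these components can be seen in the examples in this section: the host component is a host object, the directory component is a list whose first element is the symbol absolute or relative and whose remaining elements name the directories, and the version component may be a number or the symbol newest.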

procedure+: make-pathname host device directory name type version
Returns a pathname object whose components are the respective arguments. Each argument must satisfy the restrictions for the corresponding component, which were outlined above.

(make-pathname #f #f '(absolute "usr" "morris") "foo" "scm" #f)
     =>  #[pathname 67 "/usr/morris/foo.scm"]

procedure+: pathname-host pathname
procedure+: pathname-device pathname
procedure+: pathname-directory pathname
procedure+: pathname-name pathname
procedure+: pathname-type pathname
procedure+: pathname-version pathname
Returns a particular component of pathname.

(define x (->pathname "/usr/morris/foo.scm"))
(pathname-host x)       =>  #[host 1]
(pathname-device x)     =>  unspecific
(pathname-directory x)  =>  (absolute "usr" "morris")
(pathname-name x)       =>  "foo"
(pathname-type x)       =>  "scm"
(pathname-version x)    =>  unspecific

procedure+: pathname-new-device pathname device
procedure+: pathname-new-directory pathname directory
procedure+: pathname-new-name pathname name
procedure+: pathname-new-type pathname type
procedure+: pathname-new-version pathname version
Returns a new copy of pathname with the respective component replaced by the second argument. Pathname is unchanged. Portable programs should not explicitly replace a component with unspecific because this might not be permitted in some situations.

(define p (->pathname "/usr/blisp/rel15"))
p
     =>  #[pathname 71 "/usr/blisp/rel15"]
(pathname-new-name p "rel100")
     =>  #[pathname 72 "/usr/blisp/rel100"]
(pathname-new-directory p '(relative "test" "morris"))
     =>  #[pathname 73 "test/morris/rel15"]
p
     =>  #[pathname 71 "/usr/blisp/rel15"]

procedure+: pathname-default-device pathname device
procedure+: pathname-default-directory pathname directory
procedure+: pathname-default-name pathname name
procedure+: pathname-default-type pathname type
procedure+: pathname-default-version pathname version
These operations are similar to the pathname-new-component operations, except that they only change the specified component if it has the value #f in pathname.
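The contrast between the defaulting and the replacing operations can be sketched as follows. The example assumes a unix host, and the file names and the "bin" type are hypothetical.

```scheme
;; A minimal sketch, assuming a unix host; the names are made up.
(define with-type (->pathname "/usr/morris/foo.scm"))
(define without-type (->pathname "/usr/morris/foo"))

;; The type is missing (#f), so the default is filled in.
(->namestring (pathname-default-type without-type "bin"))
     ; =>  "/usr/morris/foo.bin"

;; The type is already present, so the pathname is left alone.
(->namestring (pathname-default-type with-type "bin"))
     ; =>  "/usr/morris/foo.scm"

;; By contrast, pathname-new-type replaces the type unconditionally.
(->namestring (pathname-new-type with-type "bin"))
     ; =>  "/usr/morris/foo.bin"
```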

Operations on Pathnames

procedure+: pathname? object
Returns #t if object is a pathname; otherwise returns #f.

procedure+: pathname=? pathname1 pathname2
Returns #t if pathname1 is equivalent to pathname2; otherwise returns #f. Pathnames are equivalent if all of their components are equivalent, hence two pathnames that are equivalent must identify the same file or equivalent partial pathnames. However, the converse is not true: non-equivalent pathnames may specify the same file (e.g. via absolute and relative directory components), and pathnames that specify no file at all (e.g. name and directory components unspecified) may be equivalent.

procedure+: pathname-absolute? pathname
Returns #t if pathname is an absolute rather than relative pathname object; otherwise returns #f. All pathnames are either absolute or relative, so if this procedure returns #f, the argument is a relative pathname.

procedure+: pathname-wild? pathname
Returns #t if pathname contains any wildcard components; otherwise returns #f.
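The predicates above can be tried on a pair of pathnames that differ only in their directory component. This is a minimal sketch, assuming a unix host; the file names are hypothetical.

```scheme
;; A minimal sketch, assuming a unix host; the file names are made up.
(define abs-path (->pathname "/usr/morris/foo.scm"))
(define rel-path (->pathname "morris/foo.scm"))

(pathname? abs-path)                 ; =>  #t
(pathname? "/usr/morris/foo.scm")    ; =>  #f  (a string is not a pathname)

(pathname-absolute? abs-path)        ; =>  #t
(pathname-absolute? rel-path)        ; =>  #f

;; Equivalence is decided component by component.
(pathname=? abs-path (->pathname "/usr/morris/foo.scm"))   ; =>  #t
(pathname=? abs-path rel-path)                             ; =>  #f

;; A wildcard is introduced here by replacing the name with the symbol wild.
(pathname-wild? abs-path)                                  ; =>  #f
(pathname-wild? (pathname-new-name abs-path 'wild))        ; =>  #t
```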

procedure+: merge-pathnames pathname [defaults [default-version]]
Returns a pathname whose components are obtained by combining those of pathname and defaults. Defaults defaults to the value of *default-pathname-defaults* and default-version defaults to newest.

The pathnames are combined by components: if pathname has a non-missing component, that is the resulting component, otherwise the component from defaults is used. The default version can be #f to preserve the information that the component was missing from pathname. The directory component is handled specially: if both pathnames have directory components that are lists, and the directory component from pathname is relative (i.e. starts with relative), then the resulting directory component is formed by appending pathname's component to defaults's component. For example:

(define path1 (->pathname "scheme/foo.scm"))
(define path2 (->pathname "/usr/morris"))
path1
     =>  #[pathname 74 "scheme/foo.scm"]
path2
     =>  #[pathname 75 "/usr/morris"]
(merge-pathnames path1 path2)
     =>  #[pathname 76 "/usr/scheme/foo.scm"]
(merge-pathnames path2 path1)
     =>  #[pathname 77 "/usr/morris.scm"]

The merging rules for the version are more complex and depend on whether pathname specifies a name. If pathname does not specify a name, then the version, if not provided, will come from defaults. However, if pathname does specify a name then the version is not affected by defaults. The reason is that the version "belongs to" some other file name and is unlikely to have anything to do with the new one. Finally, if this process leaves the version missing, then default-version is used.

The net effect is that if the user supplies just a name, then the host, device, directory and type will come from defaults, but the version will come from default-version. If the user supplies nothing, or just a directory, the name, type and version will come over from defaults together.
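The two cases just described can be sketched as follows, assuming a unix host (where versions are not supported and so do not appear in the result). The variable name base and the file names are hypothetical.

```scheme
;; A minimal sketch, assuming a unix host; the names are made up.
(define base (->pathname "/usr/morris/bar.scm"))

;; The user supplies just a name: directory and type come from the defaults.
(->namestring (merge-pathnames (->pathname "foo") base))
     ; =>  "/usr/morris/foo.scm"

;; The user supplies just a directory: name and type come over together.
(->namestring (merge-pathnames (->pathname "/tmp/") base))
     ; =>  "/tmp/bar.scm"
```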

variable+: *default-pathname-defaults*
This is the default pathname-defaults pathname; if any pathname primitive that needs a set of defaults is not given one, it uses this one. set-working-directory-pathname! sets this variable to a new value, computed by merging the new working directory with the variable's old value.

procedure+: pathname-default pathname device directory name type version
This procedure defaults all of the components of pathname simultaneously. It could have been defined by:

(define (pathname-default pathname
                          device directory name type version)
  (make-pathname (pathname-host pathname)
                 (or (pathname-device pathname) device)
                 (or (pathname-directory pathname) directory)
                 (or (pathname-name pathname) name)
                 (or (pathname-type pathname) type)
                 (or (pathname-version pathname) version)))
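A possible use of pathname-default is to fill in several missing components in one call. The sketch below assumes a unix host; the variable name filled and the component values are hypothetical.

```scheme
;; A minimal sketch, assuming a unix host; the names are made up.
;; "foo" has a name but no directory and no type, so both are defaulted.
(define filled
  (pathname-default (->pathname "foo")
                    #f                          ; device
                    '(absolute "usr" "morris")  ; directory
                    "ignored"                   ; name (already present)
                    "scm"                       ; type
                    #f))                        ; version

(pathname-name filled)    ; =>  "foo"  (the existing name wins)
(->namestring filled)     ; =>  "/usr/morris/foo.scm"
```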

procedure+: file-namestring pathname
procedure+: directory-namestring pathname
procedure+: host-namestring pathname
procedure+: enough-namestring pathname [defaults]
These procedures return a string corresponding to a subset of the pathname information. file-namestring returns a string representing just the name, type and version components of pathname; the result of directory-namestring represents just the host, device, and directory components; and host-namestring returns a string for just the host portion.

enough-namestring takes another argument, defaults. It returns an abbreviated namestring that is just sufficient to identify the file named by pathname when considered relative to the defaults (which defaults to *default-pathname-defaults*).

(file-namestring "/usr/morris/minor.van")
     =>  "minor.van"
(directory-namestring "/usr/morris/minor.van")
     =>  "/usr/morris/"
(enough-namestring "/usr/morris/men")
     =>  "men"      ;perhaps

procedure+: file-pathname pathname
procedure+: directory-pathname pathname
procedure+: enough-pathname pathname [defaults]
These procedures return a pathname corresponding to a subset of the pathname information. file-pathname returns a pathname with just the name, type and version components of pathname. The result of directory-pathname is a pathname containing the host, device and directory components of pathname.

enough-pathname takes another argument, defaults. It returns an abbreviated pathname that is just sufficient to identify the file named by pathname when considered relative to the defaults (which defaults to *default-pathname-defaults*).

These procedures are similar to file-namestring, directory-namestring and enough-namestring, but they return pathnames instead of strings.
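The way file-pathname and directory-pathname split a pathname can be checked with the component accessors. This is a minimal sketch, assuming a unix host; the file name is hypothetical.

```scheme
;; A minimal sketch, assuming a unix host; the file name is made up.
(define p (->pathname "/usr/morris/minor.van"))

;; file-pathname keeps the name and type and drops the directory.
(pathname-name (file-pathname p))             ; =>  "minor"
(pathname-type (file-pathname p))             ; =>  "van"
(pathname-directory (file-pathname p))        ; =>  #f

;; directory-pathname keeps the directory and drops the name and type.
(pathname-directory (directory-pathname p))   ; =>  (absolute "usr" "morris")
(pathname-name (directory-pathname p))        ; =>  #f
```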

procedure+: directory-pathname-as-file pathname
Returns a pathname that is equivalent to pathname, but in which the directory component is represented as a file. The last directory is removed from the directory component and converted into name and type components. This is the inverse operation to pathname-as-directory.

(directory-pathname-as-file (->pathname "/usr/blisp/"))
     =>  #[pathname "/usr/blisp"]

procedure+: pathname-as-directory pathname
Returns a pathname that is equivalent to pathname, but in which any file components have been converted to a directory component. If pathname does not have name, type, or version components, it is returned without modification. Otherwise, these file components are converted into a string, and the string is added to the end of the list of directory components. This is the inverse operation to directory-pathname-as-file.

(pathname-as-directory (->pathname "/usr/blisp/rel5"))
     =>  #[pathname "/usr/blisp/rel5/"]

Miscellaneous Pathname Procedures

This section gives some standard operations on host objects, and some procedures that return some useful pathnames.

variable+: local-host
This variable has as its value the host object that describes the local host's file system.

procedure+: host? object
Returns #t if object is a pathname host; otherwise returns #f.

procedure+: host=? host1 host2
Returns #t if host1 and host2 denote the same pathname host; otherwise returns #f.
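The host operations can be tried on the host component of any pathname. This sketch assumes an installation with a single file system, so that every parsed pathname belongs to the local host; the variable name h and the file name are hypothetical.

```scheme
;; A minimal sketch, assuming a single local file system.
(define h (pathname-host (->pathname "/usr/morris/foo.scm")))

(host? h)                ; =>  #t
(host? "unix")           ; =>  #f  (a string is not a host object)
(host=? h local-host)    ; =>  #t  under the assumption above
```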

procedure+: init-file-pathname [host]
Returns a pathname for the user's initialization file on host. The host argument defaults to the value of local-host. If the initialization file does not exist, this procedure returns #f.

procedure+: user-homedir-pathname [host]
Returns a pathname for the user's "home directory" on host. The host argument defaults to the value of local-host. The concept of a "home directory" is itself somewhat implementation-dependent, but it should be the place where the user keeps personal files, such as initialization files and mail. For example, on unix this is the user's unix home directory, whereas on MS-DOS the home directory is determined from the HOME, USER and USERDIR environment variables.

procedure+: system-library-pathname pathname
Locates pathname in MIT Scheme's system library directory. An error of type condition-type:file-operation-error is signalled if pathname cannot be located on the library search path.

(system-library-pathname "compiler.com")
     => #[pathname 45 "/usr/local/lib/mit-scheme/compiler.com"]

procedure+: system-library-directory-pathname pathname
Locates the pathname of an MIT Scheme system library directory. An error of type condition-type:file-operation-error is signalled if pathname cannot be located on the library search path.

(system-library-directory-pathname "options")
     => #[pathname 44 "/usr/local/lib/mit-scheme/options/"]

Working Directory

When MIT Scheme is started, the current working directory (or simply, working directory) is initialized in an operating-system dependent manner; usually, it is the directory in which Scheme was invoked. The working directory can be determined from within Scheme by calling the pwd procedure, and changed by calling the cd procedure. Each REP loop has its own working directory, and inferior REP loops initialize their working directory from the value in effect in their superior at the time they are created.

procedure+: working-directory-pathname
procedure+: pwd
Returns the current working directory as a pathname that has no name, type, or version components, just host, device, and directory components. pwd is an alias for working-directory-pathname; the long name is intended for programs and the short name for interactive use.
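The shape of the returned pathname can be checked with the component accessors. This is a minimal sketch; the variable name here is hypothetical.

```scheme
;; A minimal sketch; the variable name is made up.
(define here (working-directory-pathname))

;; Only the host, device, and directory components are filled in.
(pathname-name here)       ; =>  #f
(pathname-type here)       ; =>  #f

;; pwd is the same procedure under a shorter name.
(pathname=? here (pwd))    ; =>  #t
```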

procedure+: set-working-directory-pathname! filename
procedure+: cd filename
Makes filename the current working directory and returns the new current working directory as a pathname. Filename is coerced to a pathname using pathname-as-directory. cd is an alias for set-working-directory-pathname!; the long name is intended for programs and the short name for interactive use.

Additionally, set-working-directory-pathname! modifies the value of *default-pathname-defaults* by merging the new working directory into it.

In the unix implementation, when this procedure is executed in the top-level REP loop, it changes the working directory of the running Scheme executable.

(set-working-directory-pathname! "/usr/morris/blisp")
     =>  #[pathname "/usr/morris/blisp/"]
(set-working-directory-pathname! "~")
     =>  #[pathname "/usr/morris/"]

This procedure signals an error if filename does not refer to an existing directory.

If filename describes a relative rather than absolute pathname, this procedure interprets it as relative to the current working directory, before changing the working directory.

(working-directory-pathname)
     =>  #[pathname "/usr/morris/"]
(set-working-directory-pathname! "foo")
     =>  #[pathname "/usr/morris/foo/"]

procedure+: with-working-directory-pathname filename thunk
This procedure temporarily rebinds the current working directory to filename, invokes thunk (a procedure of no arguments), then restores the previous working directory and returns the value yielded by thunk. Filename is coerced to a pathname using pathname-as-directory. In addition to binding the working directory, with-working-directory-pathname also binds the variable *default-pathname-defaults*, merging the old value of that variable with the new working directory pathname. Both bindings are performed in exactly the same way as fluid binding of a variable (see section Fluid Binding).
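The temporary nature of the binding can be sketched as follows. The example assumes a unix host, where the root directory "/" always exists; the procedure name working-directory-namestring-in is hypothetical.

```scheme
;; A minimal sketch, assuming a unix host; the helper name is made up.
(define (working-directory-namestring-in directory)
  ;; Runs the thunk with DIRECTORY as the working directory and
  ;; returns the working directory seen inside, as a string.
  (with-working-directory-pathname directory
    (lambda ()
      (->namestring (working-directory-pathname)))))

(define before (working-directory-pathname))

(working-directory-namestring-in "/")               ; =>  "/"

;; The previous working directory is back in effect afterwards.
(pathname=? before (working-directory-pathname))    ; =>  #t
```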

File Manipulation

This section describes procedures that manipulate files and directories. Any of these procedures can signal a number of errors for many reasons. The specifics of these errors are much too operating-system dependent to document here. However, if such an error is signalled by one of these procedures, it will be of type condition-type:file-operation-error.

procedure+: file-exists? filename
Returns #t if filename is an existing file or directory; otherwise returns #f. In operating systems that support symbolic links, if the file is a symbolic link, this procedure tests the existence of the file linked to, not the link itself.

procedure+: copy-file source-filename target-filename
Makes a copy of the file named by source-filename. The copy is performed by creating a new file called target-filename, and filling it with the same data as source-filename. If target-filename exists prior to this procedure's invocation, it is deleted before the new output file is created.

procedure+: rename-file source-filename target-filename
Changes the name of source-filename to be target-filename. In the unix implementation, this will not rename across file systems.

procedure+: delete-file filename
Deletes the file named filename.
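The following sketch exercises copy-file, file-exists?, and delete-file together. To avoid touching real files it obtains two scratch names from call-with-temporary-filename; the procedure name copy-then-delete-demo and the file contents are hypothetical.

```scheme
;; A minimal sketch; the procedure name and contents are made up.
(define (copy-then-delete-demo)
  (call-with-temporary-filename
   (lambda (source)
     (call-with-temporary-filename
      (lambda (target)
        ;; Create the source file.
        (call-with-output-file source
          (lambda (port) (write-string "data" port)))
        ;; Copy it, then check that the target now exists.
        (copy-file source target)
        (let ((copied? (file-exists? target)))
          ;; Delete the copy and check again.
          (delete-file target)
          (list copied? (file-exists? target))))))))

(copy-then-delete-demo)    ; =>  (#t #f)
```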

procedure+: ->truename filename
This procedure attempts to discover and return the "true name" of the file associated with filename within the file system. An error of type condition-type:file-operation-error is signalled if the appropriate file cannot be located within the file system.

procedure+: call-with-temporary-filename procedure
call-with-temporary-filename generates a temporary filename, and calls procedure with that filename as its sole argument. The filename is guaranteed not to refer to any existing file, and, barring unusual circumstances, it can be used to open an output file without error. When procedure returns, if the file referred to by the filename exists, it is deleted; then, the value yielded by procedure is returned. If procedure escapes from its continuation, and the file referred to by the filename exists, it is deleted.
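A typical use is to write some data to a scratch file and read it back, leaving the cleanup to call-with-temporary-filename. This is a minimal sketch; the procedure name write-and-read-back is hypothetical, and the text is assumed to be a single line.

```scheme
;; A minimal sketch; the procedure name is made up.
(define (write-and-read-back text)
  (call-with-temporary-filename
   (lambda (filename)
     ;; Write TEXT to the temporary file.
     (call-with-output-file filename
       (lambda (port) (write-string text port)))
     ;; Read the first line back; this becomes the result of the call.
     (call-with-input-file filename read-line))))

(write-and-read-back "hello")    ; =>  "hello"
```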

procedure+: file-directory? filename
Returns #t if the file named filename exists and is a directory. Otherwise returns #f. In operating systems that support symbolic links, if filename names a symbolic link, this examines the file linked to, not the link itself.

procedure+: file-symbolic-link? filename
In operating systems that support symbolic links, if the file named filename exists and is a symbolic link, this procedure returns the contents of the symbolic link as a newly allocated string. The returned value is the name of the file that the symbolic link points to and must be interpreted relative to the directory of filename. If filename either does not exist or is not a symbolic link, or if the operating system does not support symbolic links, this procedure returns #f.

procedure+: file-readable? filename
Returns #t if filename names a file that can be opened for input; i.e. a readable file. Otherwise returns #f.

procedure+: file-writable? filename
Returns #t if filename names a file that can be opened for output; i.e. a writable file. Otherwise returns #f.

procedure+: file-access filename mode
Mode must be an exact integer between 0 and 7 inclusive; it is a bitwise-encoded predicate selector with 1 meaning "executable", 2 meaning "writable", and 4 meaning "readable". file-access returns #t if filename exists and satisfies the predicates selected by mode. For example, if mode is 5, then filename must be both readable and executable. If filename doesn't exist, or if it does not satisfy the selected predicates, #f is returned.
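Because mode is a sum of bit values, naming the bits makes calls easier to read. This is a minimal sketch; the constant names and the procedure file-readable-and-writable? are hypothetical and not part of MIT Scheme.

```scheme
;; A minimal sketch; these names are made up for illustration.
(define access-execute 1)
(define access-write 2)
(define access-read 4)

(define (file-readable-and-writable? filename)
  ;; 4 + 2 = 6 selects both the "readable" and "writable" predicates.
  (file-access filename (+ access-read access-write)))

;; A file that does not exist satisfies no predicate.
(file-readable-and-writable? "/no/such/directory/no-such-file")   ; =>  #f
```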

procedure+: file-modes filename
If filename names an existing file, file-modes returns an exact non-negative integer encoding the file's permissions. The encoding of this integer is operating-system dependent, but typically it contains bits that indicate what users and processes are allowed to read, write, or execute the file. If filename does not name an existing file, #f is returned.

procedure+: set-file-modes! filename modes
Filename must name an existing file. Modes must be an exact non-negative integer that could have been returned by a call to file-modes. set-file-modes! modifies the file's permissions to be those encoded by modes.

procedure+: file-modification-time filename
Returns the modification time of filename as an exact integer. The result may be compared to other file times using ordinary integer arithmetic. If filename names a file that does not exist, file-modification-time returns #f.

In operating systems that support symbolic links, if filename names a symbolic link, file-modification-time returns the modification time of the file linked to. An alternate procedure, file-modification-time-direct, returns the modification time of the link itself; in all other respects it is identical to file-modification-time. For symmetry, file-modification-time-indirect is a synonym of file-modification-time.
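Since file times are exact integers, a make-style freshness test reduces to an integer comparison plus handling of the #f results. This is a minimal sketch; the procedure name file-newer? and its treatment of missing files are assumptions made for illustration.

```scheme
;; A minimal sketch; the procedure name and its policy are made up.
;; Returns #t if FILENAME-1 exists and FILENAME-2 is either missing
;; or has an earlier modification time.
(define (file-newer? filename-1 filename-2)
  (let ((t1 (file-modification-time filename-1))
        (t2 (file-modification-time filename-2)))
    (and t1
         (or (not t2)
             (> t1 t2)))))

;; A missing first file is never considered newer.
(file-newer? "/no/such/directory/a" "/no/such/directory/b")   ; =>  #f
```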

procedure+: file-access-time filename
Returns the access time of filename as an exact integer. The result may be compared to other file times using ordinary integer arithmetic. If filename names a file that does not exist, file-access-time returns #f.

Some operating systems don't implement access times; in those systems file-access-time returns an unspecified value.

In operating systems that support symbolic links, if filename names a symbolic link, file-access-time returns the access time of the file linked to. An alternate procedure, file-access-time-direct, returns the access time of the link itself; in all other respects it is identical to file-access-time. For symmetry, file-access-time-indirect is a synonym of file-access-time.

procedure+: set-file-times! filename access-time modification-time
Filename must name an existing file, while access-time and modification-time must be valid file times that might have been returned by file-access-time and file-modification-time, respectively. set-file-times! alters the access and modification times of the file specified by filename to the values given by access-time and modification-time, respectively. For convenience, either of the time arguments may be specified as #f; in this case the corresponding time is not changed. set-file-times! returns an unspecified value.

procedure+: file-attributes filename
This procedure determines if the file named filename exists, and returns information about it if so; if the file does not exist, it returns #f. The information returned is a vector of 10 items:

  1. The file type: #t if the file is a directory, a character string (the name linked to) if a symbolic link, or #f for all other types of file.
  2. The number of links to the file.
  3. The user id of the file's owner, an exact non-negative integer.
  4. The group id of the file's group, an exact non-negative integer.
  5. The last access time of the file, an exact non-negative integer.
  6. The last modification time of the file, an exact non-negative integer.
  7. The last change time of the file, an exact non-negative integer.
  8. The size of the file in bytes.
  9. The mode string of the file. This is a newly allocated string showing the file's mode bits.
  10. The inode number of the file, an exact non-negative integer.

In operating systems that support symbolic links, if filename names a symbolic link, file-attributes returns the attributes of the link itself. An alternate procedure, file-attributes-indirect, returns the attributes of the file linked to; in all other respects it is identical to file-attributes. For symmetry, file-attributes-direct is a synonym of file-attributes.
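Individual attributes can be extracted from the vector with vector-ref. The sketch below assumes that the ten numbered items above occupy vector indices 0 through 9 in order, so that the size (item 8) is at index 7; the procedure name file-size-in-bytes is hypothetical.

```scheme
;; A minimal sketch; the procedure name is made up, and the index
;; assumes items 1-10 above are stored at vector indices 0-9.
(define (file-size-in-bytes filename)
  (let ((attributes (file-attributes filename)))
    ;; file-attributes returns #f for a missing file; pass that through.
    (and attributes
         (vector-ref attributes 7))))

(file-size-in-bytes "/no/such/directory/no-such-file")   ; =>  #f
```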

Directory Reader

procedure+: directory-read directory [sort?]
Directory must be an object that can be converted into a pathname by ->pathname. The directory specified by directory is read, and the contents of the directory are returned as a newly allocated list of absolute pathnames. The result is sorted according to the usual sorting conventions for directories, unless sort? is specified as #f. If directory has name, type, or version components, the returned list contains only those pathnames whose name, type, and version components match those of directory; wild or #f as one of these components means "match anything".
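The matching rule can be used to list only the files of one type. The sketch below builds the pattern from components rather than from a wildcard string, so it does not depend on how a particular host parses wildcards; the procedure name scheme-files-in is hypothetical.

```scheme
;; A minimal sketch; the procedure name is made up.
;; Returns the pathnames in DIRECTORY whose type is "scm".
(define (scheme-files-in directory)
  (directory-read
   (pathname-new-type
    ;; Any name matches, because the name component is wild.
    (pathname-new-name (pathname-as-directory (->pathname directory))
                       'wild)
    "scm")))

;; For example, to list the Scheme files in the working directory:
(scheme-files-in (working-directory-pathname))
```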

