A string is a mutable sequence of characters. In the current implementation of MIT Scheme, the elements of a string must all satisfy the predicate char-ascii?; if someone ports MIT Scheme to a non-ASCII operating system this requirement will change.
A string is written as a sequence of characters enclosed within double quotes (" "). To include a double quote inside a string, precede the double quote with a backslash \ (escape it), as in
"The word \"recursion\" has many meanings."
The printed representation of this string is
The word "recursion" has many meanings.
To include a backslash inside a string, precede it with another backslash; for example,
"Use #\\Control-q to quit."
The printed representation of this string is
Use #\Control-q to quit.
The effect of a backslash that doesn't precede a double quote or backslash is unspecified in standard Scheme, but MIT Scheme specifies the effect for three other characters: \t, \n, and \f. These escape sequences are respectively translated into the following characters: #\tab, #\newline, and #\page. Finally, a backslash followed by exactly three octal digits is translated into the character whose ASCII code is those digits.
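As a small illustration of the escapes described above, the following sketch checks how many characters a few literals contain. The \t case relies on the MIT Scheme behavior this section describes; the other two use only the standard double-quote and backslash escapes.

```scheme
;; Each escape sequence contributes exactly one character to the string.
(string-length "say \"hi\"")            ; => 8
(string-length "a\\b")                   ; => 3
(char=? (string-ref "a\\b" 1) #\\)       ; => #t, the element is one backslash
(char=? (string-ref "a\tb" 1) #\tab)     ; => #t, \t is read as #\tab
```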
If a string literal is continued from one line to another, the string will contain the newline character (#\newline) at the line break. Standard Scheme does not specify what appears in a string literal at a line break.
The length of a string is the number of characters that it contains. This number is an exact non-negative integer that is established when the string is created (but see section Variable-Length Strings). Each character in a string has an index, which is a number that indicates the character's position in the string. The index of the first (leftmost) character in a string is 0, and the index of the last character is one less than the length of the string. The valid indexes of a string are the exact non-negative integers less than the length of the string.
A number of the string procedures operate on substrings. A substring is a segment of a string, which is specified by two integers start and end satisfying these relationships:
0 <= start <= end <= (string-length string)
Start is the index of the first character in the substring, and end is one greater than the index of the last character in the substring. Thus if start and end are equal, they refer to an empty substring, and if start is zero and end is the length of string, they refer to all of string.
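The start/end convention above can be sketched with a small checker. The name valid-substring-range? is a hypothetical helper introduced here for illustration; it is not an MIT Scheme procedure.

```scheme
;; Hypothetical helper (not a built-in): tests the relationship
;; 0 <= start <= end <= (string-length s).
(define (valid-substring-range? s start end)
  (and (<= 0 start)
       (<= start end)
       (<= end (string-length s))))

(define word "uncommon")              ; length 8, valid indexes 0 through 7

(valid-substring-range? word 2 8)     ; => #t, designates "common"
(valid-substring-range? word 3 3)     ; => #t, designates an empty substring
(valid-substring-range? word 0 9)     ; => #f, end exceeds the length
(substring word 2 8)                  ; => "common"
(substring word 3 3)                  ; => ""
```

Because end is one greater than the index of the last character, the number of characters in a substring is simply end minus start.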
Some of the procedures that operate on strings ignore the difference between uppercase and lowercase. The versions that ignore case include `-ci' (for "case insensitive") in their names.
The fill character, if given, must satisfy the predicate char-ascii?.
(make-string 10 #\x) => "xxxxxxxxxx"
The arguments must be characters that satisfy the predicate char-ascii?.
(string #\a) => "a"
(string #\a #\b #\c) => "abc"
(string #\a #\space #\b #\space #\c) => "a b c"
(string) => ""
For compatibility with old code, char->string is a synonym for this procedure.
list->string returns a newly allocated string formed from the elements of char-list. This is equivalent to (apply string char-list). The inverse of this operation is string->list.
(list->string '(#\a #\b)) => "ab"
(string->list "Hello") => (#\H #\e #\l #\l #\o)
Note regarding variable-length strings: the maximum length of the result depends only on the length of string, not its maximum length. If you wish to copy a string and preserve its maximum length, do the following:
(define (string-copy-preserving-max-length string)
  (let ((length))
    (dynamic-wind
     (lambda ()
       (set! length (string-length string))
       (set-string-length! string (string-maximum-length string)))
     (lambda ()
       (string-copy string))
     (lambda ()
       (set-string-length! string length)))))
Returns #t if object is a string; otherwise returns #f.
(string? "Hi") => #t
(string? 'Hi) => #f
(string-length "") => 0
(string-length "The length") => 10
Returns #t if string has zero length; otherwise returns #f.
(string-null? "") => #t
(string-null? "Hi") => #f
(string-ref "Hello" 1) => #\e
(string-ref "Hello" 5) error--> 5 not in correct range
Char must satisfy the predicate char-ascii?.
(define str "Dog") => unspecified
(string-set! str 0 #\L) => unspecified
str => "Log"
(string-set! str 3 #\t) error--> 3 not in correct range
Returns #t if the two strings (substrings) are the same length and contain the same characters in the same (relative) positions; otherwise returns #f. string-ci=? and substring-ci=? don't distinguish uppercase and lowercase letters, but string=? and substring=? do.
(string=? "PIE" "PIE") => #t
(string=? "PIE" "pie") => #f
(string-ci=? "PIE" "pie") => #t
(substring=? "Alamo" 1 3 "cola" 2 4) => #t ; compares "la"
(string<? "cat" "dog") => #t
(string<? "cat" "DOG") => #f
(string-ci<? "cat" "DOG") => #t
(string>? "catkin" "cat") => #t ; shorter is lesser
string-compare distinguishes uppercase and lowercase letters; string-compare-ci does not.
(define (cheer) (display "Hooray!"))
(define (boo) (display "Boo-hiss!"))
(string-compare "a" "b" cheer (lambda() 'ignore) boo)
-| Hooray!
=> unspecified
string-hash returns an exact non-negative integer that can be used for storing the specified string in a hash table. Equal strings (in the sense of string=?) return equal (=) hash codes, and non-equal but similar strings are usually mapped to distinct hash codes.
string-hash-mod is like string-hash, except that it limits the result to a particular range based on the exact non-negative integer k. The following are equivalent:
(string-hash-mod string k)
(modulo (string-hash string) k)
These procedures return #t if the first word in the string (substring) is capitalized, and any subsequent words are either lower case or capitalized. Otherwise, they return #f. A word is defined as a non-null contiguous sequence of alphabetic characters, delimited by non-alphabetic characters or the limits of the string (substring). A word is capitalized if its first letter is upper case and all its remaining letters are lower case.
(map string-capitalized? '("" "A" "art" "Art" "ART")) => (#f #t #f #t #f)
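The definition of a capitalized word can be sketched directly in standard Scheme. The procedure word-capitalized? below is a hypothetical helper written for illustration only; it handles a single word and assumes its argument contains only alphabetic characters, so it is not a replacement for string-capitalized?.

```scheme
;; Hypothetical helper (not part of MIT Scheme): a word is capitalized if
;; its first letter is upper case and all remaining letters are lower case.
(define (word-capitalized? word)
  (let ((chars (string->list word)))
    (and (pair? chars)                         ; a word is non-null
         (char-upper-case? (car chars))        ; first letter upper case
         (let loop ((rest (cdr chars)))        ; remaining letters lower case
           (or (null? rest)
               (and (char-lower-case? (car rest))
                    (loop (cdr rest))))))))

(map word-capitalized? '("" "A" "art" "Art" "ART"))   ; => (#f #t #f #t #f)
```

On these single-word inputs the helper agrees with the string-capitalized? results shown above.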
These procedures return #t if all the letters in the string (substring) are of the correct case, otherwise they return #f. The string (substring) must contain at least one letter or the procedures return #f.
(map string-upper-case? '("" "A" "art" "Art" "ART")) => (#f #t #f #f #t)
string-capitalize returns a newly allocated copy of string in which the first alphabetic character is uppercase and the remaining alphabetic characters are lowercase. For example, "abcDEF" becomes "Abcdef". string-capitalize! is the destructive version of string-capitalize: it alters string and returns an unspecified value. substring-capitalize! destructively capitalizes the specified part of string.
string-downcase returns a newly allocated copy of string in which all uppercase letters are changed to lowercase. string-downcase! is the destructive version of string-downcase: it alters string and returns an unspecified value. substring-downcase! destructively changes the case of the specified part of string.
(define str "ABCDEFG") => unspecified
(substring-downcase! str 3 5) => unspecified
str => "ABCdeFG"
string-upcase returns a newly allocated copy of string in which all lowercase letters are changed to uppercase. string-upcase! is the destructive version of string-upcase: it alters string and returns an unspecified value. substring-upcase! destructively changes the case of the specified part of string.
With no arguments, string-append returns the empty string ("").
(string-append) => ""
(string-append "*" "ace" "*") => "*ace*"
(string-append "" "" "") => ""
(eq? str (string-append str)) => #f ; newly allocated
(substring "" 0 0) => ""
(substring "arduous" 2 5) => "duo"
(substring "arduous" 2 8) error--> 8 not in correct range
(define (string-copy s) (substring s 0 (string-length s)))
(define (string-head string end) (substring string 0 end))
(define (string-tail string start) (substring string start (string-length string)))
(string-tail "uncommon" 2) => "common"
The padding character defaults to #\space. If k is less than the length of string, the resulting string is a truncated form of string. string-pad-left adds padding characters or truncates from the beginning of the string (lowest indices), while string-pad-right does so at the end of the string (highest indices).
(string-pad-left "hello" 4) => "ello"
(string-pad-left "hello" 8) => "   hello"
(string-pad-left "hello" 8 #\*) => "***hello"
(string-pad-right "hello" 4) => "hell"
(string-pad-right "hello" 8) => "hello   "
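The padding and truncation rule can be sketched with make-string, substring, and string-append. The name my-pad-left is hypothetical and the padding character is a required argument here; this is only an illustration of the behavior described above, not the way MIT Scheme implements string-pad-left.

```scheme
;; Hypothetical re-implementation for illustration only.
(define (my-pad-left s k char)
  (let ((len (string-length s)))
    (if (< k len)
        ;; Truncate from the beginning: keep the last k characters.
        (substring s (- len k) len)
        ;; Pad at the beginning with (- k len) copies of char.
        (string-append (make-string (- k len) char) s))))

(my-pad-left "hello" 4 #\space)   ; => "ello"
(my-pad-left "hello" 8 #\*)       ; => "***hello"
(my-pad-left "hello" 5 #\*)       ; => "hello"
```

Padding on the left is the usual way to right-align values such as numbers in a fixed-width column.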
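These procedures return a newly allocated string created by removing all characters that are not in char-set from: (string-trim) both ends of string; (string-trim-left) the beginning of string; or (string-trim-right) the end of string. Char-set defaults to char-set:not-whitespace.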
(string-trim " in the end ") => "in the end"
(string-trim " ") => ""
(string-trim "100th" char-set:numeric) => "100"
(string-trim-left "-.-+-=-" (char-set #\+)) => "+-=-"
(string-trim "but (+ x y) is" (char-set #\( #\))) => "(+ x y)"
Returns #f if string does not contain pattern.
(substring? "rat" "pirate") => 2
(substring? "rat" "outrage") => #f
(substring? "" any-string) => 0
(if (substring? "moon" text)
    (process-lunar text)
    'no-moon)
Returns #f if char does not appear in the string. For the substring procedures, the index returned is relative to the entire string, not just the substring. The -ci procedures don't distinguish uppercase and lowercase letters.
(string-find-next-char "Adam" #\A) => 0
(substring-find-next-char "Adam" 1 4 #\A) => #f
(substring-find-next-char-ci "Adam" 1 4 #\A) => 2
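The point that the returned index is relative to the entire string can be seen in a minimal sketch of a case-sensitive forward search. The name my-substring-find-next-char is hypothetical; this is an illustration in standard Scheme, not MIT Scheme's own definition.

```scheme
;; Hypothetical sketch: scan indexes start (inclusive) to end (exclusive)
;; and return the first index whose character equals char, or #f.
(define (my-substring-find-next-char s start end char)
  (let loop ((i start))
    (cond ((>= i end) #f)                          ; substring exhausted
          ((char=? (string-ref s i) char) i)       ; index into the whole string
          (else (loop (+ i 1))))))

(my-substring-find-next-char "Adam" 0 4 #\A)   ; => 0
(my-substring-find-next-char "Adam" 1 4 #\A)   ; => #f
(my-substring-find-next-char "Adam" 1 4 #\a)   ; => 2
```

Since the loop variable indexes the original string directly, no offset needs to be added to the result before passing it to string-ref.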
Returns #f if none of the characters in char-set occur in string. For the substring procedure, only the substring is searched, but the index returned is relative to the entire string, not just the substring.
(string-find-next-char-in-set my-string char-set:alphabetic)
=> start position of the first word in my-string
; Can be used as a predicate:
(if (string-find-next-char-in-set my-string (char-set #\( #\) ))
    'contains-parentheses
    'no-parentheses)
Returns #f if char doesn't appear in the string. For the substring procedures, the index returned is relative to the entire string, not just the substring. The -ci procedures don't distinguish uppercase and lowercase letters.
The -ci procedures don't distinguish uppercase and lowercase letters.
(string-match-forward "mirror" "micro") => 2 ; matches "mi"
(string-match-forward "a" "b") => 0 ; no match
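Judging from the examples above, string-match-forward reports how many leading characters two strings have in common. The following is a minimal sketch of that behavior under that assumption; my-string-match-forward is a hypothetical name, and the logic is inferred from the examples rather than taken from MIT Scheme's definition.

```scheme
;; Hypothetical sketch: count the characters that match, position by
;; position, starting from the left ends of both strings.
(define (my-string-match-forward s1 s2)
  (let ((limit (min (string-length s1) (string-length s2))))
    (let loop ((i 0))
      (if (and (< i limit)
               (char=? (string-ref s1 i) (string-ref s2 i)))
          (loop (+ i 1))
          i))))

(my-string-match-forward "mirror" "micro")   ; => 2
(my-string-match-forward "a" "b")            ; => 0
```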
The -ci procedures don't distinguish uppercase and lowercase letters.
(string-match-backward-ci "BULBOUS" "fractious") => 3 ; matches "ous"
Returns #t if the first string (substring) forms the prefix of the second; otherwise returns #f. The -ci procedures don't distinguish uppercase and lowercase letters.
(string-prefix? "abc" "abcdef") => #t
(string-prefix? "" any-string) => #t
Returns #t if the first string (substring) forms the suffix of the second; otherwise returns #f. The -ci procedures don't distinguish uppercase and lowercase letters.
(string-suffix? "ous" "bulbous") => #t
(string-suffix? "" any-string) => #t
string-replace and substring-replace return a newly allocated string containing the result. string-replace! and substring-replace! destructively modify string and return an unspecified value.
(define str "a few words") => unspecified
(string-replace str #\space #\-) => "a-few-words"
(substring-replace str 2 9 #\space #\-) => "a few-words"
str => "a few words"
(string-replace! str #\space #\-) => unspecified
str => "a-few-words"
(define s (make-string 10 #\space)) => unspecified
(substring-fill! s 2 8 #\*) => unspecified
s => "  ******  "
eqv?
):
substring-move-left!
substring-move-right!
The following example shows how these procedures can be used to build up a string (it would have been easier to use string-append):
(define answer (make-string 9 #\*)) => unspecified
answer => "*********"
(substring-move-left! "start" 0 5 answer 0) => unspecified
answer => "start****"
(substring-move-left! "-end" 0 4 answer 5) => unspecified
answer => "start-end"
MIT Scheme allows the length of a string to be dynamically adjusted in a limited way. This feature works as follows. When a new string is allocated, by whatever method, it has a specific length. At the time of allocation, it is also given a maximum length, which is guaranteed to be at least as large as the string's length. (Sometimes the maximum length will be slightly larger than the length, but it is a bad idea to count on this. Programs should assume that the maximum length is the same as the length at the time of the string's allocation.) After the string is allocated, the operation set-string-length! can be used to alter the string's length to any value between 0 and the string's maximum length, inclusive.
(<= (string-length string) (string-maximum-length string)) => #t
The maximum length of a string never changes.
set-string-length! does not change the maximum length of string.
MIT Scheme implements strings as packed vectors of 8-bit ASCII bytes. Most of the string operations, such as string-ref, coerce these 8-bit codes into character objects. However, some lower-level operations are made available for use.
(vector-8b-ref "abcde" 2) => 99 ; ascii for `c'
Returns #f if ascii does not appear. The index returned is relative to the entire string, not just the substring. Ascii must be a valid ASCII code. vector-8b-find-next-char-ci doesn't distinguish uppercase and lowercase letters.
Returns #f if ascii does not appear. The index returned is relative to the entire string, not just the substring. Ascii must be a valid ASCII code. vector-8b-find-previous-char-ci doesn't distinguish uppercase and lowercase letters.