The following character classes exist:
Character Class | Notation Used | Valid Characters |
upper_case | UC | all upper case letters |
underline | UL | _ |
lower_case | LC | all lower case letters |
digit | N | digits |
blank_space | BS | space, tab and nonprintable ASCII characters |
end_of_line | NL | carriage return and line feed |
atom_quote | AQ | ' |
string_quote | SQ | " |
list_quote | LQ | |
radix | RA | |
ascii | AS | |
solo | SL | ( ) ] } |
special | SP | ! , ; [ { | |
line_comment | CM | % |
escape | ESC | \ |
first_comment | CM1 | / |
second_comment | CM2 | * |
symbol | SY | # + - . : < = > ? @ ^ ` ~ $ & |
The character class of any character can be modified by the built-in predicate set_chtab/2. Tokens can be read with the predicate read_token/3.
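As an illustration only, the default table can be modelled as a plain lookup. The sketch below is written in Python rather than Prolog, and the names `DEFAULT_CLASSES`, `char_class` and `with_class` are hypothetical. `with_class` merely imitates the effect of reassigning a character's class; it is not the interface of set_chtab/2. Classes shown without default characters (list_quote, radix, ascii) are left out.

```python
import string

# Default character classes, transcribed from the table above.
DEFAULT_CLASSES = {}

def _assign(chars, cls):
    """Give every character in `chars` the class `cls`."""
    for ch in chars:
        DEFAULT_CLASSES[ch] = cls

_assign(string.ascii_uppercase, "upper_case")
_assign("_", "underline")
_assign(string.ascii_lowercase, "lower_case")
_assign(string.digits, "digit")
_assign(" \t", "blank_space")
_assign("\r\n", "end_of_line")
_assign("'", "atom_quote")
_assign('"', "string_quote")
_assign("()]}", "solo")
_assign("!,;[{|", "special")
_assign("%", "line_comment")
_assign("\\", "escape")
_assign("/", "first_comment")
_assign("*", "second_comment")
_assign("#+-.:<=>?@^`~$&", "symbol")

def char_class(ch, table=DEFAULT_CLASSES):
    """Return the class name of one character, or None if it is not covered."""
    if ch in table:
        return table[ch]
    if ord(ch) < 32 or ord(ch) == 127:
        # Remaining non-printable ASCII characters count as blank space.
        return "blank_space"
    return None  # non-ASCII input: outside the table (assumption of this sketch)

def with_class(table, ch, cls):
    """Return a copy of `table` in which `ch` has been moved to class `cls`."""
    changed = dict(table)
    changed[ch] = cls
    return changed
```

For example, `with_class(DEFAULT_CLASSES, "$", "lower_case")` yields a table in which `$` behaves like a lower case letter, while the default table is left untouched.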
Group Type | Notation | Valid Characters |
alphanumerical | ALP | UC UL LC N |
delimiter | DE | ) } ] , | |
any character | ANY | |
non escape or newline | NEN | any character except escape and newline |
sign | SGN | + - |
The valid tokens are described below:
ATOM = (LC ALP*) | (SY | CM1 | CM2 | ESC)+ | (AQ (NEN | ESC ANY)* AQ) | | | ; | [] | {} | !
INT = [SGN] BS* N+
INTBAS = N+ (AQ | RA) (N | LC | UC)+
The base must be an integer between 1 and 36 inclusive, and the value must be valid for this base.
ASCII = 0 (AQ | RA) ANY | AS ANY
The value of the integer is the ASCII code of the last character.
RAT = [SGN] BS* N+ UL N+
REAL = [SGN] BS* N+ . N+ [ (e | E) [SGN] N+ | Inf ] | [SGN] BS* N+ (e | E) [SGN] N+
Checks are performed to ensure that the numbers are in a valid range.
STRING = SQ (NEN | ESC ANY | SQ BS* SQ)* SQ
LIST = LQ (NEN | ESC ANY)* LQ
VAR = (UC | UL) ALP*
EOCL = . (BS | NL | <end of file>) | <end of file>
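To make the unquoted token rules concrete, here is a minimal Python sketch that transcribes some of them into regular expressions, assuming the default character classes. The names `classify` and `intbas_value` are hypothetical, BS is limited to space and tab, and only the atom quote is accepted as the base separator because no radix character is listed by default. Quoted atoms, strings, lists, ASCII tokens and EOCL are not covered.

```python
import re

SIGNED = r"[+-]?[ \t]*"   # [SGN] BS*
ALP = r"[A-Za-z0-9_]"     # UC UL LC N

# N+ AQ (N | LC | UC)+, with groups for the base and the digits.
INTBAS = re.compile(r"([0-9]+)'([0-9A-Za-z]+)")

TOKENS = [
    ("RAT", re.compile(SIGNED + r"[0-9]+_[0-9]+")),
    ("REAL", re.compile(
        SIGNED + r"[0-9]+(?:\.[0-9]+(?:[eE][+-]?[0-9]+|Inf)?|[eE][+-]?[0-9]+)")),
    ("INTBAS", INTBAS),
    ("INT", re.compile(SIGNED + r"[0-9]+")),
    ("VAR", re.compile(r"[A-Z_]" + ALP + r"*")),
    ("ATOM", re.compile(r"[a-z]" + ALP + r"*")),          # LC ALP*
    ("ATOM", re.compile(r"[#+\-.:<=>?@^`~$&/*\\]+")),     # (SY | CM1 | CM2 | ESC)+
]

SPECIAL_ATOMS = {"|", ";", "[]", "{}", "!"}

def classify(text):
    """Return the token name whose rule matches the whole of `text`, else None."""
    if text in SPECIAL_ATOMS:
        return "ATOM"
    for name, pattern in TOKENS:
        if pattern.fullmatch(text):
            return name
    return None

def intbas_value(text):
    """Compute the value of an INTBAS token, checking base and digits."""
    match = INTBAS.fullmatch(text)
    if match is None:
        raise ValueError("not an INTBAS token")
    base = int(match.group(1))
    if not 1 <= base <= 36:
        raise ValueError("base must be between 1 and 36")
    value = 0
    for ch in match.group(2):
        digit = int(ch, 36)          # 0-9 map to 0..9, letters to 10..35
        if digit >= base:
            raise ValueError("digit not valid for this base")
        value = value * base + digit
    return value
```

With these definitions, `classify("3_4")` returns `"RAT"`, `classify("=..")` returns `"ATOM"`, and `intbas_value("16'ff")` returns `255`. The range checks mentioned for REAL are not modelled.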
Within atoms and strings, the escape sequences (ESC ANY) are interpreted: if
ANY is one of the characters described in the table below, the sequence ESC
ANY generates a special character. Otherwise, the lexical analyser just ignores
the escape character (i.e. "\a" is the same as "a").
Escape Character | Result |
b | backspace |
f | form feed |
n | newline |
r | carriage return |
t | tabulation |
newline | ignored |
three octal digits | character whose ASCII code is the octal value (any 8-bit value) |
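The escape rules can be sketched as a small decoder. This is a Python illustration under stated assumptions, not the analyser's actual code: the function name `unescape` is hypothetical, the escape character is taken to be the default `\`, `\f` is taken to be the form feed character (code 12), octal values above 377 are masked to 8 bits, and a lone escape at the end of the input is dropped.

```python
# Characters that produce a special character after the escape.
SPECIAL = {"b": "\b", "f": "\f", "n": "\n", "r": "\r", "t": "\t"}

def unescape(body):
    """Interpret ESC ANY sequences in the body of a quoted atom or string."""
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch != "\\":
            out.append(ch)
            i += 1
            continue
        i += 1                       # skip the escape character itself
        if i >= len(body):
            break                    # dangling escape: dropped (assumption)
        octal = body[i:i + 3]
        if len(octal) == 3 and all(c in "01234567" for c in octal):
            # Three octal digits: the character with that code (8 bits kept).
            out.append(chr(int(octal, 8) & 0xFF))
            i += 3
        elif body[i] == "\n":
            i += 1                   # escaped newline is ignored
        else:
            # Known letters map to special characters; any other character
            # is kept and the escape is simply ignored.
            out.append(SPECIAL.get(body[i], body[i]))
            i += 1
    return "".join(out)
```

Here `unescape` applied to the three characters `\`, `a`, `b` yields `ab`, matching the rule that an escape before an ordinary character is ignored.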