Token Separator (programming languages)
Topic: Programming Language Design
Which line-ending or other token-separating characters do you allow in your programming language implementation?
These days, even Windows is cool with \n as a line-ending sequence. So that’s a pretty safe bet. (Whether you accept others is up to you!) But what about spaces versus tabs and far more exotic things like vertical tabs, etc.?
Just ran across this great question-and-answer on Reddit:
And I really like this answer:
ummwut: I kinda cheat and just identify whitespace characters as all ASCII values less than 33. (Unicode is a project for later.)…
followed by:
kenorep: It’s allowed: "If the delimiter is the space character, hex 20 (BL), control characters may be treated as delimiters"
Followed by a link to the Forth standard, subsection 3.4.1.1 (forth-standard.org)
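
To make the "everything below 33" trick concrete, here is a minimal sketch in Python. The names is_delimiter and split_tokens are made up for illustration and do not come from the Reddit thread or the Forth standard. It treats the space character (code 32) and every ASCII control character below it as a delimiter, and makes no attempt at Unicode whitespace.

```python
def is_delimiter(ch: str) -> bool:
    """Treat space (code 32) and every control character below it as whitespace."""
    return ord(ch) < 33


def split_tokens(source: str) -> list[str]:
    """Split source text into tokens separated by runs of delimiter characters."""
    tokens = []
    current = []
    for ch in source:
        if is_delimiter(ch):
            # A delimiter ends the token in progress, if there is one.
            if current:
                tokens.append("".join(current))
                current = []
        else:
            current.append(ch)
    # Flush the final token when the input does not end with a delimiter.
    if current:
        tokens.append("".join(current))
    return tokens
```

With this approach tabs, carriage returns, line feeds, and vertical tabs all behave exactly like a space, so the tokenizer never has to care which line-ending convention produced the file.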
The Zig programming language, while still in its pre-1.0 infancy, controversially supports only \n for line endings and spaces instead of tabs for indenting, and makes any other such characters a compiler error!
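
The opposite, strict policy is just as easy to sketch. The following Python is only an illustration of the idea described above, not Zig's actual implementation; the function name check_strict_whitespace and the error message format are invented here. It accepts \n and the space character and reports every other ASCII control character as an error.

```python
def check_strict_whitespace(source: str) -> list[tuple[int, int, str]]:
    """Return (line, column, message) for every disallowed control character.

    Assumed policy for this sketch: \\n is the only line ending, space is the
    only other whitespace, and all remaining ASCII control characters are errors.
    """
    errors = []
    line, col = 1, 1
    for ch in source:
        if ch == "\n":
            # The one permitted line ending: advance to the next line.
            line += 1
            col = 1
            continue
        if ord(ch) < 32 or ord(ch) == 127:
            # Tabs, carriage returns, vertical tabs, and so on are rejected.
            errors.append((line, col, f"disallowed character {ch!r}"))
        col += 1
    return errors
```

Running it over "a = 1\n\tb = 2\r\n" reports two problems, both on line 2: the tab in column 1 and the carriage return in column 7. A file that uses only spaces and \n yields an empty list.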