Plaintext
What is "plaintext"? To me, in the year 2022, this means UTF-8 text files. As an English speaker, this usually means the ASCII subset of UTF-8. ASCII text files are probably the most universally compatible series of 1s and 0s you could hope to deal with on computing devices.
$ cat > foo.txt
hello
$ xxd foo.txt
00000000: 6865 6c6c 6f0a                           hello.
The only tricky part of this file is the invisible 0x0a "Line Feed" (LF, \n, "newline", etc.) character at the end. But as of 2018, even MS Notepad knows what to do with a single LF as opposed to the native Windows CRLF (0x0d0a).
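To make the LF versus CRLF difference concrete, here is a small Python sketch (Python is my choice for illustration, not something the examples above use) that encodes the same one-line file both ways and prints the bytes. The variable names are made up for this example.

```python
# Encode the same one-line file with Unix (LF) and Windows (CRLF) line endings.
unix_bytes = "hello\n".encode("ascii")
windows_bytes = "hello\r\n".encode("ascii")


def hex_dump(data: bytes) -> str:
    """Render bytes as space-separated two-digit hex values."""
    return " ".join(f"{b:02x}" for b in data)


print(hex_dump(unix_bytes))     # 68 65 6c 6c 6f 0a
print(hex_dump(windows_bytes))  # 68 65 6c 6c 6f 0d 0a

# str.splitlines() treats LF and CRLF alike, so both files hold the same line.
same_lines = (
    unix_bytes.decode("ascii").splitlines()
    == windows_bytes.decode("ascii").splitlines()
)
print(same_lines)  # True
```

The only difference between the two files is the extra 0x0d byte before the final 0x0a.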
There are no big-endian versus little-endian byte-ordering problems in ASCII because every ASCII character fits in 7 bits, and therefore in a single byte. And unlike UTF-16 and other mistakes, there are no endianness issues in UTF-8 because even for multi-byte characters, the individual units are still single bytes!
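As a rough illustration of the endianness point, this Python sketch encodes the pound sign with both UTF-16 byte orders and with UTF-8. It only uses the standard codec names `utf-16-le`, `utf-16-be`, and `utf-8`; the variable names are invented for the example.

```python
text = "£"  # U+00A3 POUND SIGN

# UTF-16 uses 16-bit units, so the two bytes of each unit can be stored
# in either order, and the reader has to know which one was used.
utf16_le = text.encode("utf-16-le")
utf16_be = text.encode("utf-16-be")
print(utf16_le.hex())  # a300
print(utf16_be.hex())  # 00a3

# UTF-8 uses 8-bit units, so there is only one possible byte sequence.
utf8 = text.encode("utf-8")
print(utf8.hex())  # c2a3

# Every ASCII byte has its high bit clear, i.e. it fits in 7 bits.
ascii_fits_in_7_bits = all(b < 0x80 for b in "hello\n".encode("ascii"))
print(ascii_fits_in_7_bits)  # True
```

The two UTF-16 results contain the same bytes in opposite order, which is exactly the ambiguity UTF-8 avoids.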
$ cat > foo.txt
£
$ xxd foo.txt
00000000: c2a3 0a                                  ...
(In the above, the c part of the byte c2 is 1100 in binary. Its leading bits, 110, tell us that the pound sign character will have one additional byte, which happens to be a3.)
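The lead-byte rule can be sketched in a few lines of Python. The function name `utf8_sequence_length` is hypothetical, and this is a minimal sketch rather than a validating decoder: it only inspects the leading bits and does not reject lead bytes that UTF-8 forbids for other reasons (such as 0xc0 or 0xf5).

```python
def utf8_sequence_length(lead_byte: int) -> int:
    """Return how many bytes a UTF-8 sequence has, judged by its first byte."""
    if lead_byte >> 7 == 0b0:      # 0xxxxxxx: plain ASCII, one byte
        return 1
    if lead_byte >> 5 == 0b110:    # 110xxxxx: one additional byte follows
        return 2
    if lead_byte >> 4 == 0b1110:   # 1110xxxx: two additional bytes follow
        return 3
    if lead_byte >> 3 == 0b11110:  # 11110xxx: three additional bytes follow
        return 4
    # 10xxxxxx is a continuation byte; anything else is not a lead byte either.
    raise ValueError(f"not a UTF-8 lead byte: {lead_byte:#04x}")


encoded = "£".encode("utf-8")
print(format(encoded[0], "08b"))         # 11000010
print(utf8_sequence_length(encoded[0]))  # 2
print(len(encoded))                      # 2
```

Passing the second byte, 0xa3 (10100011), raises `ValueError`, because its leading bits 10 mark it as a continuation byte rather than the start of a character.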
I agree with this manifesto: http://utf8everywhere.org/ (Which also happens to be a very helpful resource for understanding Unicode in general!)
See also text-editing and text-markup