graphein: Character Encoding and Filename Suffixes

1. Character Encoding and Filename Suffixes

The underlying character mapping is Unicode/ISO10646, in an unspecified but sufficiently modern version. This mapping is encoded as UTF-8, which is backward compatible with 7-bit ASCII.
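
The ASCII compatibility claim can be checked directly. The following is a minimal Python sketch for illustration only; it is not part of graphein, and the helper name `encodes_identically` is an assumption.

```python
def encodes_identically(text: str) -> bool:
    """Return True if text has the same bytes in ASCII and in UTF-8."""
    try:
        return text.encode("ascii") == text.encode("utf-8")
    except UnicodeEncodeError:
        # The text contains a character outside 7-bit ASCII.
        return False


# Pure 7-bit ASCII text is byte-for-byte identical in both encodings.
ascii_ok = encodes_identically("Makefile")

# A Greek word, written with escapes so the code points are unambiguous:
# seven precomposed characters, each needing two bytes in UTF-8.
greek = "\u03b3\u03c1\u03ac\u03c6\u03b5\u03b9\u03bd"
greek_ok = encodes_identically(greek)
greek_utf8_length = len(greek.encode("utf-8"))
```

Here `ascii_ok` is true and `greek_ok` is false: any file that stays within 7-bit ASCII is already valid UTF-8, while text beyond that range has no ASCII form at all.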

Filename suffixes are not necessary in Linux systems, but they are handy both for automatic processing by the make utility and for human recognition.

In theory, there are three major categories of filename suffix:

A. Those indicating character encoding, e.g.:

B. Those indicating content meta-markup, e.g.:

C. Those indicating markup language, e.g.:

Since everything here is UTF-8 encoded Unicode/ISO10646 (including the backward-compatible 7-bit ASCII subset), suffixes that indicate character encoding are not useful (they'd all be something like ".utf8").

Since all meta-markup here is XML, and since it is most useful to distinguish between the various XML-defined markup languages (e.g., TEI, HTML), suffixes that indicate content meta-markup are not useful either.

Filename suffixes here will, therefore, indicate the content markup language, or the dominant content markup language if two are used (for example, via XML Namespaces). The suffixes used are:

1. For source files (delete these and lose work):

2. Several of the file types noted above may be either handwritten or defined in gMLP (and thus generated). If the latter, but only if the latter, they may be deleted at will. Be careful. The following types, also listed above, are often defined in gMLP:

3. For generated intermediate files (may be deleted at will):

Note, however, that VARKON .MBS files, HTML files, and PDF files imported from other sources may be original documents and would be lost if deleted.

4, 5. Exceptions:

4. By convention (sometimes a general convention, sometimes just my own), the following filenames do not take suffixes. These are source files; do not delete them:

5. Directory names do not take suffixes.
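
The scheme above (suffix decides the file's role, with exceptions for suffix-less source files and for directories) might be sketched as a small classifier. This is only an illustration under stated assumptions: the function name `classify` is hypothetical, and the suffix sets and the `Makefile` entry are placeholders, not the actual lists used here.

```python
from pathlib import PurePosixPath

# Placeholder sets: every entry below is an assumption made for
# illustration and does not reproduce the real suffix lists.
SOURCE_SUFFIXES = {".tei"}
GENERATED_SUFFIXES = {".html", ".pdf"}
SUFFIXLESS_SOURCES = {"Makefile"}


def classify(path: str, is_dir: bool = False) -> str:
    """Return 'directory', 'source', 'generated', or 'unknown'."""
    if is_dir:
        # Directory names do not take suffixes, so they are not inspected.
        return "directory"
    p = PurePosixPath(path)
    if p.suffix == "":
        # Suffix-less names count as source only if listed by convention.
        return "source" if p.name in SUFFIXLESS_SOURCES else "unknown"
    suffix = p.suffix.lower()
    if suffix in SOURCE_SUFFIXES:
        return "source"
    if suffix in GENERATED_SUFFIXES:
        return "generated"
    return "unknown"
```

A result of "generated" is not by itself a license to delete: the suffix cannot tell a generated HTML or PDF file from one imported from another source, nor a handwritten file from one defined in gMLP, so that distinction still has to be made by hand.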