graphein: TEI Usage

1. TEI Usage

collections of texts (like web home pages) vs individual texts which have contents-and-chapters

 
"abstract" 

In earlier versions of lemur.com and CircuitousRoot.com, 
I used a style that was a sort of list of links to sub-pages. 
Each item in the list was fairly large: 
a biggish icon/image on the left, 
a title (e.g., of the sub-document) to the upper right, 
and optionally some explanatory text below the title 
(which, if it got long enough, could actually wrap below the image, 
making it look a bit like the big initial letter of a paragraph in a 
medieval manuscript). 
I also used this style for individual images (as their main entry, 
linking to larger versions). 

The trouble is, this wasn't really a list. 
The "items" were too big, and it didn't fit the list semantics of 
either the TEI or HTML. 
It was also possible to imagine this construct appearing in 
non-list forms; e.g., in a table or two dimensional array. 

Moreover, in some cases I used such a "list" to go to chapters 
of a document, 
while in others a regular list of chapters (not formatted up in this way) 
seemed better. 

What I finally decided was that the semantics of what I had 
resembled those of an "abstract" - a sort of a brief summary 
of a particular topic. 
In my usage, such a summary would probably also include its title 
and an image. 

The critical thing, then, was to identify this as a unit. 
I selected the "ab" TEI tag, 
which is like a paragraph but has no paragraph semantics. 
(It's a sort of generic block-level element.) 

Once the unit has been so identified, 
it may be set off from the rest of 
the document in any way the designer of the formatting scripts desires: 
not at all, by spaces, with a border, etc. 
This is specified in the XSLT+CSS (in the CSS, really) for HTML, 
in the XSL+FO for PDF, etc., 
but not in the TEI encoding itself. 

Within such an "abstract" block, 
anything at all may be encoded using other TEI encoding. 
For example, I can include an image (anywhere, not just at the beginning) 
using the "graphic" TEI tag 
(probably without any "figure" tag around it). 
I can encode a title by highlighting some text appropriately. 
I can wrap the whole thing in a "ref" so that clicking anywhere links, 
or I can set up any individual part 
(or no part) as a link. 
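
A minimal sketch of such an "abstract" block follows. The type value "abstract", the link target, the rend value "bold", and the image path are illustrative assumptions rather than values taken from the actual encoding; only the element names (ab, ref, graphic, hi) come from the description above.

```xml
<!-- Sketch only: type="abstract", the target, the url, and rend="bold" are hypothetical. -->
<ab type="abstract">
  <ref target="SUBDIR/index.html">
    <graphic url="images-mastered-for-scaling/IMAGE" scale="0.2"/>
    <hi rend="bold">Title of the Sub-Document</hi>
    Optional explanatory text, which may wrap below the image if it is long enough.
  </ref>
</ab>
```

Here the whole block is wrapped in a single ref, so clicking anywhere follows the link. Moving the ref inside, around just the title or just the graphic, would make only that part a link.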

In my practice, when using this "abstract" to link to the index 
document of a subdirectory, 
I often use an image/icon in the abstract 
which is a symbolic link to the mastered "category" image for 
this subdirectory. 

cd images-mastered-for-scaling 
ln -s ../SUBDIR/IMAGE-sfN.png IMAGE-sfN.png 


LEGAL SECTION 

The idea is to have a regular order of topics 
in the "availability" section of the TEI Header, 
each in an ab: 
1. Identification of all PD material 
   ab type="legal-item-pd" 
2. Identification of non-PD but fairly or freely used material 
   ab type="legal-item-fairuse" 
   ab type="legal-item-withpermission" 
   ab type="legal-item-ccbysa" 
   ab type="legal-item-gfdl" 
   ab type="legal-item-gfdl-ccbysa" 
   ab type=[add these as I encounter them] 
3. Copyright 
   ab type="copyright" 
   [user-supplied text, such as:] DATE by OWNER 
4. License 
   ab type="legal-license-c" 
   ab type="legal-license-ccbysa" 
   ab type="legal-license-gfdl" 
   ab type="legal-license-gfdl-ccbysa" 
5. DMM/RK service/tm recognition 
   ab type="legal-tm" 
   CircuitousRoot & circuitousroot.com are service marks of 
   David M. MacMillan. 
6. Other TM recognition (spell out in text) 
   ab type="legal-tm" 
7. "Presented originally by" line 
   ab type="legal-presented-originally" 
   auto gen: "Presented originally by " 
   supply content: ref target... Circ Root ... 
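
As a rough sketch, an "availability" section following this order might be encoded as below. The bracketed text, DATE, OWNER, and TARGET are placeholders, and the choice of which ab elements are left empty (on the assumption that the XSL generates their text from the type) is a guess, not something these notes specify.

```xml
<!-- Sketch only: bracketed text, DATE, OWNER, and TARGET are placeholders. -->
<availability>
  <ab type="legal-item-pd">[identification of all PD material]</ab>
  <ab type="legal-item-fairuse">[identification of fairly used material]</ab>
  <ab type="copyright">DATE by OWNER</ab>
  <!-- Assumed empty: the license text would be generated from the type. -->
  <ab type="legal-license-ccbysa"/>
  <ab type="legal-tm">CircuitousRoot &amp; circuitousroot.com are service marks of
    David M. MacMillan.</ab>
  <ab type="legal-tm">[other trademark recognition, spelled out in text]</ab>
  <ab type="legal-presented-originally"><ref target="TARGET">[content]</ref></ab>
</availability>
```

Note that a literal ampersand in the text has to be written as &amp;amp; in the XML source.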


NOTES 

SYSTEM entity declarations must occur on a single line 

public-vs-private 
  can set attribute n='private' on only these elements: 
    ab (not yet done) 
    p (done) 
    item (done) 

  may add certain types of div later (chapter, section, broadside, 
  broadside-divider ?) 

  default build is "private" (= show everything) 
  on a private build, private ab/p/item are shown shaded 
  on a public build, they are omitted 
  to override the default build, do: make P=public 
  to do a full public build: 
    svn checkout a copy of the tree 
    make clean 
    (NOTE: must "make clean" or public make will only hit changed source) 
    make P=public 
    ftp directories by hand to public location 
    (NOTE: even private directories will build!) 
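
A small fragment showing how the private marking might appear in a source document (the sentences are placeholders; only n="private" on p and item comes from the notes above):

```xml
<!-- Fragment only; the sentences are placeholders. -->
<p>This paragraph appears in both private and public builds.</p>
<p n="private">This paragraph is shaded in a private build and omitted from a public one.</p>
<list>
  <item>An item that is always shown.</item>
  <item n="private">An item that only a private build shows.</item>
</list>
```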

div 
  An ordinary TEI "div" without any attributes works. 
  The XSL ignores it. 
div type="broadside" 
  The name "broadside" probably isn't good. 
  I'm using this simply for collecting together things other 
  than chapters, sections, and running text 
  - for example, sequences of links to pages. 
  Right now, the XSL just generates an HTML div of class="broadside" 
  and the CSS clears it both left and right. 
div type="broadside-divider" 
  This is intended to be a separator between "broadsides" and 
  other material. 
  The XSL generates an HTML div of class="broadside-divider". 
  The CSS clears it left and right and then centers whatever 
  is inside it. 
  I often just put an image (TEI: graphic) inside it. 
  In particular, if my TEI graphic specifies a scalable image, 
  the image scaling process can give me a nice sort of icon. 
  One convention here is to call this icon "broadside-divider-sfN.png" 
  in images-mastered-for-scaling, but that is just a convention 
  (not forced by the TEI, XSL, or CSS). 

div type="chapter" 
div type="section" 
  SHOULD EACH PAGE BE A CHAPTER? 
  NO - only when they "feel" like a new chapter; 
  otherwise it just adds another level of N. numbering to the headers 
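
The "broadside" and "broadside-divider" divs described above might be used together roughly as follows. The link targets, link text, and image path are hypothetical. The graphic is placed directly inside the divider div because that is how the notes describe it; whether a given TEI schema accepts a graphic there without a wrapping block is not something this sketch settles.

```xml
<!-- Sketch only: targets, link text, and the url path are hypothetical. -->
<div type="broadside">
  <!-- a sequence of links to pages, each in its own block -->
  <ab><ref target="PAGE-ONE.html">First linked page</ref></ab>
  <ab><ref target="PAGE-TWO.html">Second linked page</ref></ab>
</div>
<div type="broadside-divider">
  <graphic url="images-mastered-for-scaling/broadside-divider" scale="0.2"/>
</div>
```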


general page/document conventions 
the title and subtitle are from the teiHeader 
on the index.tei page, often it's nice to use a 
div type="broadside-divider" below that, 
with the link-topic image in it 
to make it quite clear where we've gone to 

code is defined in TEI P5 Chapter 27 "Documentation Elements" 
ab ("anonymous block") is defined in TEI P5 Chapter 14 
"Linking, Segmentation, and Alignment," 
section 14.3 "Blocks, Segments, and Anchors" 
ab 
  type="code"    block-level code example 
                 (for phrase-level code, use "code") 

code             for phrase-level code 
                 (for block-level code, use 'ab type="code"') 
  lang=awk 
       bash 
       make 
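
For illustration, the two forms side by side, reusing commands that appear elsewhere in these notes (the surrounding sentence is invented for the example):

```xml
<!-- Phrase-level code inside running text. -->
<p>To override the default build, run <code lang="make">make P=public</code>.</p>

<!-- Block-level code; line breaks inside the ab are part of the example. -->
<ab type="code">cd images-mastered-for-scaling
ln -s ../SUBDIR/IMAGE-sfN.png IMAGE-sfN.png</ab>
```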

note             for all notes - footnotes, margin notes / sidebars, biblio 
  place=foot     footnotes 
  place=margin   small notes to sidebars 
  place=block    a block element inline in the text flow 
                 (e.g., for annotations in a bibliography) 
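
A brief sketch of the three placements. All text, and AUTHOR, TITLE, and DATE, are placeholders.

```xml
<!-- Sketch only: all text, AUTHOR, TITLE, and DATE are placeholders. -->
<p>Running text with a footnote.<note place="foot">Text of the footnote.</note></p>
<p>Running text with a sidebar.<note place="margin">A small note for the sidebar.</note></p>
<bibl>AUTHOR. <title>TITLE</title>. DATE.
  <note place="block">An annotation on this entry, set as a block in the text flow.</note>
</bibl>
```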

distinct           for highlighting when there is semantic knowledge of 
                   the thing highlighted (vs hi for something visually 
                   distinct for reasons unknown) 
  type="filename" 
  type="variable"  a user-definable element of a programming language, 
                   including variables proper, function names, etc. 
  type="function"  synonym for "variable" 
  type="input"     user input, commands-as-typed 
  type="output"    program or system output 
  type="keyword"   a reserved keyword in a programming language 
  type="program"   a program or command 
                   when used as if it might be typed 
                   but not when it is simply the name of the program 
                   (this distinction isn't always clear) 

hi                 for highlighting when there is no semantic knowledge of the 
                   thing highlighted 
  rend=italic 
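
A short fragment contrasting the two elements. The sentences are invented for the example; the type and rend values are those listed above.

```xml
<!-- Fragment only; the sentences are invented for illustration. -->
<p>Describe the images in <distinct type="filename">about-the-images.tei</distinct>,
then type <distinct type="input">make P=public</distinct>.
The make variable <distinct type="variable">P</distinct> selects the build, and in awk
<distinct type="keyword">function</distinct> is a reserved word.</p>
<p>This phrase was <hi rend="italic">italic in the source</hi>, for reasons unknown.</p>
```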

figure 
  head      use for name 
  figDesc   use for alternative text description 
graphic 
  width     use only for non-scalable bitmaps 
  height    use only for non-scalable bitmaps 
  scale     TEI defines this as a probability between 0 and 1. Odd. 
            So circumvent by prefixing a decimal point to the 
            graphein scale factor: 0.1, 0.2, ... 0.4 
            if not present, the image is not scalable 
  url       omit suffix 
            omit SF if scalable 
            if scalable, XSL will add SF for each 
            order of search: PNG then JPG 
            if SVG exists, special link to it 
            (for now don't use by default as many viewers can't 
            do SVG; in future probably reverse this behavior and 
            default to SVG, keeping PNG/JPG as alternative) 
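
Put together, these conventions might produce markup like the following. IMAGE and BITMAP are placeholder names, the pixel dimensions are arbitrary, and scale="0.2" is assumed to stand for graphein scale factor 2.

```xml
<!-- Sketch only: IMAGE, BITMAP, and the dimensions are placeholders. -->

<!-- Scalable image: url has no suffix and no scale-factor part. -->
<figure>
  <head>Name of the Figure</head>
  <figDesc>Alternative text description of the image.</figDesc>
  <graphic url="IMAGE" scale="0.2"/>
</figure>

<!-- Non-scalable bitmap: no scale attribute; width and height given instead. -->
<graphic url="BITMAP" width="400px" height="300px"/>
```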

chemical formulae 
  use "formula" tag to mark off entire formula 
  rather than reference another formula markup DTD, or a non-XML system, 
  just use Unicode to encode the formula. 
  The Unicode superscript/subscript range is: 
    superscript: 2070, 00B9, 00B2, 00B3, 2074-2079 
    superscript parens: 207D, 207E 
    superscript +/-: 207A, 207B 

    subscript: 2080-2089 
    subscript parens: 208D, 208E 
    subscript +/-: 208A, 208B 

    middle dot: 00B7 
  (note that superscript 1, 2, and 3 and the middle dot come from the 
  C1/Latin-1 range, not the 2070 range) 
  remember: in vi, entering 4-digit Unicode is ctrl-v u XXXX 
  At this time (2007), Chemical Markup Language seems ill-defined 
  and poorly supported, so I'll just go with this simpler solution. 
  I may regret this later. 
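
A few formulae encoded this way, as an illustration (the choice of compounds is arbitrary):

```xml
<!-- Subscripts U+2082 and U+2084 -->
<formula>H₂SO₄</formula>
<!-- Middle dot U+00B7 joining a water of hydration -->
<formula>CuSO₄·5H₂O</formula>
<!-- Superscript two U+00B2 followed by superscript minus U+207B -->
<formula>SO₄²⁻</formula>
```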


in each directory, if any of the documents in it have images 
(other than the standard linking icons) 
or in the index document if any of the standard 
linking icons are new in the directory, 
have a single TEI-encoded document called "about-the-images.tei" 
which describes them. 

In each document which has such images, put the following 
div in the back matter: 

div 
  ab type="about-the-images-link" 
/div 

 
Use Unicode characters, not HTML entities. 
Encode them directly as Unicode (UTF-8), not as numeric character references. 


For trademark, registered trademark, and copyright symbols, 
use Unicode. 
Copyright symbol: U+00A9 (U+0080 - U+00FF: C1 Controls and Latin-1 Supplement) 
Registered Trademark Symbol: U+00AE (U+0080 - U+00FF: C1 Controls and Latin-1 Supplement) 
TM symbol: U+2122 (U+2100 - U+214F: Letterlike Symbols)
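
For example, a sketch with the symbols entered directly as UTF-8 characters. "Example", DATE, and OWNER are placeholders, and the ab types are borrowed from the legal section.

```xml
<!-- Preferred: the characters themselves, stored as UTF-8. -->
<ab type="copyright">© DATE by OWNER</ab>
<ab type="legal-tm">Example™ and Example® are marks of OWNER.</ab>
<!-- Avoided: &copy; (an HTML entity) and &#xA9; (a numeric reference). -->
```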