
Model Editions Partnership

TEI/MEP Encoding Scheme

David Chesnutt

C. M. Sperberg-McQueen

Susan Hockey

Document MEP W04

Version 4.0

16 January 2002



This document describes some aspects of the Model Editions Partnership (MEP) encoding scheme and defines the SGML elements added by MEP to the standard SGML tag set defined by the Text Encoding Initiative (TEI). It includes formal declarations for the MEP/TEI encoding scheme, and is intended for use by XML experts and technically minded readers. For reference documentation for the MEP modifications to TEI, see the MEP tag-set documentation.
The first section discusses the overall organization of the MEP encoding scheme and the types of extensions and modifications the MEP scheme makes vis-à-vis the basic TEI encoding scheme. The next sections describe in detail various changes to the TEI Guidelines. The final sections include descriptions of fixes and modifications taken over from TEI Lite, and the selection of individual tags from the selected TEI tag sets.
This version of this document reflects work done by DRC and MSM in January, 2002, modifying the MEP DTD to be XML-compliant. Like earlier versions, it also reflects discussions between DRC and MSM in Oak Park, February 1996, and later work by MSM in March and April, 1996. It is not complete; some obvious lacunae are identified by comments in brackets [like this]; others equally obvious and painful are passed over in silence.
[It probably also needs some reorganization or movement of material from one section to another.]

1. Overall Organization of the MEP DTD

The MEP encoding scheme is an SGML document type definition (DTD) based on that defined by the TEI. It selects some of the base and additional tag sets defined by the TEI; within those tag sets, it suppresses some elements not used by MEP editions; in addition, it defines a number of specialized elements for historical editions not included in the TEI Guidelines. The MEP encoding scheme is a strictly conformant application of the TEI Guidelines, defined by this document and by the standard reference documentation, which takes the form of a TEI tag set document (TSD).
This document provides the formal definitions for MEP's extensions and modifications of the TEI DTD; the TEI Guidelines provide the formal definitions for the rest of the MEP DTD. Although this document provides some examples and descriptions of the intended usage of MEP extensions, it is intended not as end-user documentation but as a technical reference manual for the formal definition of the tag set. End-user documentation is being prepared separately.

1.1. Tag Sets Selected

MEP selects the following TEI tag sets. For descriptions of them, see the TEI Guidelines.
  • base tag set for prose
  • additional tag set for linking, segmentation, and alignment
  • additional tag set for simple analysis
  • additional tag set for transcription of manuscripts
  • additional tag set for tables, graphics, and figures
Note that the tag sets for certainty and responsibility, detailed tagging of names and dates, and text-critical apparatus are not selected in the current version of the MEP DTD. Users who need those tag sets should make their own subset of the TEI DTD.

1.2. Data Capture and Archival DTDs

The MEP encoding scheme defines several distinct DTDs: one for the archival form of MEP documents, and several (labeled level 1, level 2, and level 3) for data capture. The basic differences among them are that the data capture DTDs define a flatter document structure, with increasingly detailed and informative tagging as one progresses from level 1 to level 3, while the archival DTD is more explicit in grouping together things like the date line and salutation at the beginning or end of a letter, and matches TEI P3 more closely.
It is intended that documents be keyboarded or translated from legacy formats into a ‘level 0’ DTD, which may be different for each edition, and then translated mechanically (or with as little human intervention as possible) into the level 1 data capture DTD. In the level 1 DTD, all typographic blocks and character-level font styles are captured, so that the documents can be rendered on the screen or page in an appropriate way. In level 2, hyperlinks are added for notes and cross references. In level 3, additional markup is added for typographically indistinct, but intellectually important, phenomena like names, dates, etc. From the level 3 DTD the data can be translated automatically into the archival form. The translation from data capture form to archival form will be performed with simple parsers written with yacc and lex or similar tools.[1] The parser for the translation will be documented in a separate paper.

1.3. Invocation

Normally, MEP documents will be composed of more than one file. One file, the driver file, will contain the outermost layer of the document, including the SGML document type declaration and the TEI header for the document (normally a sample from one or more editions) as a whole. It will also contain the start- and end-tags for the <text> element, the front and back matter, and the start- and end-tags for the <docGroup> element. Within the <docGroup> element, it will contain a series of entity references to the individual files containing the individual documents of the collection.
Note (1997-02-06): when it is desirable to treat each historical document as a separate SGML document, the <doc>, <document>, or <surrogate> element may and should be used as the root of the free-standing document. The rest of this documentation needs to be revised accordingly.
The invocation of the MEP DTD can be done in either of two ways. The TEI Guidelines suggest an invocation something like this:

<!DOCTYPE TEI.2 PUBLIC "-//TEI P3//DTD Main Document Type 1994-05//EN" [
  <!ENTITY % TEI.extensions.ent
  PUBLIC     "-//MEP//DTD Model Editions Partnership TEI entities
             ver. &dtd.version;//EN"
             "mepext.ent" >

  <!ENTITY % TEI.extensions.dtd
  PUBLIC     "-//MEP//DTD Model Editions Partnership
             TEI dtd mods ver. &dtd.version;//EN"
             "mepext.dtd" >

<!ENTITY % TEI.prose    'INCLUDE' >
<!ENTITY % TEI.linking  'INCLUDE' >
<!ENTITY % TEI.analysis 'INCLUDE' >
<!ENTITY % TEI.transcr  'INCLUDE' >
<!ENTITY % TEI.figures  'INCLUDE' >

<!ENTITY % TEI.certainty   'IGNORE' >
<!ENTITY % TEI.names.dates 'IGNORE' >
<!ENTITY % TEI.textcrit    'IGNORE' >
]>

A second possibility (and simpler, for most users) is to refer, in the document type declaration, not to the basic TEI DTD, but to the MEP DTD driver file:

<!DOCTYPE TEI.2  PUBLIC "-//MEP//DTD Model Editions Partnership
                        TEI tag set ver. 1//EN" >

The driver file itself is very simple: it includes the entity declarations shown in the previous example, followed by an entity declaration and entity reference which cause the basic TEI DTD to be embedded. For the full text of the driver file, see section MEP Driver Files, below.
For example, here is a portion of a sample driver file for part of the Papers of Henry Laurens. First, we have the document type declaration proper, giving the public identifier of the MEP DTD driver file, and containing entity declarations for the writing system declaration and the various individual documents:

<!DOCTYPE TEI.2  PUBLIC "-//MEP//DTD Model Editions Partnership
                        TEI tag set ver. 1//EN"
                        "meptei.dtd" [
<!ENTITY english.wsd SYSTEM 'teien.wsd' SUBDOC>
<!-- ... -->

<!ENTITY hl553 PUBLIC "-//MEP//TEXT Henry Laurens 553//EN"
                      "c:\mep\samples\hl\hl553.sgm" >
<!ENTITY hl554 PUBLIC "-//MEP//TEXT Henry Laurens 554//EN"
                      "c:\mep\samples\hl\hl554.sgm" >
<!ENTITY hl555 PUBLIC "-//MEP//TEXT Henry Laurens 555//EN"
                      "c:\mep\samples\hl\hl555.sgm" >
<!-- ... -->
]>

Next, we have the start-tag for the main document element <tei.2>, followed by the TEI header for the sample as a whole:

<tei.2>
<teiHeader>
<fileDesc>
<titleStmt><title>
Test document for MEP corpus structure.
</title>
</titleStmt>
<publicationStmt><p>
An unpublished document.
</p></publicationStmt>
<sourceDesc><bibl>
<title>The Papers of Henry Laurens</title>, ed. George C. Rogers, Jr.,
David Chesnutt, et al.
(Columbia, S.C.:  University of South Carolina Press, 19nn- ).
</bibl></sourceDesc>
</fileDesc>
<profileDesc>
  <langUsage>
    <language id="eng" wsd="english.wsd" usage='100'></language>
  </langUsage>
</profileDesc>
</teiHeader>

Finally the <text> element appears, containing front matter and the document collection itself, tagged as a <docGroup>. The individual documents are referred to one after the other; each is tagged as a single <document> element, containing a TEI header, editorial front matter, the document transcription, and editorial back matter. Subgroups and series of related documents with their own front or back matter might be enclosed within separate files, or have their editorial matter included here; the convenience of the encoder is the deciding factor.

<text>
<front><head>Sample from the Papers of Henry Laurens</head>
</front>
<docGroup>
&hl553;
&hl554;
&hl555;
<!-- ... -->
</docGroup>
</text>
</tei.2>
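Because the entity declarations and references in a driver file of this shape are so regular, they can be generated rather than typed. The following is a minimal sketch in Python, not part of the MEP tooling; the function names and parameters are hypothetical, and the public-identifier and file-name patterns are simply copied from the Henry Laurens sample above. References are emitted with an explicit closing semicolon, which both SGML and XML accept.

```python
def entity_declarations(prefix, series, numbers, directory):
    """Return one external-entity declaration per document file.

    prefix, series and directory are illustrative parameters, modelled
    on the sample driver (hl, "Henry Laurens", c:\\mep\\samples\\hl).
    """
    decls = []
    for n in numbers:
        name = f"{prefix}{n}"
        decls.append(
            f'<!ENTITY {name} PUBLIC "-//MEP//TEXT {series} {n}//EN"\n'
            f'                      "{directory}\\{name}.sgm" >'
        )
    return "\n".join(decls)


def entity_references(prefix, numbers):
    """Return the references to be placed inside <docGroup>."""
    return "\n".join(f"&{prefix}{n};" for n in numbers)


numbers = [553, 554, 555]
print(entity_declarations("hl", "Henry Laurens", numbers, r"c:\mep\samples\hl"))
print(entity_references("hl", numbers))
```

The declarations belong in the internal subset of the document type declaration, and the references between the start- and end-tags of the <docGroup> element.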

For network distribution, it may be desirable to encode each document in a sample as a separate SGML document, and replace the normal driver file with a hypertext web or hub file. In this case, a simple driver can be provided for each document in the sample, and the entity references in the driver file can be replaced by hypertext links.
[Further discussion and examples are needed.]

1.4. MEP Extension Files

The MEP DTD is defined as a set of modifications of the TEI DTD, and is contained in two files, in the manner described in the TEI Guidelines. The first file, the entities file, makes certain modifications to the TEI element-class system, by redefining various SGML parameter entities. The second file, the element declarations file, includes declarations for all MEP extensions to the main TEI DTD. There are two sets of these files, one for the data-capture DTD and one for the archival DTD.
This section also defines the driver files provided to simplify the use of the MEP TEI tag set.

1.4.1. Entities Files

1.4.1.1. Archival Form
The entities file for the archival form of the DTD is called mep.ent. It has the following structure:
< 1 mep.ent: MEP extensions.ent file (archival form) [File mep.ent] > ≡
<!-- ******************************************************** -->
<!-- * mep.ent:  TEI.extensions.ent file for the            * -->
<!-- * Model Editions Partnership Archival-Form DTD         * -->
<!-- *                                                      * -->
<!-- * Version 3.0, 2002-01-16                              * -->
<!-- *                                                      * -->
<!-- * This file shows how MEP TEI is derived from          * -->
<!-- * the TEI main DTD.                                    * -->
<!-- *                                                      * -->
<!-- * Do NOT modify this DTD file directly!                * -->
<!-- * Modify document MEP W04 and regenerate the DTD!      * -->
<!-- ******************************************************** -->

<!-- Revisions:                                               -->
<!-- 2002-01-16 : CMSMcQ, DRC : version 3.0                   -->
<!--                       * make the DTD XML-compatible      -->
<!--                       * suppress inclusion exceptions    -->
<!--                       * modify all element declarations  -->
<!--                         to include m.Incl (to get the    -->
<!--                         inclusions back in)              -->
<!--                       * extend TEI refsys class with     -->
<!--                         plink to get it into m.Incl      -->
<!-- 1998-10-19 : CMSMcQ, DRC : version 2.0                   -->
<!-- 1996-10-10 : CMSMcQ : finish the tweaks: dateline,       -->
<!--                       closing                            -->
<!-- 1996-09-25 : CMSMcQ : tweaks:  idno, dateline, closing   -->
<!-- 1996-04-18 : CMSMcQ : copy into MEP W04 document         -->
<!-- 1996-02-20 : CMSMcQ : correct against more current lite  -->
<!-- 1996-02-19 : CMSMcQ : copy teilite into meptei           -->
<!-- 1995-01-21 : CMSMcQ : first cut at selection             -->
<!-- 1995-01-21 : CMSMcQ : made file from skelmods.ent        -->

{Preliminaries for TEI.extensions.ent file (archival and data capture forms) 60}

{Renaming of TEI elements (archival and data-capture forms) 30}

{Define new element classes (archival and data-capture forms) 37}

{Modify existing element classes (archival and data capture forms) 54}

{Select TEI elements (archival and data capture forms) 61}


<!-- ******************************************************** -->
<!-- End of file mep.ent ************************************ -->
<!-- ******************************************************** -->



1.4.1.2. Data-capture form
The entity file for the data-capture form of the DTD is called mepdc.ent. It is used for all three levels of the data-capture DTD. It has the following structure:
< 2 mepdc.ent: MEP extensions.ent file [File mepdc.ent] > ≡
<!-- ******************************************************** -->
<!-- * mepdc.ent:   TEI.extensions.ent file for MEP DC DTD  * -->
<!-- *                                                      * -->
<!-- * Version 3.0, 2002-01-16                              * -->
<!-- *                                                      * -->
<!-- ******************************************************** -->
<!-- Last revisions:                                          -->
<!-- 2002-01-16 : CMSMcQ, DRC : version 3.0                   -->
<!--                       * make the DTD XML-compatible      -->
<!--                       * suppress inclusion exceptions    -->
<!--                       * modify all element declarations  -->
<!--                         to include m.Incl (to get the    -->
<!--                         inclusions back in)              -->
<!--                       * extend TEI refsys class with     -->
<!--                         plink to get it into m.Incl      -->
<!-- 1998-10-19 : CMSMcQ, DRC : Divide DC into levels.        -->

<!-- ******************************************************** -->
<!-- * This file shows how the MEP Data Capture DTD is      * -->
<!-- * derived from the TEI main DTD.                       * -->
<!-- *                                                      * -->
<!-- * Do NOT modify this DTD file directly!                * -->
<!-- * Modify document MEP W04 and regenerate the DTD!      * -->
<!-- ******************************************************** -->

{Preliminaries for TEI.extensions.ent file (archival and data capture forms) 60}

{Renaming of TEI elements (archival and data-capture forms) 30}

{Define new element classes (archival and data-capture forms) 37}

{Classes for Targets 45}

{Header elements (data capture form) 33}

{Highlighting class (data capture form) 56}

{Modify existing element classes (archival and data capture forms) 54}

{Select TEI elements (archival and data capture forms) 61}


<!-- ******************************************************** -->
<!-- * End of file mepdc.ent:  MEP data-capture DTD         * -->
<!-- ******************************************************** -->



1.4.2. Element Declarations Files

1.4.2.1. Archival form
The declarations file for the archival form of the DTD is called mep.dtd. It has the following structure:
< 3 mep.dtd: MEP extensions.dtd file (archival form) [File mep.dtd] > ≡
<!-- ******************************************************** -->
<!-- * mep.dtd:  TEI.extensions.dtd file for the            * -->
<!-- * Model Editions Partnership Archival-Form DTD         * -->
<!-- *                                                      * -->
<!-- * Version 3.0, 2002-01-16                              * -->
<!-- *                                                      * -->
<!-- * This file contains element declarations for elements * -->
<!-- * not in the TEI main DTD, or defined differently by   * -->
<!-- * MEP.                                                 * -->
<!-- *                                                      * -->
<!-- * Do NOT modify this DTD file directly!                * -->
<!-- * Modify document MEP W04 and regenerate the DTD!      * -->
<!-- ******************************************************** -->

<!-- Last revisions:                                          -->
<!-- 2002-01-16 : CMSMcQ, DRC : version 3.0                   -->
<!--                       * make the DTD XML-compatible      -->
<!--                       * suppress inclusion exceptions    -->
<!--                       * modify all element declarations  -->
<!--                         to include m.Incl (to get the    -->
<!--                         inclusions back in)              -->
<!--                       * extend TEI refsys class with     -->
<!--                         plink to get it into m.Incl      -->
<!-- 1998-10-19 : CMSMcQ, DC : specify levels explicitly      -->

<!-- Overall text structure -->
{Document Groups (archival form) 31}

{Enclosures (archival form) 35}

{Document (archival form) 39}

{Document surrogates (archival form) 41}

{Editorial front and back matter (archival and data-capture forms) 38}


<!-- Chunk-level elements  -->
{Specialized components (archival and data-capture forms) 46}

{Specialized notes (archival and data capture forms) 47}

{Redefine paragraph and seg 50}

{Specialized Elements for Indices 51}


<!-- Phrase-level elements  -->
{Phrase-level elements for sender, etc. (archival, data capture) 57}

{Miscellaneous phrases (archival and data-capture forms) 53}


<!-- Fixes to TEI P3 -->
{Bibliographic references (archival, data capture) 59}





1.4.2.2. Data-capture form
The declarations file for the data-capture form of the DTD is called mepdc.dtd. It is used for all three levels of the data-capture DTD and has the following structure:
< 4 mepdc.dtd: MEP extensions.dtd file (data-capture form) [File mepdc.dtd] > ≡
<!-- ******************************************************** -->
<!-- * mepdc.dtd:  TEI.extensions.dtd file for the          * -->
<!-- * Model Editions Partnership Data Capture DTD          * -->
<!-- *                                                      * -->
<!-- * Version 3.0, 2002-01-16                              * -->
<!-- *                                                      * -->
<!-- * This file contains element declarations for elements * -->
<!-- * not in the TEI main DTD, or defined differently by   * -->
<!-- * MEP.                                                 * -->
<!-- *                                                      * -->
<!-- * Do NOT modify this DTD file directly!                * -->
<!-- * Modify document MEP W04 and regenerate the DTD!      * -->
<!-- ******************************************************** -->

<!-- Revisions:                                               -->
<!-- 2002-01-16 : CMSMcQ, DRC : version 3.0                   -->
<!--                       * make the DTD XML-compatible      -->
<!--                       * suppress inclusion exceptions    -->
<!--                       * modify all element declarations  -->
<!--                         to include m.Incl (to get the    -->
<!--                         inclusions back in)              -->
<!--                       * extend TEI refsys class with     -->
<!--                         plink to get it into m.Incl      -->
<!-- 1998-10-19 : CMSMcQ, DRC : version 2.0                   -->
<!-- 1998-03-26 : CMSMcQ, DC : declare p, docketing, etc.     -->
<!-- 1997-05-30 : CMSMcQ : pb, spkr, date, target changes     -->
<!-- 1996-10-10 : CMSMcQ : miscellaneous maintenance          -->

<!-- Overall text structure -->
{Document Groups (data capture form) 32}

{Enclosures (data capture form) 36}

{Document (data capture form) 40}

{Elements for document header (data-capture form) 34}

{Document surrogates (data capture form) 42}

{Elements for MSP targets 43}

{Editorial front and back matter (archival and data-capture forms) 38}


<!-- Chunk-level elements  -->
{Specialized components (archival and data-capture forms) 46}

{Specialized notes (archival and data capture forms) 47}

{Redefine paragraph and seg 50}

{Specialized Elements for Indices 51}


<!-- Phrase-level elements  -->
{Phrase-level elements for sender, etc. (archival, data capture) 57}

{Typographic highlighting (data-capture) 55}

{Miscellaneous phrases (archival and data-capture forms) 53}

{Define page-link element 58}


<!-- Fixes to TEI P3 -->
{Bibliographic references (archival, data capture) 59}





1.4.3. MEP Driver Files

1.4.3.1. Archival form
The driver file for the MEP archival-form DTD is called mep-drv.dtd; it has the following contents:
< 5 mep-drv.dtd: MEP driver file (archival-form) [File mep-drv.dtd] > ≡
<!-- mep-drv.dtd:  MEP driver file for archival-form DTD -->
<!-- Version:  3.0 --> 
<!-- Date:  2002-01-16 --> 

<!-- MEPTEI:  this is the archival version of the MEP tag set -->
<!-- it may be invoked as
     PUBLIC "-//MEP//DTD Model Editions Partnership
                     TEI tag set ver. 3.0//EN"
-->

{mep-drv.noents.dtd:  MEP driver file (archival-form), no entities 6}

{ISO entity-set declarations (SGML) 25}




Everything but the entities is included in the no-entities file (which is an artefact of our method of putting together the final single-file DTDs).
< 6 mep-drv.noents.dtd: MEP driver file (archival-form), no entities [File mep-drv.noents.dtd] > ≡
{Revision history 21}


<!ENTITY % L1 'INCLUDE' >
<!ENTITY % L2 'INCLUDE' >
<!ENTITY % L3 'INCLUDE' >
{Entity declarations for SGML drivers 28}


<!ENTITY % TEI.extensions.ent
  PUBLIC "-//MEP//DTD Model Editions Partnership TEI entities
         ver. 3.0//EN"
         "mep.ent" >

<!ENTITY % TEI.extensions.dtd
  PUBLIC "-//MEP//DTD Model Editions Partnership
         TEI dtd mods ver. 3.0//EN"
         "mep.dtd" >

{Selection of TEI tag sets 23}

{Notation declarations 27}


This code is used in < mep-drv.dtd: MEP driver file (archival-form) 5 > < mep-drvx.dtd: MEP driver file (XML archival-form) 7 >

The XML form is virtually identical:
< 7 mep-drvx.dtd: MEP driver file (XML archival-form) [File mep-drvx.dtd] > ≡
<!-- mep-drvx.dtd:  MEP driver file for archival-form DTD -->
<!-- Version:  3.0 --> 
<!-- Date:  2002-01-16 --> 

<!-- MEPTEI:  this is the archival version of the MEP tag set -->
<!-- it may be invoked as
     PUBLIC "-//MEP//DTD Model Editions Partnership
                     TEI tag set ver. 3.0 (XML)//EN"
-->

{Entity declarations for XML drivers 29}

{mep-drv.noents.dtd:  MEP driver file (archival-form), no entities 6}

{ISO entity-set declarations (XML) 26}




and the no-ents version:
< 8 mep-xdrv.noents.dtd: MEP driver file (archival-form), no entities [File mep-xdrv.noents.dtd] > ≡
{Revision history 21}


<!ENTITY % L1 'INCLUDE' >
<!ENTITY % L2 'INCLUDE' >
<!ENTITY % L3 'INCLUDE' >
{Entity declarations for XML drivers 29}


<!ENTITY % TEI.extensions.ent
  PUBLIC "-//MEP//DTD Model Editions Partnership TEI entities
         ver. 3.0//EN"
         "mep.ent" >

<!ENTITY % TEI.extensions.dtd
  PUBLIC "-//MEP//DTD Model Editions Partnership
         TEI dtd mods ver. 3.0//EN"
         "mep.dtd" >

{Selection of TEI tag sets 23}

{Notation declarations 27}




1.4.3.2. Data-capture forms: general description
The driver files for the three data-capture-form DTDs are called mep1drv.dtd, mep1xdrv.dtd, mep2drv.dtd, mep2xdrv.dtd, mep3drv.dtd, and mep3xdrv.dtd (an SGML and an XML driver for each level); they each contain
  • declarations of the parameter entities L1, L2, and L3, which control the inclusion or exclusion of different element declarations
  • declarations of the parameter entities TEI.XML, om.RR, and om.RO, which are used in the TEI DTD and the MEP extensions to control SGML/XML variations of the DTD
  • declarations of the MEP extensions files (the parameter entities TEI.extensions.ent and TEI.extensions.dtd)
  • declarations of the TEI tag sets to be included in the MEP DTD
  • a declaration of the TEI DTD driver file (as noted above, this is made necessary by the fact that we are using our own driver, rather than the TEI's driver file)
  • declarations of various ISO entity sets
For each file, we generate a companion file which lacks the declarations of the ISO entity sets; this no-ents file is made necessary only by a shortcoming in the Carthage program and is not part of the package sent to users. The file which is given to users is just a header, the no-ents material, and the entity set declarations.
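The assembly of a user-visible driver from its parts can be pictured as a recursive expansion of the scrap references written in braces in the listings in this document. The sketch below, in Python, is only an illustration of that idea and makes no claim about how the actual tools work; the function name tangle, the dictionary layout, and the placeholder body for scrap 10 are assumptions.

```python
import re

# A scrap reference occupies a whole line: {Title of scrap NN}
SCRAP_REF = re.compile(r"^\{(?P<title>.+) (?P<number>\d+)\}$")


def tangle(number, scraps, _active=()):
    """Expand scrap `number`, replacing each {Title N} line recursively."""
    if number in _active:
        raise ValueError(f"scrap {number} includes itself")
    out = []
    for line in scraps[number].splitlines():
        match = SCRAP_REF.match(line.strip())
        if match:
            # Splice in the referenced scrap in place of the reference line.
            out.append(tangle(int(match["number"]), scraps, _active + (number,)))
        else:
            out.append(line)
    return "\n".join(out)


# Abbreviated stand-ins for scraps 9, 10 and 25; the body of scrap 10
# is a placeholder, not the real declarations.
scraps = {
    9: "<!-- mep1drv.dtd: header comment -->\n"
       "{MEP data-capture level 1, no-entities driver 10}\n"
       "{ISO entity-set declarations (SGML) 25}",
    10: "<!-- placeholder: body of scrap 10 -->",
    25: '<!ENTITY % ISOlat1 PUBLIC "ISO 8879-1986//ENTITIES Added Latin 1//EN">\n'
        "%ISOlat1;",
}

print(tangle(9, scraps))
```

Expanding scrap 9 yields the header, then the no-entities material, then the entity-set declarations, which is the three-part layout described above.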
1.4.3.3. Level 1 driver files
For level 1, the driver for users is this:
< 9 MEP data-capture form DTD driver, level 1 [File mep1drv.dtd] > ≡
<!-- mep1drv.dtd:  MEP driver file for data-capture DTD     -->
<!-- level 1 (paragraphs, font shifts, ed/author distinct)  -->

<!-- This is level 1 of the data capture version of the     -->
<!-- MEP tag set.  It may be invoked as

     PUBLIC "-//MEP//DTD Model Editions Partnership
                     data capture level 1 ver. 3.0//EN"

-->
{MEP data-capture level 1, no-entities driver 10}

{ISO entity-set declarations (SGML) 25}




And the no-entities version is this:
The XML version of the level 1 drivers is:
< 11 MEP data-capture form XML DTD driver, level 1 [File mep1xdrv.dtd] > ≡
<!-- mep1xdrv.dtd:  MEP driver file for data-capture DTD     -->
<!-- level 1 (paragraphs, font shifts, ed/author distinct)  -->

<!-- This is level 1 of the data capture version of the     -->
<!-- MEP tag set.  It may be invoked as

     PUBLIC "-//MEP//DTD Model Editions Partnership
                XML data capture level 1 ver. 3.0//EN"

-->
{MEP data-capture level 1, XML no-entities driver 12}

{ISO entity-set declarations (XML) 26}




And the XML no-entities version is this:
1.4.3.4. Level 2 driver files
For level 2, the driver for users is this:
< 13 MEP data-capture form DTD driver, level 2 [File mep2drv.dtd] > ≡
<!-- mep2drv.dtd:  MEP driver file for data-capture DTD     -->
<!-- level 2 (level 1, plus hyperlinking)                   -->

<!-- This is level 2 of the data capture version of the     -->
<!-- MEP tag set.  It may be invoked as

     PUBLIC "-//MEP//DTD Model Editions Partnership
                     data capture level 2 ver. 3.0//EN"

-->
{MEP data-capture level 2, no-entities driver 14}

{ISO entity-set declarations (SGML) 25}




And the no-entities version is this:
The XML version of the level 2 drivers is:
< 15 MEP data-capture form XML DTD driver, level 2 [File mep2xdrv.dtd] > ≡
<!-- mep2xdrv.dtd:  MEP driver file for data-capture DTD     -->
<!-- level 2 (level 1, plus hyperlinking)                   -->

<!-- This is level 2 of the data capture version of the     -->
<!-- MEP tag set.  It may be invoked as

     PUBLIC "-//MEP//DTD Model Editions Partnership
                XML data capture level 2 ver. 3.0//EN"

-->
{MEP data-capture level 2, XML no-entities driver 16}

{ISO entity-set declarations (XML) 26}




And the XML no-entities version is this:
1.4.3.5. Level 3 driver files
For level 3, the driver for users is this:
< 17 MEP data-capture form DTD driver, level 3 [File mep3drv.dtd] > ≡
<!-- mep3drv.dtd:  MEP driver file for data-capture DTD     -->
<!-- level 3 (level 2 plus tagging for better retrieval)    -->

<!-- This is level 3 of the data capture version of the     -->
<!-- MEP tag set.  It may be invoked as

     PUBLIC "-//MEP//DTD Model Editions Partnership
                     data capture level 3 ver. 3.0//EN"

-->
{MEP data-capture level 3, no-entities driver 18}

{ISO entity-set declarations (SGML) 25}




And the no-entities version is this:
The XML version of the level 3 drivers is:
< 19 MEP data-capture form XML DTD driver, level 3 [File mep3xdrv.dtd] > ≡
<!-- mep3xdrv.dtd:  MEP driver file for data-capture DTD     -->
<!-- level 3 (level 2 plus tagging for better retrieval)    -->

<!-- This is level 3 of the data capture version of the     -->
<!-- MEP tag set.  It may be invoked as

     PUBLIC "-//MEP//DTD Model Editions Partnership
                XML data capture level 3 ver. 3.0//EN"

-->
{MEP data-capture level 3, XML no-entities driver 20}

{ISO entity-set declarations (XML) 26}




And the XML no-entities version is this:
1.4.3.6. Common parts of the drivers
All three levels have the same basic structure, much of it shared by the archival form.
First, there is a common revision history.
< 21 Revision history > ≡
<!-- Revisions:
     1998-10-19 : CMSMcQ, DC : distinguish L1, L2, L3 levels,
                               suppress textcrit and some other elements
     1998-04-23 : CMSMcQ, DC : add declaration for subs
     1998-03-26 : CMSMcQ, DC : declare p, docketing, etc.
     1997-05-30 : CMSMcQ : add targets, allow pb as inclusion on doc,
                           add spkr and date attributes
     1997-03-18 : CMSMcQ : fix error in definition of What
     1996-04-19 : CMSMcQ : made new version of driver, using Sweb
-->

This code is used in < mep-drv.noents.dtd: MEP driver file (archival-form), no entities 6 > < mep-xdrv.noents.dtd: MEP driver file (archival-form), no entities 8 > < Common DTD fragment for MEP data-capture format 24 >

The data capture forms have common declarations for the TEI extension files.
< 22 Data capture extension files > ≡
<!ENTITY % TEI.extensions.ent
  PUBLIC "-//MEP//DTD Model Editions Partnership TEI entities
         ver. 3.0//EN"
         "mepdc.ent" >

<!ENTITY % TEI.extensions.dtd
  PUBLIC "-//MEP//DTD Model Editions Partnership
         TEI dtd mods ver. 3.0//EN"
         "mepdc.dtd" >

This code is used in < Common DTD fragment for MEP data-capture format 24 >

And all drivers select the same TEI base and additional tag sets.
< 23 Selection of TEI tag sets > ≡
<!ENTITY % TEI.prose    'INCLUDE' >
<!ENTITY % TEI.linking  'INCLUDE' >
<!ENTITY % TEI.analysis 'INCLUDE' >
<!ENTITY % TEI.transcr  'INCLUDE' >
<!ENTITY % TEI.figures  'INCLUDE' >

<!ENTITY % TEI.textcrit    'IGNORE' >
<!ENTITY % TEI.certainty   'IGNORE' >
<!ENTITY % TEI.names.dates 'IGNORE' >

<!ENTITY % TEI.basic.dtd
  PUBLIC   "-//TEI P4//DTD Main Document Type 2002-01//EN"
           "tei2.dtd" >
%TEI.basic.dtd;

This code is used in < mep-drv.noents.dtd: MEP driver file (archival-form), no entities 6 > < mep-xdrv.noents.dtd: MEP driver file (archival-form), no entities 8 > < Common DTD fragment for MEP data-capture format 24 >
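The tag-set selection works because, in both SGML and XML, the first declaration of an entity is the binding one: the drivers declare each TEI.* parameter entity before referring to %TEI.basic.dtd;, so any later default declaration of the same entity in the main DTD is ignored. The following Python sketch models only that rule. It is a simplification that recognizes just single-quoted literal declarations, and the tei_defaults fragment is an assumed stand-in, not a quotation from the TEI DTD.

```python
import re

# Matches declarations of the form <!ENTITY % name 'value' >
PE_DECL = re.compile(
    r"<!ENTITY\s+%\s+(?P<name>[\w.\-]+)\s+'(?P<value>[^']*)'\s*>"
)


def effective_values(*dtd_fragments):
    """Bind each parameter entity to its first declaration, in reading order."""
    bound = {}
    for fragment in dtd_fragments:
        for match in PE_DECL.finditer(fragment):
            # setdefault keeps an earlier binding and ignores later ones.
            bound.setdefault(match["name"], match["value"])
    return bound


driver = """
<!ENTITY % TEI.prose    'INCLUDE' >
<!ENTITY % TEI.textcrit 'IGNORE' >
"""

# Assumed defaults, standing in for declarations read later from the main DTD.
tei_defaults = """
<!ENTITY % TEI.prose    'IGNORE' >
<!ENTITY % TEI.textcrit 'IGNORE' >
<!ENTITY % TEI.verse    'IGNORE' >
"""

values = effective_values(driver, tei_defaults)
print(values)
```

Under these assumptions TEI.prose stays bound to INCLUDE, while an entity the driver never mentions (TEI.verse here) keeps the later default.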

The entity sets in SGML:
< 25 ISO entity-set declarations (SGML) [File mep.entsets.sgml] > ≡
<!-- Entity sets and Notations for MEP DTDs -->
<!-- These are in a separate file until dpp is fixed
     to handle public identifiers -->
<!-- (This material is paralleled in teilite.dtd and teilite.ntn) -->
<!-- This material last revised:
     1998-03-26 (added ISO dia) 
-->

<!ENTITY % ISOlat1 PUBLIC "ISO 8879-1986//ENTITIES Added Latin 1//EN">
%ISOlat1; 

<!ENTITY % ISOlat2 PUBLIC "ISO 8879-1986//ENTITIES Added Latin 2//EN">
%ISOlat2; 

<!ENTITY % ISOnum  
  PUBLIC   "ISO 8879-1986//ENTITIES Numeric and Special Graphic//EN">
%ISOnum;

<!ENTITY % ISOpub  PUBLIC "ISO 8879-1986//ENTITIES Publishing//EN">
%ISOpub; 

<!ENTITY % ISOdia  PUBLIC "ISO 8879-1986//ENTITIES Diacritical Marks//EN">
%ISOdia; 

This code is used in < mep-drv.dtd: MEP driver file (archival-form) 5 > < MEP data-capture form DTD driver, level 1 9 > < MEP data-capture form DTD driver, level 2 13 > < MEP data-capture form DTD driver, level 3 17 >

The entity sets in XML:
< 26 ISO entity-set declarations (XML) [File mep.entsets.xml] > ≡
<!-- Entity sets and Notations for MEP DTDs -->
<!-- These are in a separate file until dpp is fixed
     to handle public identifiers -->
<!-- (This material is paralleled in teilite.dtd and teilite.ntn) -->
<!-- This material last revised:
     1998-03-26 (added ISO dia) 
-->

<!ENTITY % ISOlat1 PUBLIC "ISO 8879:1986//ENTITIES Added Latin 1//EN"
                          "isolat1.pen" >
%ISOlat1; 

<!ENTITY % ISOlat2 PUBLIC "ISO 8879:1986//ENTITIES Added Latin 2//EN"
                          "isolat2.pen">
%ISOlat2; 

<!ENTITY % ISOnum  PUBLIC "ISO 8879:1986//ENTITIES Numeric and Special Graphic//EN"
                          "isonum.pen" >
%ISOnum;

<!ENTITY % ISOpub  PUBLIC "ISO 8879:1986//ENTITIES Publishing//EN"
                          "isopub.pen">
%ISOpub; 

<!ENTITY % ISOdia  PUBLIC "ISO 8879:1986//ENTITIES Diacritical Marks//EN"
                          "isodia.pen">
%ISOdia; 

This code is used in < mep-drvx.dtd: MEP driver file (XML archival-form) 7 > < MEP data-capture form XML DTD driver, level 1 11 > < MEP data-capture form XML DTD driver, level 2 15 > < MEP data-capture form XML DTD driver, level 3 19 >

< 27 Notation declarations > ≡
<!NOTATION CGM PUBLIC
  'ISO 8632:1987//NOTATION Computer Graphics Metafile//EN' >

<!NOTATION CGMCHAR PUBLIC
  'ISO 8632-2:1987//NOTATION CGM Character Encoding//EN' >

<!NOTATION JPEG PUBLIC
  'ISO DIS 10918//NOTATION JPEG Graphics Format//EN' >

<!NOTATION TIFF PUBLIC
  '-//Aldus Corporation//NOTATION Tagged Image File Format//EN' >

<!NOTATION GIF PUBLIC
  '-//Compuserve Information Service//NOTATION Graphics Interchange Format//EN' >

<!NOTATION SGML PUBLIC
  'ISO 8879:1986//NOTATION Standard Generalized Markup Language//EN' >

<!NOTATION WSD  PUBLIC
  '-//TEI P3-1994//NOTATION Writing System Declaration//EN' >

This code is used in < mep-drv.noents.dtd: MEP driver file (archival-form), no entities 6 > < mep-xdrv.noents.dtd: MEP driver file (archival-form), no entities 8 > < Common DTD fragment for MEP data-capture format 24 >

The entity declarations common to all SGML drivers are:

1.4.4. A note on some DTD conventions

This document is easiest to follow if the reader understands the basic structure of the TEI DTD and its extension mechanisms; it may be that without such knowledge the document is wholly opaque. Some conventions of the TEI DTD may be worth mentioning nonetheless.
First, the elements of the TEI DTD are organized into classes, which affect which attributes are allowed on an element and where the element can occur. Parameter entities like a.global and m.refsys represent, respectively, the attributes which belong to a class, and the members of a class.
When the TEI DTD was modified for XML conformance, the Incl class was defined; this class is visible in the parameter entity m.Incl. References to m.Incl are included in almost all content models of the DTD, in order to allow some elements which used to be high-level inclusion exceptions to appear. Similar references occur at many places in the elements defined in this document. An exception is in the elements defined for targets; the members of class Incl don't seem to us to be necessary or desirable there.

2. Document Analysis

[This section to be added, with subsections on:
Letters, etc.
Diaries
Legislative Journals, Debates, and the Problem of Speaker Attribution
Image Editions and TEI Envelopes
Indexing
Structure of a Volume
The initial work on these was done at the MEP steering committee meeting in October, 1995, but full drafts of these additions await time for their completion.]

3. Overall Structure of a Volume

Documentary editions typically consist of a number of volumes; each volume has its own front and back matter, and consists in the simple case of a number of selected documents, each potentially introduced and followed by editorial matter (headnote, source note, etc.). Formally, we might write a grammar for a simple edition thus:
edition   ::= volume+
volume    ::= volume-front-matter, documents, volume-back-matter
documents ::= document+
document  ::= doc-front-matter, doc-body, doc-back-matter
doc-body  ::= front?, body, back?
The doc-front-matter and doc-back-matter contain editorial material; the doc-body itself contains the actual transcription of the document, which may (e.g. in the case of pamphlets, long reports, or other formally prepared material) have its own front and back matter.
In more complex cases, however, editorial commentary may concern not individual documents but groups of documents. Volume 19 of The Papers of Thomas Jefferson, for example, contains documents grouped into several different series, interspersed with individual documents not placed in groups. A conventional table of contents for the volume might look something like this:[2]
  • Advisory Committee [vi]
  • Acknowledgements [vi]
  • Guide to Editorial Apparatus [vii]
  • Contents [xix]
  • Illustrations [xxxi]
  • Jefferson Chronology [2]
  • Locating the Federal District [3]
    • Editorial Note [3]
    • I James Madison's Advice on Executing the Residence Act [before 29 Aug. 1790] [58]
    • II The President to the Secretary of State, 2 Jan. 1791 [61]
    • III The President to the Secretary of State [4 Jan. 1791] [62]
    • IV Daniel Carroll to the Secretary of State, 22 Jan. 1791 [62]
    • V The Secretary of State to Daniel Carroll, 24 Jan. 1791 [63]
    • VI The President to the Senate and the House of Representatives [24 Jan. 1791] [64]
    • VII The Proclamation by the President, 24 Jan. 1791 [65]
    • VIII Daniel Carroll to the Secretary of State, 27 Jan. 1791 [67]
    • IX The Secretary of State to the Commissioners of the Federal District, 29 Jan. 1791 [67]
    • X The President to the Secretary of State [1 Feb. 1791] [68]
    • XI The Secretary of State to Andrew Ellicott, 2 Feb. 1791 [68]
    • XII Andrew Ellicott to the Secretary of State [14 Feb. 1791] [70]
    • XIII The Proclamation of the President, 30 Mar. 1791 [72]
  • To Stephen Cathalan, 25 Jan. 1791 [74]
  • From John Harvie, Jr., 25 Jan. 1791 [74]
  • To Adrian Petit, 25 Jan. 1791 [76]
  • Death of Franklin: The Politics of Mourning in France and the United States
    • Editorial Note [78]
    • I The Secretary of State to the President, 9 Dec. 1790 [106]
    • II Tobias Lear to the Secretary of State, 26 Jan. 1791 [108]
    • III Thomas Jefferson to the Rev. William Smith, 19 Feb. 1791 [112]
    • IV Thomas Jefferson to John Vaughan, 22 Feb. 1791 [114]
    • V The Secretary of State to the President of the National Assembly of France, 8 Mar. 1791 [114]
  • From Childs & Swaine, 27 Jan. 1791 [115]
  • From Tench Coxe, 27 Jan. 1791 [116]
  • From William Short, 28 Jan. 1791 [117]
  • From Tench Coxe [30 Jan. 1791] [118]
  • From Madame de Rausan [30 Jan. 1791] [119]
  • From Noah Webster [31 Jan. 1791] [120]
  • Memoranda and Statistics on American Commerce [121]
    • Editorial Note [121]
    • I Jefferson's Notes on Sheffield's Observations ... [1783-84?] [127]
    • II Jefferson's Notes on Coxe's Commercial System for the United States [ca. 1787] [132]
    • III Extracts from Speech of William Pitt, April 1790 [133]
    • IV Notes on American Trade with Ireland [1790?] [134]
    • V Estimate of American Imports [1785-86?] [135]
    • VI Estimate of American Exports [1785-86?] [136]
    • VII Exports from Eight Northern States before 1776 [ca. 1784] [139]
  • [etc.]
We need, therefore, to alter our grammar somewhat, to account for front matter attached to whole groups of documents:
edition   ::= volume+
volume    ::= volume-front, documents, volume-back
documents ::= (document | doc-group)+
doc-group ::= grp-front, documents, grp-back
document  ::= doc-front, doc-body, doc-back
doc-body  ::= front?, body, back?
The Jefferson volume does not have nested document groups, but it is implausible to assume document groups never nest; we define a document group, therefore, as containing a series of individual documents, or of document groups.
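The recursive definition of document groups can also be illustrated outside the DTD. The following Python sketch is only an illustration under stated assumptions: the class names Document and DocGroup and the function iter_documents are hypothetical and are not part of the MEP scheme, and editorial front and back matter are left out. It models a group as a list whose members are documents or further groups, and walks the nesting in reading order.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Iterator, List, Tuple, Union


@dataclass
class Document:
    """One document (editorial front and back matter omitted in this sketch)."""
    title: str


@dataclass
class DocGroup:
    """A document group: its members are documents or further groups."""
    title: str
    members: List[Union[Document, DocGroup]] = field(default_factory=list)


def iter_documents(
    item: Union[Document, DocGroup], depth: int = 0
) -> Iterator[Tuple[int, Document]]:
    """Yield (nesting depth, document) pairs in reading order."""
    if isinstance(item, Document):
        yield depth, item
    else:
        for member in item.members:
            yield from iter_documents(member, depth + 1)


# A fragment of the Jefferson volume: one group followed by a loose document.
volume = DocGroup("Volume 19", [
    DocGroup("Locating the Federal District", [
        Document("I James Madison's Advice on Executing the Residence Act"),
        Document("II The President to the Secretary of State, 2 Jan. 1791"),
    ]),
    Document("To Stephen Cathalan, 25 Jan. 1791"),
])
```

Because DocGroup may contain DocGroup, the same function handles groups nested to any depth, which is the reason for defining doc-group recursively in the grammar.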
This grammar differs from the TEI's predefined markup for anthologies primarily in distinguishing more rigorously between preliminary matter transcribed as part of the document (front) and preliminary matter supplied by the editors (volume-front, grp-front, and doc-front). A further difference arises from the practical desirability of providing a separate TEI header for each document, so that the document header and the document transcription can be held in the same operating system file, if desired. (TEI markup for corpora allows such tight linking between an individual document and a TEI header, but TEI corpora have no corpus-level front and back matter.) These differences are reflected in MEP's redefinition of the TEI <group> element as containing not merely a series of <text> elements, but a series of <document> elements, which each may contain a TEI header and editorial front and back matter, as well as a <text> element.
[List tags:
<tei.2>
contains an entire edition, volume, or sample
<teiHeader>
documents the edition as a whole, or one document in particular
<text>
encloses the edition as a whole, or a document included in the edition
<front>
in the edition as a whole, contains editorial front matter; within a transcribed document, contains transcribed front matter
<back>
in the edition as a whole, contains editorial back matter; within a transcribed document, contains transcribed back matter
<body>
contains the body of a unitary text (exclusive of transcribed front and back matter)
<docGroup>
forms the body of an edition, volume, or sample; contains a series of <document> or <docGroup> elements
<document>
contains one document, together with attached editorial material: a <teiHeader> for the document giving its identity and repository information, followed by a <text> element containing the transcribed document itself; the <text> element may optionally be surrounded by <edFront> and <edBack> elements, containing editorial front and back matter
<doc>
a flatter alternate form of <document>, for simpler data capture: contains one document, together with attached editorial material: a series of elements describing the document's identity and repository information, followed by a <docBody> element containing the transcribed document itself; the <docBody> element may optionally be preceded and followed by editorial front and back matter. The type attribute allows an editor to classify the document according to a typology of the editor's choice.
<edFront>
contains editorial front matter for a document or document group
<edBack>
contains editorial back matter for a document or document group]
Analysed in this way, the body of the Jefferson volume would have something like the following structure:

<tei.2>     <!-- volume 19 -->
  <teiHeader> <!-- header for volume ... --> </teiHeader>
  <text>
    <front> ... </front>
    <docGroup>
      <docGroup>
        <edFront>
          <head>Locating the Federal District</head>
          <list> ... </list> <!-- contents of group -->
          <div type='ednote'>
            <head>Editorial Note</head>
            <epigraph> ...</epigraph>
            <p>In the fall of 1790 ...</p>
            ...
          </div>
        </edFront>
        <document> <!-- I James Madison's Advice ... -->
          <teiHeader>...</teiHeader>
          <text> ... </text>
        </document>
        <document> <!-- II The President to the Secretary of State -->
          <teiHeader>...</teiHeader>
          <text> ... </text>
        </document>
        <document> <!-- III The President to the Secretary of State -->
          <teiHeader>...</teiHeader>
          <text> ... </text>
        </document>
        <document> <!-- IV Daniel Carroll to the Secretary of State -->
          <teiHeader>...</teiHeader>
          <text> ... </text>
        </document>
        ...
      </docGroup>
    </docGroup>
    <back>
    </back>
  </text>
</tei.2>

For the record, the front matter of the Jefferson volume would have a structure like this:

    <front>
      <titlePage> ... </titlePage>
      <div type='tpverso'> <!-- copyright etc. ... --></div>
      <div type='dedication'> ... </div>
      <div type=''><head>Advisory Committee</head> ... </div>
      <div type='ack'><head>Acknowledgements</head> ... </div>
      <div type='section'><head>Guide to Editorial Apparatus</head>
        <div n='1'><head>Textual Devices</head> ... </div>
        <div n='2'><head>Descriptive Symbols</head> ... </div>
        <div n='3'><head>Location Symbols</head> ... </div>
        <div n='4'><head>Other Symbols and Abbreviations</head> ... </div>
        <div n='5'><head>Short Titles</head> ... </div>
      </div>
      <div type='toc'><head>Contents</head> ... </div>
      <div type='toc'><head>Illustrations</head> ... </div>
      <div type='chron'><head>Jefferson Chronology</head> ... </div>
    </front>

We make this tagging possible first by renaming <group> as <docGroup> to make its import clearer.
< 30 Renaming of TEI elements (archival and data-capture forms) > ≡
<!ENTITY % n.group 'docGroup' >

This code is used in < mep.ent: MEP extensions.ent file (archival form) 1 > < mepdc.ent: MEP extensions.ent file 2 >

Then we give it a new definition, to accommodate documents and document groups rather than just <text> elements.
< 31 Document Groups (archival form) > ≡
<!ELEMENT docGroup %om.RR; ((%m.Incl;)*, 
                            (edFront, (%m.Incl;)*)?,
                            ((document | surrogate 
                            | target | targets
                            | docGroup), (%m.Incl;)*)+,
                            (edBack, (%m.Incl;)*)?)              >
<!ATTLIST docGroup         %a.global;                            >

This code is used in < mep.dtd: MEP extensions.dtd file (archival form) 3 >

This is for the archival form; the data capture form is very similar, but uses the name <doc>, not <document> (this is to help make it more obvious whether a given file is in the data capture or the archival form).
< 32 Document Groups (data capture form) > ≡
<![ %L1; [
<!ELEMENT docGroup %om.RR; ((%m.Incl;)*,
                            (edFront, (%m.Incl;)*)?,
                            ((doc | surrogate 
                            | target | targets
                            | docGroup), (%m.Incl;)*)+,
                            (edBack, (%m.Incl;)*)?)              >
<!ATTLIST docGroup         %a.global;                            >
]]>

This code is used in < mepdc.dtd: MEP extensions.dtd file (data-capture form) 4 >

4. Structure of a Document

The internal structure of a <document> element has already been described in brief: documents contain a TEI header, optional editorial front matter, the text of the document itself, and optional editorial back matter. The header is basically the same as in the TEI Guidelines, though it should eventually be extended to ensure that specialized information about endorsements, etc., can be recorded usefully. The editorial front and back matter is similar in structure to the standard TEI front and back matter (though we may rationalize the structure slightly). The MEP markup for the transcription of the document itself differs structurally from the Guidelines primarily in having more explicit elements for sender, addressee, etc., and in making better provision for documents enclosed within other documents.
The markup for document structure varies significantly between the data capture DTD and the archival DTD: the former is somewhat simpler to type, the latter is somewhat simpler to process. Examples of both forms appear in the section on document structure below.

4.1. Document Headers

In MEP samples, each document will be stored in a file by itself, which will include a TEI header. Principles:
  • the data-capture DTD will have a distinct form for this information
  • the archival form of the document header will be identical to the standard TEI header
Alternative structures were also considered for the functions served by the <document> element:
  • within the content model of <group>, replacing references to <text> with references to <tei.2>. (In effect, we have replaced them instead with references to <document> and <surrogate>.)
  • defining a <document> element as edFront?, teiHeader, text, edBack? or as teiHeader, edFront?, text, edBack? (In effect, we have taken the latter approach.)
[Need discussion of how this structure interacts with enclosures, if enclosures are not treated as separate documents.]
Elements needed for the data-capture header:
<preparedBy>
name or initials of the staff member who prepared the document in its current state
<prepDate>
date when the document was prepared in its current state
<copyright>
standard copyright statement [check description]
<permissions>
standard acknowledgement of permission to republish the document [check description]
Optional (these can be omitted if they can be inferred from the editorial front matter):
<docTitle>
title of the document, possibly supplied by editor (may be copied from the <head> element at the beginning of the <edFront>)
<docAuthor>
author / sender of the document (may be copied from the <sender> or <docAuthor> element(s) found in the <edFront> element)
<docDate>
date of the document (may be copied from the <date> element(s) found in the <dateline> element(s) in the document proper or editorial front matter)
<sourceDesc>
information about the source of the document [may be copied from the <sourceNote> automatically]
The archival form of the header needs no special definition (in this version of this DTD); the data-capture version requires that we declare an element class (m.mepHeader) and provide element declarations for all the elements not already defined by TEI P3.
The element class is fairly simple:
< 33 Header elements (data capture form) > ≡
<!ENTITY % x.mepHeader '' >
<!ENTITY % m.mepHeader '%x.mepHeader; preparedBy | prepDate
| copyright | permissions | docTitle | docAuthor | docDate
| dateline | sourceDesc | idno
| addressee | sender | auth | respStmt
| repository | docLanguage' >

This code is used in < mepdc.ent: MEP extensions.ent file 2 >

(Note that we allow both <docAuthor> and <sender>. These are in some respects functionally equivalent, but the one seems much more natural for letters, and the other for other kinds of documents.)
The elements themselves all have very conventional declarations:
< 34 Elements for document header (data-capture form) > ≡
<![ %L1; [
<!ELEMENT preparedBy    %om.RO;  %paraContent;                  >
<!ATTLIST preparedBy             %a.global;                     >
<!ELEMENT prepDate      %om.RO;  %paraContent;                  >
<!ATTLIST prepDate               %a.global;                     >
<!ELEMENT copyright     %om.RO;  %paraContent;                  >
<!ATTLIST copyright              %a.global;                     >
<!ELEMENT permissions   %om.RO;  %paraContent;                  >
<!ATTLIST permissions            %a.global;                     >
<!ELEMENT docLanguage   %om.RO;  %paraContent;                  >
<!ATTLIST docLanguage            %a.global;                     >
<!ELEMENT mepHeader     %om.RO;  (%m.mepHeader;)*               >
<!ATTLIST mepHeader              %a.global;                     >
]]>

This code is used in < mepdc.dtd: MEP extensions.dtd file (data-capture form) 4 >

The MEP header should be tightened up a lot.

4.2. Document Enclosures

Many documents have enclosed / appended documents. Some editions treat these as more or less independent entities; others print them in special formats immediately after their ‘parent’ document.
Several methods could be used to handle enclosures. First, we could replace references to document with references to (document, enclosure*). This amounts to redefining <docGroup>:

<!-- method 1 (sample) -->
<!ELEMENT docGroup - - (edFront?,
                        ( (document, enclosure*)
                          | surrogate
                          | docGroup
                        )+,
                        edBack?) >
<!ELEMENT document  - - (edFront?, text, edBack?) >
<!ELEMENT enclosure - - (edFront?, text, edBack?) >

A second possibility would involve redefining <document> to include <enclosure> as well as <text>:

<!-- method 2 (sample) -->
<!ELEMENT docGroup  - - (edFront?,
                        (document| surrogate | docGroup)+,
                        edBack?) >
<!ELEMENT document  - - (edFront?, text, enclosure*, edBack?) >
<!ELEMENT enclosure - - (edFront?, text, enclosure*, edBack?) >

The third obvious possibility is to redefine <text> to include <enclosure> as well as <body> and <back>:

<!-- method 3 (sample) -->
<!ELEMENT text - - (front?, body, back?, enclosure*) >
<!-- or -->
<!ELEMENT text - - (front?, body, enclosure*, back?) >
<!-- or -->
<!ELEMENT text - - (front?, body, back?, enclosure*, back?) >

It seems to us (DC, MSM) that the second approach (redefining <document>) makes most sense. That is what is done in this version of this document.
[We have not assigned a structure to enclosures themselves:
  • do they have their own editorial front and back matter? I assume not, since if they did we could treat them as free-standing documents.
  • do they have the same structure as <text>? Or perhaps do they contain a single <text> element (or a single <docBody>, in the data capture format)?
I assume that
  • enclosures do not have independent editorial material (if notes to the main document precede the printing of the enclosures, either the enclosures need to be treated as independent documents, or the <document> and <enclosure> elements need to be redefined)
  • enclosures may be self-contained documents in themselves, and therefore can be (and, in the archival form, should be) tagged as <text> elements.
  • in cases (such as HL to John Laurens, 1775-08-20) where the sender has included connective text between items in a series of enclosures, the enclosures should be treated as <text> elements embedded in a normal series of paragraphs. (Tagging unconnected or ‘loose’ enclosures thus keeps the tagging of enclosures and the tagging of enclosures with connective text similar to each other.)
For now, we define an <enclosure> as containing a <text> (or, in the data capture format, a <text> or <docBody>) element, preceded by optional editorial front matter and followed by optional enclosures and back matter. (This is the way the February DTD defines the element.)
< 35 Enclosures (archival form) > ≡
<!ELEMENT enclosure     %om.RR;  ((%m.Incl;)*, 
                                 (edFront, (%m.Incl;)*)?, 
                                 text,
                                 (enclosure | (%m.Incl;))*, 
                                 (edBack, (%m.Incl;)*)?)         >
<!ATTLIST enclosure              %a.global;
                                 %a.declaring;
          TEIform                CDATA               'text'     >

This code is used in < mep.dtd: MEP extensions.dtd file (archival form) 3 >
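
Because <enclosure> may itself contain <enclosure> elements, software that processes enclosures needs to recurse. A minimal Python sketch follows, assuming a hypothetical helper iter_enclosures (not part of any MEP software) that looks only at <enclosure> elements which are direct children of a <document> or of another <enclosure>:

```python
import xml.etree.ElementTree as ET
from typing import Iterator, Tuple


def iter_enclosures(
    parent: ET.Element, depth: int = 1
) -> Iterator[Tuple[int, ET.Element]]:
    """Yield (depth, enclosure) for every enclosure nested under parent."""
    for child in parent:
        if child.tag == "enclosure":
            yield depth, child
            # An enclosure may carry enclosures of its own.
            yield from iter_enclosures(child, depth + 1)


# Illustrative input only; the n values are made up for the example.
sample = ET.fromstring(
    "<document><teiHeader/><text/>"
    "<enclosure n='a'><text/><enclosure n='b'><text/></enclosure></enclosure>"
    "<enclosure n='c'><text/></enclosure>"
    "</document>"
)
nesting = [(depth, enc.get("n")) for depth, enc in iter_enclosures(sample)]
```

The depth value distinguishes an enclosure attached to the main document from one attached to another enclosure.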

< 36 Enclosures (data capture form) > ≡
<![ %L1; [
<!ELEMENT enclosure     %om.RR;  ((%m.edFront; | %m.Incl;)*,
                                 (text | docBody),
                                 (enclosure | %m.Incl;)*,
                                 ((%m.edBack;), (%m.Incl;)*)*)      >
<!ATTLIST enclosure              %a.global;
                                 %a.declaring;
          TEIform                CDATA               'text'     >
]]>

This code is used in < mepdc.dtd: MEP extensions.dtd file (data-capture form) 4 >

4.3. Editorial Front and Back Matter

Editorial front and back matter applies to individual documents and document groups. In the simple case, it contains mostly small components like headings, short headnotes, or endnotes. In more elaborate cases, the headnote can have all the internal structure of an essay. We define editorial front and back matter, therefore, as containing a mix of chunk-level items specific to them, and normal text divisions.
In the entity modification files, we need to define the classes m.edFront and m.edBack:
< 37 Define new element classes (archival and data-capture forms) > ≡
<!ENTITY % m.edFront 'head | sourceNote | headNote | note | dateline
| byline | docAuthor | docDate'  >
<!ENTITY % m.edBack  'note | sourceNote | endNote'
 >
Continued in <New attributes for speaker and date 49>, <Index references 52>
This code is used in < mep.ent: MEP extensions.ent file (archival form) 1 > < mepdc.ent: MEP extensions.ent file 2 >

In the element declaration files, we need to declare the two elements themselves:
< 38 Editorial front and back matter (archival and data-capture forms) > ≡
<![ %L1; [
<!ELEMENT edFront       %om.RO;  ( (%m.Incl;)*,
                                 ( ((%m.edFront;), (%m.Incl;)*)+
                                 | ((%n.div;), (%m.Incl;)*)+
                                 | ((%n.div1;), (%m.Incl;)*)+ )) >
<!ATTLIST edFront                %a.global;
                                 %a.declaring;
          TEIform                CDATA               'front'    >

<!ELEMENT edBack        %om.RO;  ( (%m.Incl;)*,
                                 ( ((%m.edBack;), (%m.Incl;)*)+
                                 | ((%n.div;), (%m.Incl;)*)+
                                 | ((%n.div1;), (%m.Incl;)*)+ ))  >
<!ATTLIST edBack                 %a.global;
                                 %a.declaring;
          TEIform                CDATA               'back'     >
]]>

This code is used in < mep.dtd: MEP extensions.dtd file (archival form) 3 > < mepdc.dtd: MEP extensions.dtd file (data-capture form) 4 >

(In a project using SGML natively, editorial matter would not be added during transcription and thus would not appear in ‘level-one’ documents. But these elements need to be in level one for conversion of legacy data. Ditto endnote, etc.)
In the data capture format, these elements are declared in the same way, but they are not usually needed; their content can appear directly within a <doc> element. They are declared anyway so that they can be used within a <docGroup> element.

4.4. Document Structure

The archival form will explicitly group editorial front and back matter, and the opening and closing material at the beginning and end of letters. The data capture form will not group such matter explicitly, but leave the groupings largely implicit.
For example, here is the overall structure of a document from volume 19 of the Jefferson papers:

<document>
  <teiHeader>...</teiHeader>
  <edFront>
    <head>James Madison's Advice ...</head>
    <dateline><date>[Before 19 Aug. 1790]</date></dateline>
  </edFront>
  <text>
    <body>
      <p><title>The act for establishing ... </title></p>
      <list>
        <item n='1'>The appointment of three Commissioners ...</item>
        <item n='2'>That the President inform himself ...</item>
        <item n='3'>That the President direct the survey ...</item>
        <item n='4'>The district being defined ...</item>
        <item n='5'>The plan for the public buildings ...</item>
        <item n='6'>The completion of the work ...</item>
      </list>
    </body>
  </text>
  <edBack>
    <sourceNote>MS (DLC:  Madison Papers) ...</sourceNote>
    <note type='commentary'>
      <p>This document came to be attributed to TJ ...</p>
      <p>For searching various series ...</p>
    </note>
    <note n='1' target='N1'>The text of this query ... </note>
  </edBack>
</document>

Here is a document from the Papers of Henry Laurens:[3]

<document>
  <teiHeader> ... </teiHeader>
  <edFront>
    <head>To <addressee>Martha Laurens</addressee></head>
    <dateline>
      <place>Charles Town</place>,
      <date>August 17, 1776</date>
    </dateline>
  </edFront>
  <text>
    <body>
      <salute>My Dear Daughter</salute>
      <p>It is now upwards of twelve Months ... </p>
      <p>Your Brother will tell you a great deal ... </p>
      <p>I anxiously wish to See you my Daughter ... </p>
      <p>You will be rejoiced to learn ... </p>
      <p>I have no doubt my Dear Daughter ... </p>
      <p>You will take care of my Polly too ... </p>
      <signed>your affectionate Father</signed>
      <ps>
        <date>19th</date>
        <p>Casting my Eye over ... </p>
      </ps>
    </body>
  </text>
  <edBack>
    <sourceNote>LB, HL Papers, ScHi; ...</sourceNote>
  </edBack>
</document>

The data-capture form of these two documents has somewhat less markup, since it relies on later processing to know that (for example) a source note cannot occur in the body of a document, and that the beginning of a source note must therefore signal the beginning of the enclosing <edBack> element.
In data capture format, the document from the Jefferson papers differs from the archival form primarily in renaming <document> as <doc>, dropping the <text>, <front>, <edFront> and <edBack> tags, and substituting a <docBody> for the <body> element.

<doc>
  <mepHeader>...</mepHeader>
  <head>James Madison's Advice ...</head>
  <dateline><date>[Before 19 Aug. 1790]</date></dateline>
  <docBody>
    <p><title>The act for establishing ... </title></p>
    <list>
      <item n='1'>The appointment of three Commissioners ...</item>
      <item n='2'>That the President inform himself ...</item>
      <item n='3'>That the President direct the survey ...</item>
      <item n='4'>The district being defined ...</item>
      <item n='5'>The plan for the public buildings ...</item>
      <item n='6'>The completion of the work ...</item>
    </list>
  </docBody>
  <sourceNote>MS (DLC:  Madison Papers) ...</sourceNote>
  <note type='commentary'>
    <p>This document came to be attributed to TJ ...</p>
    <p>For searching various series ...</p>
  </note>
  <note n='1' target='N1'>The text of this query ... </note>
</doc>

The Laurens letter has similar simplifications:

<doc>
  <mepHeader> ... </mepHeader>
  <head>To <addressee>Martha Laurens</addressee></head>
  <dateline>Charles Town, August 17, 1776</dateline>
  <docBody>
    <salute>My Dear Daughter</salute>
    <p>It is now upwards of twelve Months ... </p>
    <p>Your Brother will tell you a great deal ... </p>
    <p>I anxiously wish to See you my Daughter ... </p>
    <p>You will be rejoiced to learn ... </p>
    <p>I have no doubt my Dear Daughter ... </p>
    <p>You will take care of my Polly too ... </p>
    <signed>your affectionate Father</signed>
    <ps>
      <date>19th</date>
      <p>Casting my Eye over ... </p>
    </ps>
  </docBody>
  <sourceNote>LB, HL Papers, ScHi; ...</sourceNote>
</doc>

The automatic processing to translate from the data capture form to the archival form will simply do the following:
  • Everything between the end of the MEP header or TEI header and the beginning of the <docBody> element is tagged as an <edFront>.
  • Everything between the end of the <docBody> element and the end of the <doc> element is tagged as an <edBack>.
  • The <docBody> itself is translated into a <text> element containing a <body>. If the transcription of the document requires its own transcribed front or back matter, then a <text> element should be used in place of the <docBody> element.
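
The three rules above might be sketched in Python as follows. This is a minimal illustration, not the MEP conversion software. The function name to_archival is hypothetical. The header is carried over unchanged (a real conversion would have to produce a <teiHeader>), attributes on <doc> are not copied, and members of class Incl get no special treatment. Any <enclosure> elements are placed between the <text> and the <edBack> and are not themselves converted; that placement is an assumption taken from the archival content model rather than from the rules themselves.

```python
import xml.etree.ElementTree as ET

HEADER_TAGS = {"teiHeader", "mepHeader"}


def to_archival(doc: ET.Element) -> ET.Element:
    """Regroup the children of a data-capture <doc> into an archival <document>."""
    header = None
    text = None
    front, enclosures, back = [], [], []
    for child in doc:
        if header is None and text is None and child.tag in HEADER_TAGS:
            header = child
        elif text is None and child.tag == "docBody":
            # Rule 3: <docBody> becomes a <text> containing a <body>.
            text = ET.Element("text")
            body = ET.SubElement(text, "body", dict(child.attrib))
            body.text = child.text
            body.extend(list(child))
        elif text is None and child.tag == "text":
            text = child  # transcription with its own front or back matter
        elif text is None:
            front.append(child)  # rule 1: material before the body
        elif child.tag == "enclosure":
            enclosures.append(child)  # assumption: kept outside <edBack>
        else:
            back.append(child)  # rule 2: material after the body
    if text is None:
        raise ValueError("<doc> contains neither <docBody> nor <text>")

    document = ET.Element("document")
    if header is not None:
        document.append(header)
    if front:
        ET.SubElement(document, "edFront").extend(front)
    document.append(text)
    document.extend(enclosures)
    if back:
        ET.SubElement(document, "edBack").extend(back)
    return document


sample = ET.fromstring(
    "<doc><mepHeader/><head>To Martha Laurens</head>"
    "<dateline>Charles Town, August 17, 1776</dateline>"
    "<docBody><salute>My Dear Daughter</salute><p>It is now ...</p></docBody>"
    "<sourceNote>LB, HL Papers, ScHi; ...</sourceNote></doc>"
)
archival = to_archival(sample)
```

For the abbreviated Laurens letter used here, the children of the result are the header, an <edFront> holding the heading and dateline, a <text> holding the <body>, and an <edBack> holding the source note.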
< 39 Document (archival form) > ≡
<!ELEMENT document %om.RR; (teiHeader, (%m.Incl;)*,
                           (edFront, (%m.Incl;)*)?, 
                           text, 
                           (enclosure | %m.Incl;)*, 
                           (edBack, (%m.Incl;)*)?)               >
<!ATTLIST document         %a.global;                           >


This code is used in < mep.dtd: MEP extensions.dtd file (archival form) 3 >
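
As a rough check on the order of elements this declaration allows, one might compare the child elements of a <document> against a regular expression. The Python sketch below is an illustration under stated assumptions: the names DOCUMENT_MODEL and follows_document_model are hypothetical, members of class Incl are left out of the pattern, and the check is no substitute for a validating parser.

```python
import re
import xml.etree.ElementTree as ET

# teiHeader, edFront?, text, enclosure*, edBack?   (class Incl omitted)
DOCUMENT_MODEL = re.compile(r"teiHeader (edFront )?text (enclosure )*(edBack )?")


def follows_document_model(document: ET.Element) -> bool:
    """Return True if the child elements occur in an order the model allows."""
    # Each child tag is followed by one space, matching the pattern above.
    sequence = "".join(child.tag + " " for child in document)
    return DOCUMENT_MODEL.fullmatch(sequence) is not None


well_ordered = ET.fromstring(
    "<document><teiHeader/><edFront/><text/><edBack/></document>"
)
misplaced_front = ET.fromstring(
    "<document><teiHeader/><text/><edFront/></document>"
)
```

Here well_ordered satisfies the pattern, while misplaced_front does not, because editorial front matter must precede the <text> element.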

< 40 Document (data capture form) > ≡
<![ %L1; [
<!ELEMENT doc           %om.RR;  ((teiHeader | mepHeader),
                                 (%m.edFront; | %m.Incl;)*,
                                 (text | docBody), 
                                 (enclosure | %m.Incl;)*,
                                 ((%m.edBack;), (%m.Incl;)*)*)   >
<!ATTLIST doc              %a.global;  
             type             CDATA                  #IMPLIED    >
<!ELEMENT docBody       %om.RR;  ((%m.divtop; | %m.Incl;)*,
                                 ( ((div, (%m.Incl;)*)+ 
                                 | (div0, (%m.Incl;)*)+ 
                                 | (div1, (%m.Incl;)*)+)
                                 | ( ((%component;), (%m.Incl;)*)+,
                                 ((div, (%m.Incl;)*)* 
                                 | (div0, (%m.Incl;)*)* 
                                 | (div1, (%m.Incl;)*)*))),
                                 ((%m.divbot;), (%m.Incl;)*)*)   >
<!ATTLIST docBody          %a.global;                           >
]]>

This code is used in < mepdc.dtd: MEP extensions.dtd file (data-capture form) 4 >

Note that the <teiHeader> is defined only in the level 3 DTD.

4.5. Document Surrogates

In some editions (e.g. the Papers of General Nathanael Greene), some documents are formally represented not by a transcription but by a regest or summary, which acts as a surrogate for the full document. In order to distinguish surrogates clearly from full transcriptions, they are tagged <surrogate> instead of <document>.
A <surrogate> element has the same structure as a <document> element, except that the <teiHeader> element is optional and the <text> or <docBody> element is replaced by a single <body> element.
In the archival form, <surrogate> has the following definition:
< 41 Document surrogates (archival form) > ≡
<!ELEMENT surrogate     %om.RR;  ((teiHeader, (%m.Incl;)*)?, 
                                 (edFront, (%m.Incl;)*)?, 
                                 (body, (%m.Incl;)*),
                                 (edBack, (%m.Incl;)*)?)         >
<!ATTLIST surrogate              %a.global;                     >

This code is used in < mep.dtd: MEP extensions.dtd file (archival form) 3 >
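A surrogate in the archival form might therefore look like the following sketch. The summary wording, the id value, and the source note are invented for illustration; the optional <teiHeader> and <edFront> are simply omitted.

```xml
<surrogate id="hypothetical-surrogate-1">
  <!-- a single <body> takes the place of <text> -->
  <body>
    <p>Hypothetical summary: the writer acknowledges receipt of an
       earlier letter and reports on the state of supplies.</p>
  </body>
  <edBack>
    <sourceNote> ... </sourceNote>
  </edBack>
</surrogate>
```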

Alternate form (from February):

<!-- alternate declaration -->
<!ELEMENT surrogate     %om.RR;  ((%m.divtop; | %m.Incl;)*, 
                                  ((%component;), (%m.Incl;)*)+,
                                  (((%n.div1;), (%m.Incl;)*)* 
                                  | ((%n.div;), (%m.Incl;)*)*),
                                  ((%m.divbot;), (%m.Incl;)*)*)  >
<!ATTLIST surrogate              %a.global;
                                 %a.declaring;
          TEIform                CDATA               'text'     >

In the data capture form, <surrogate> has the following definition:
< 42 Document surrogates (data capture form) > ≡
<![ %L1; [
<!ELEMENT surrogate     %om.RR;  ((teiHeader | mepHeader),
                                 (%m.edFront; | %m.Incl;)*,
                                 docBody, (%m.Incl;)*,
                                 ((%m.edBack;), (%m.Incl;)*)*)   >
<!ATTLIST surrogate              %a.global;                     >
]]>

This code is used in < mepdc.dtd: MEP extensions.dtd file (data-capture form) 4 >

4.6. Targets

We add a specialized document type for image targets.
The preliminary version of this DTD fragment is available only in the data capture form, and is strictly modeled on a very small sample of MSP targets.
< 43 Elements for MSP targets > ≡
<!ELEMENT targets       %om.RR;  (mepHeader?,
                                 (target | targetInfo)*)        >
<!ATTLIST targets                %a.global;                     >

<!ELEMENT target        %om.RR;  (series, (%m.docnum;), 
                                 (date | dateRange), place,
                                 extent, title, permissions, notes,
                                 refs, figure*, xref*)          >
<!ATTLIST target                 %a.global;                     >

<!ELEMENT targetInfo    %om.RR;  %targetInfoElement;            >
<!ATTLIST targetInfo             %a.global;                     >

<!ELEMENT series        %om.RR;  %targetInfoElement;            >
<!ATTLIST series                 %a.global;                     >
<!ELEMENT msp           %om.RR;  %targetInfoElement;            >
<!ATTLIST msp                    %a.global;                     >

<!ELEMENT notes         %om.RR;  (sourceType, note*)*           >
<!ATTLIST notes                  %a.global;                     >
<!ELEMENT refs          %om.RR;  (reference | xref)*            >
<!ATTLIST refs                   %a.global;                     >
<!ELEMENT reference     %om.RR;  %targetInfoElement;            >
<!ATTLIST reference              %a.global;                     >

<!ELEMENT repository    %om.RR;  %targetInfoElement;            >
<!ATTLIST repository             %a.global;                     >
<!ELEMENT sourceType    %om.RR;  %targetInfoElement;            >
<!ATTLIST sourceType             %a.global;                     >


This code is used in < mepdc.dtd: MEP extensions.dtd file (data-capture form) 4 >
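A minimal instance conforming to these declarations might look like the sketch below. All values are hypothetical. The sketch also assumes that the elements declared elsewhere (<date>, <extent>, <title>, <permissions>, <note>, <idno>) accept plain character data.

```xml
<targets>
  <target id="hypothetical-target-1">
    <series>Hypothetical Series</series>
    <!-- %m.docnum; allows either <msp> or <idno> -->
    <idno>0001</idno>
    <date>1 January 1900</date>
    <place>Hypothetical City</place>
    <extent>2 pp.</extent>
    <title>Hypothetical letter to an unnamed correspondent</title>
    <permissions>Hypothetical permissions statement.</permissions>
    <notes>
      <sourceType>Hypothetical source type</sourceType>
      <note>Hypothetical editorial note.</note>
    </notes>
    <refs>
      <reference>Hypothetical reference</reference>
    </refs>
  </target>
</targets>
```

The order of the children of <target> is fixed by the content model; only <figure> and <xref> are optional, and both must come last.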

The <nav> element is intended to hold pointers to other documents or pages, which will be used to generate navigational aids at the bottom of the display of the target, in the browser. In a perfect world, the delivery system would allow us to do this using virtual documents, generated from the other information in the target; the <nav> elements are strictly superfluous, and thus error-prone. But we live, alas, in this world, not that perfect one.
< 44 Define nav element > ≡
<![ %L2; [
<!ELEMENT nav           %om.RO;  %paraContent;                  >
<!ATTLIST nav                    %a.global;
          teiForm                CDATA                 "p"      >
]]>

This code is used in < Specialized components (archival and data-capture forms) 46 >

Strictly speaking, this element belongs in a specialized delivery DTD.

4.7. Classes for Target Elements

The target DTD has some special element classes:
< 45 Classes for Targets > ≡
<!ENTITY % m.role "sender | addressee | author">
<!ENTITY % m.editorial "supplied">
<!ENTITY % m.name  "repository | person | org | name">
<!ENTITY % m.crossref "ptr | ref | xptr | xref">
<!ENTITY % m.docnum   "msp | idno"             >


<!ENTITY % phrase.edit "(#PCDATA | %m.editorial;)*"               >
<!ENTITY % phrase.role "(#PCDATA | %m.role; | %m.editorial;)*"      >
<!ENTITY % phrase.name
           "(#PCDATA | %m.name; | %m.role; | %m.editorial;)*"         >


<!ENTITY % m.targetInfo "series | msp | idno | date | dateRange 
| place | extent | title | permissions | notes | sourceNote | refs 
| figure | nav | %m.role; | %m.editorial; | %m.name; | %m.crossref;" >

<!ENTITY % targetInfoElement "(#PCDATA | %m.targetInfo;)*" >


This code is used in < mepdc.ent: MEP extensions.ent file 2 >

5. Document Components

5.1. Specialized Division-Top and Division-Bottom Elements

A final list of the specialized elements for the top and bottom of divisions must await completion of the formal paper on document analysis. For now, we will work with block-level elements for
<closing>
closing flourish of a letter (phrases like “I am Dear Sir yr most hbl and obt svt.”)
<signed>
signature of a letter (a TEI element)
<attestation>
certification that a copy or signature is authentic (e.g. by a notary, secretary, or secretary of state)
<endorsement>
administrative inscription on a document indicating date received, etc. (not to be used where <docketing> applies)
<docketing>
administrative inscription on a document indicating its subject matter, schedule on a court docket, and/or disposition
<ps>
postscript
Some elements which might plausibly be treated as block-level elements in this section are, by MEP, treated instead as phrase-level elements. In particular, salutations (<salute>), closing flourishes (<closing>), and names of senders, addressees, etc. (<sender>, <addressee>, and <auth>) are all treated as phrase-level elements.
The new elements we use are defined as follows. Postscripts have the same content model as <div7>; if text divisions can appear within a postscript, we should change the content model. We take over the TEI <dateline> element, but give it a more permissive and realistic content model.
< 46 Specialized components (archival and data-capture forms) > ≡
<![ %L3; [
<!ELEMENT closing       %om.RO;  %paraContent;                  >
<!ATTLIST closing                %a.global;
          TEIform                CDATA               'salute'   >

<!ELEMENT attestation   %om.RO;  %paraContent;                  >
<!ATTLIST attestation            %a.global;
          TEIform                CDATA               'p'        >

<!ELEMENT endorsement   %om.RO;  %paraContent;                  >
<!ATTLIST endorsement            %a.global;
          TEIform                CDATA               'p'        >

<!ELEMENT docketing     %om.RO;  %paraContent;                  >
<!ATTLIST docketing              %a.global;
          TEIform                CDATA               'p'        >

<!ELEMENT ps            %om.RR;  ((%m.divtop; | %m.Incl;)*,
                                  ((%component;), (%m.Incl;)*)+,
                                  ((%m.divbot;), (%m.Incl;)*)*)  >
<!ATTLIST ps                     %a.global;                     >
]]>
<![ %L1; [
<!ELEMENT %n.dateline;  %om.RO;  %paraContent;                  >
<!ATTLIST %n.dateline;           %a.global;
          TEIform                CDATA               'dateline' >
]]>
{The WHAT element 48}
 
{Define nav element 44}
 

This code is used in < mep.dtd: MEP extensions.dtd file (archival form) 3 > < mepdc.dtd: MEP extensions.dtd file (data-capture form) 4 >
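By way of illustration, the end of a letter might be tagged with these elements as in the sketch below. The dateline and endorsement wording are hypothetical, and the sketch assumes the data-capture form, in which these elements appear directly inside <docBody>.

```xml
<docBody>
  <dateline>Hypothetical Place, 18th</dateline>
  <p> ... </p>
  <closing>I am Dear Sir yr most hbl and obt svt.</closing>
  <signed>your affectionate Father</signed>
  <ps>
    <!-- a postscript must contain at least one component, here a <p> -->
    <p>Casting my Eye over ... </p>
  </ps>
  <endorsement>Hypothetical endorsement: received and answered the same day</endorsement>
</docBody>
```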

5.2. Specialized Notes

We define the following specialized types of notes:
  • <headNote>
  • <sourceNote>
  • <endNote>
We define them thus:
< 47 Specialized notes (archival and data capture forms) > ≡
<![ %L1; [
<!ELEMENT headNote      %om.RO;  %specialPara;                  >
<!ATTLIST headNote
          id                 ID                  #IMPLIED
          lang               IDREF               %INHERITED;
          rend               CDATA               #IMPLIED
          n                  CDATA               #IMPLIED
          type               CDATA               #IMPLIED
          resp               CDATA               #IMPLIED
          place              CDATA               'unspecified'
          anchored           (yes | no)          'yes'
          target             IDREFS              #IMPLIED
          targetEnd          IDREFS              #IMPLIED
          TEIform            CDATA               'note'         >
<!ELEMENT endNote       %om.RO;  %specialPara;                  >
<!ATTLIST endNote
          id                 ID                  #IMPLIED
          lang               IDREF               %INHERITED;
          rend               CDATA               #IMPLIED
          n                  CDATA               #IMPLIED
          type               CDATA               #IMPLIED
          resp               CDATA               #IMPLIED
          place              CDATA               'unspecified'
          anchored           (yes | no)          'yes'
          target             IDREFS              #IMPLIED
          targetEnd          IDREFS              #IMPLIED
          TEIform            CDATA               'note'         >

<!ELEMENT sourceNote    %om.RO;  (#PCDATA | %m.phrase; | q 
                                 | bibl | %m.biblPart; 
                                 | %m.Incl;)*                    >
<!ATTLIST sourceNote             %a.global;
                                 %a.declarable;
          TEIform                CDATA               'bibl'     >
]]>

This code is used in < mep.dtd: MEP extensions.dtd file (archival form) 3 > < mepdc.dtd: MEP extensions.dtd file (data-capture form) 4 >
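In use, the three note types might appear as in the following sketch. The note texts and the n value are hypothetical, and where each note sits within the editorial front or back matter is an assumption, since those content models are declared elsewhere.

```xml
<!-- before the transcription -->
<headNote>Hypothetical editorial introduction to the document.</headNote>

<!-- after the transcription -->
<sourceNote>LB, HL Papers, ScHi; ...</sourceNote>
<endNote n="1">Hypothetical annotation keyed to the transcription.</endNote>
```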

5.3. Indeterminate Components

When it is not clear exactly how to tag some block of text, the <what> element should be used. (The name is simply explained: if you don't know what to tag it as, tag it as <what>. The idea comes to us from the Women Writers Project at Brown, to whom be credit and thanks.)
The <what> element is defined as follows:
< 48 The WHAT element > ≡
<![ %L1; [
<!ELEMENT what          %om.RO;  %paraContent;                  >
<!ATTLIST what                   %a.global;
                                 %a.declarable;
          TEIform                CDATA               'p'        >
]]>

This code is used in < Specialized components (archival and data-capture forms) 46 >
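A transcriber unsure how to classify a block might thus write something like the following; the content is hypothetical.

```xml
<p> ... </p>
<what>Hypothetical block of uncertain function, for example a column
of figures written sideways in the margin.</what>
<p> ... </p>
```

Because the TEIform value is 'p', software that understands only TEI element types can presumably treat the block as an ordinary paragraph until an editor decides on a better tag.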

5.4. Special Attributes for Components

To be able to search for given words or topics within (reported) speeches made by a given individual, we need to be able to supply spkr and date attributes on text divisions, paragraphs, and segments. First we declare the new class, then we make it part of class divn:
< 49 New attributes for speaker and date [continues 37 Define new element classes (archival and data-capture forms)] > ≡
<![ %L3; [
<!ENTITY % a.speech '
          spkr               NMTOKEN             #IMPLIED
          date               NMTOKEN             #IMPLIED'  >
<!ENTITY % a.divn "          %a.speech;
          type               CDATA               #IMPLIED
          org                (composite | uniform)
                                                 'uniform'
          sample             (initial | medial | final |
                             unknown | complete) 'complete'
          part               (Y | N | I | M | F) 'N'"      >
]]>
<!ENTITY % a.speech ''                                      >
<!ENTITY % a.divn '          %a.speech;
          type               CDATA               #IMPLIED ' >



We also need to assign this class to <p> and <seg>.
< 50 Redefine paragraph and seg > ≡
<![ %L1; [
<!ELEMENT %n.p;         %om.RO;  %paraContent;                  >
<!ATTLIST %n.p;                  %a.global;
                                 %a.speech;
          TEIform                CDATA               'p'        >
<!ELEMENT %n.seg;       %om.RR;  %paraContent;                  >
<!ATTLIST %n.seg;                %a.global;
                                 %a.seg;
                                 %a.speech;
          subtype                CDATA               #IMPLIED
          TEIform                CDATA               'seg'      >
]]>

This code is used in < mep.dtd: MEP extensions.dtd file (archival form) 3 > < mepdc.dtd: MEP extensions.dtd file (data-capture form) 4 >
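A reported speech might then be tagged as in the sketch below. It assumes a level at which %a.speech; is non-empty (level 3 in the declarations above); the speaker names and dates are hypothetical. Both attributes are declared as NMTOKEN, so their values cannot contain spaces.

```xml
<div type="debate">
  <p spkr="Smith" date="1788-05-14">Mr. Smith observed that ...
    <seg spkr="Jones" date="1788-05-12">the words attributed to
    Mr. Jones two days earlier</seg> ... </p>
</div>
```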

5.5. Back-of-the-Book Indices

Most documentary editions rely on manually constructed indices to provide good intellectual access to the material in the edition. Few readers read documentary editions from front to back; most consult the index to find the documents relevant to their particular interest of the moment.
In some electronic documentary editions, it is therefore necessary to transcribe back-of-the-book indices. Two approaches suggest themselves: using the TEI's <list>, <item>, and <ref> elements to mark the index as a kind of specialized list, or extending the TEI scheme with specialized elements. A fragment of an index encoded the first way might look like this:[4]
<list>
  ...
  <item>China, <ref>63-64n2</ref>, <ref>265n2</ref></item>
  <item><ship>China</ship>, <ref>226n3</ref></item>
  <item><title>The Choice Humorous Works of Mark Twain</title>
    (1873, 1874), <ref>168n7</ref></item>
  <item>Cholmondeley, Mary, <ref>434</ref></item>
  <item rend="boldlabel">Cholmondeley, Reginald,
    <ref rend="bold" type="main-id">432-34</ref>,
    <ref>522n2</ref>,
    <ref>657<hi>illus</hi></ref>
    <list>
      <item>letter to, <ref>434</ref></item>
      <item>letters by, <ref>432-34</ref></item>
    </list>
  </item>
  ...
</list>
The same material, encoded with specialized elements, might look like this:
<bobIndex>
  ...
  <ixe>
    <entry>China</entry>
    <pgref>63-64n2</pgref> <pgref>265n2</pgref>
  </ixe>
  <ixe>
    <entry><ship>China</ship></entry>
    <pgref>226n3</pgref></ixe>
  <ixe>
    <entry><title>The Choice Humorous Works of Mark Twain</title>
      (1873, 1874)</entry>
    <pgref>168n7</pgref></ixe>
  <ixe>
    <entry>Cholmondeley, Mary</entry>
    <pgref>434</pgref></ixe>
  <ixe type="correspondent">
    <entry>Cholmondeley, Reginald</entry>
    <pgref rend="bold" type="main-id">432-34</pgref>
    <pgref>522n2</pgref>
    <pgref>657<hi>illus</hi></pgref>
    <subs>
      <ixe><entry>letter to</entry> <pgref>434</pgref></ixe>
      <ixe><entry>letters by</entry> <pgref>432-34</pgref></ixe>
    </subs>
  </ixe>
  ...
</bobIndex>
The elements in question can be declared thus:
< 51 Specialized Elements for Indices > ≡
<![ %L3; [
<!ELEMENT bobIndex      %om.RR;  ((%m.Incl;)*, 
                                    (head, (%m.Incl;)*)*, 
                                    ((bobIndex, (%m.Incl;)*)+ 
                                    | (ixe, (%m.Incl;)*)+))    >
<!ATTLIST bobIndex               %a.global;                     >
<!ELEMENT ixe           %om.RR;  (entry,
                                    (%m.ixref; | %m.Incl;)*, 
                                    subs?)   >
<!ATTLIST ixe                    %a.global;
          type                   CDATA               #IMPLIED   >
<!ELEMENT entry         %om.RR;  %paraContent;                  >
<!ATTLIST entry                  %a.global;                     >
<!ELEMENT docref        %om.RR;  %paraContent;                  >
<!ATTLIST docref                 %a.global;                     >
<!ELEMENT pgref         %om.RR;  %paraContent;                  >
<!ATTLIST pgref                  %a.global;                     >
<!ELEMENT docptr        %om.RO;  EMPTY                          >
<!ATTLIST docptr                 %a.global;                     >
<!ELEMENT pgptr         %om.RO;  EMPTY                          >
<!ATTLIST pgptr                  %a.global;                     >
<!ELEMENT subs          %om.RO;  ((ixe, (%m.Incl;)*)+)          >
<!ATTLIST subs                   %a.global;                     >
]]>

This code is used in < mep.dtd: MEP extensions.dtd file (archival form) 3 > < mepdc.dtd: MEP extensions.dtd file (data-capture form) 4 >

The new element-class ixref is declared thus:
< 52 Index references [continues 37 Define new element classes (archival and data-capture forms)] > ≡
<!ENTITY % x.ixref "" >
<!ENTITY % m.ixref "%x.ixref; docref | pgref | docptr | pgptr"  >
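The index example earlier in this section uses only <pgref>. A sketch using a <head> and a <docref> as well might look like this; the reading of <docref> as a reference by document rather than by page is inferred from its name, and its content here is hypothetical.

```xml
<bobIndex>
  <head>Index</head>
  <ixe>
    <entry>Cholmondeley, Mary</entry>
    <!-- any mixture of %m.ixref; elements may follow the entry -->
    <pgref>434</pgref>
    <docref>Hypothetical document 12</docref>
  </ixe>
</bobIndex>
```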



6. Phrase-Level Elements

6.1. Placenames and Other Specialized Names

We define a number of specialized name and reference elements:
  • <place>
  • <person>
  • <org>
  • <ship>
  • <placeRef>
  • <personRef>
  • <orgRef>
  • <shipRef>
The formal declarations are these:
< 53 Miscellaneous phrases (archival and data-capture forms) > ≡
<![ %L1; [
<!ELEMENT place         %om.RR;  %paraContent;                  >
<!ATTLIST place                  %a.global;
                                 %a.names;
          type                   CDATA              #IMPLIED
          teiForm                CDATA              'name'     >
]]>
<![ %L3; [
<!ELEMENT person        %om.RR;  %paraContent;                  >
<!ATTLIST person                 %a.global;
                                 %a.names;
          type                   CDATA              #IMPLIED
          teiForm                CDATA              'name'     >
<!ELEMENT org           %om.RR;  %paraContent;                  >
<!ATTLIST org                    %a.global;
                                 %a.names;
          type                   CDATA              #IMPLIED
          teiForm                CDATA              'name'     >
<!ELEMENT ship          %om.RR;  %paraContent;                  >
<!ATTLIST ship                   %a.global;
                                 %a.names;
          type                   CDATA              #IMPLIED
          teiForm                CDATA              'name'     >
<!ELEMENT placeRef      %om.RR;  %paraContent;                  >
<!ATTLIST placeRef               %a.global;
                                 %a.names;
          type                   CDATA              #IMPLIED
          teiForm                CDATA              'rs'       >
<!ELEMENT personRef     %om.RR;  %paraContent;                  >
<!ATTLIST personRef              %a.global;
                                 %a.names;
          type                   CDATA              #IMPLIED
          teiForm                CDATA              'rs'       >
<!ELEMENT orgRef        %om.RR;  %paraContent;                  >
<!ATTLIST orgRef                 %a.global;
                                 %a.names;
          type                   CDATA              #IMPLIED
          teiForm                CDATA              'rs'       >
<!ELEMENT shipRef       %om.RR;  %paraContent;                  >
<!ATTLIST shipRef                %a.global;
                                 %a.names;
          type                   CDATA              #IMPLIED
          teiForm                CDATA              'rs'       >
]]>

This code is used in < mep.dtd: MEP extensions.dtd file (archival form) 3 > < mepdc.dtd: MEP extensions.dtd file (data-capture form) 4 >

All of these except <place> go in level 3, because they require intelligent tagging; <place> goes in level 1 because it is required by targets. (We tried moving it to level 3, but then the level 1 DTD could not be generated.)
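In running text, the name elements and their reference counterparts might be used as in this sketch. The names are hypothetical, and the division of labor shown (name elements for proper names, *Ref elements for other referring phrases) is inferred from the TEIform values 'name' and 'rs'.

```xml
<p><person>Captain Hypothetical</person> sailed from
<place>Hypothetical Harbor</place> in the <ship>Hypothetical</ship>;
<personRef>the captain</personRef> reported that
<shipRef>the brig</shipRef> was owned by
<org>Hypothetical and Company</org>.</p>
```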
Now we need to knit them into the element class system. While we are modifying TEI classes, we also remove <signed> and <salute> from the divtop and divbot classes, and add them instead to the class of components. The same for <dateline>. This corrects what MEP regards as a flaw in the TEI class definitions. At the same time, we'll add an element <closing> to the same classes: this is synonymous with <salute> but intended for closing salutations; MEP will restrict <salute> to opening salutations. We also insert <text> into the class common; its omission is a TEI bug.[5]
< 54 Modify existing element classes (archival and data capture forms) > ≡
<!ENTITY % x.data      'ident | code | kw
| addressee | sender | auth
| place | person | org | ship
| placeRef | personRef | orgRef | shipRef |'
>
<!ENTITY % x.biblpart  'idno |'                             >
<!ENTITY % x.inter     'eg | ps | signed | salute | closing
| dateline | what | enclosure | attestation | endorsement
| docketing |'                                              >
<!ENTITY % x.common    'eg | ps | signed | salute | closing
| dateline | text | what | enclosure | attestation | endorsement
| docketing |'                                              >
<!ENTITY % x.refsys    'plink |'                                >
<!ENTITY % x.chunk     'bobIndex |'                             >
<!ENTITY % x.divbot ''                                          >
<!ENTITY % m.divbot '%x.divbot; address | byline | closer |
           epigraph | trailer'                                     >
<!ENTITY % x.divtop ''                                       >
<!ENTITY % m.divtop '%x.divtop; address | argument | byline |
           docAuthor | docDate | epigraph |
           head | opener'                                       >


<!ENTITY % a.linking '
          corresp            IDREFS              #IMPLIED
          next               IDREF               #IMPLIED
          prev               IDREF               #IMPLIED'      >

This code is used in < mep.ent: MEP extensions.ent file (archival form) 1 > < mepdc.ent: MEP extensions.ent file 2 >

6.2. Typographical Highlighting

For convenience in data capture, we define several elements which are synonyms for <hi> with appropriate attribute values. Their advantage is that they don't require the transcriber or editor to edit the element's attribute values, which is less convenient in some systems than editing based on generic identifiers alone.
< 55 Typographic highlighting (data-capture) > ≡
<![ %L1; [
<!ELEMENT uLine         %om.RR;  %paraContent;                      >
<!ATTLIST uLine
          id                 ID                  #IMPLIED
          n                  CDATA               #IMPLIED
          lang               IDREF               %INHERITED;
          rend               CDATA               #IMPLIED
          TEIform            CDATA               'hi'           >
<!ELEMENT ital          %om.RR;  %paraContent;                  >
<!ATTLIST ital
          id                 ID                  #IMPLIED
          n                  CDATA               #IMPLIED
          lang               IDREF               %INHERITED;
          rend               CDATA               #IMPLIED
          TEIform            CDATA               'hi'           >
<!ELEMENT bold          %om.RR;  %paraContent;                  >
<!ATTLIST bold
          id                 ID                  #IMPLIED
          n                  CDATA               #IMPLIED
          lang               IDREF               %INHERITED;
          rend               CDATA               #IMPLIED
          TEIform            CDATA               'hi'           >
<!ELEMENT super         %om.RR;  %paraContent;                  >
<!ATTLIST super
          id                 ID                  #IMPLIED
          n                  CDATA               #IMPLIED
          lang               IDREF               %INHERITED;
          rend               CDATA               #IMPLIED
          TEIform            CDATA               'hi'           >
<!ELEMENT subscr        %om.RR;  %paraContent;                  >
<!ATTLIST subscr
          id                 ID                  #IMPLIED
          n                  CDATA               #IMPLIED
          lang               IDREF               %INHERITED;
          rend               CDATA               #IMPLIED
          TEIform            CDATA               'hi'           >
<!ELEMENT sCap          %om.RR;  %paraContent;                  >
<!ATTLIST sCap
          id                 ID                  #IMPLIED
          n                  CDATA               #IMPLIED
          lang               IDREF               %INHERITED;
          rend               CDATA               #IMPLIED
          TEIform            CDATA               'hi'           >
]]>

This code is used in < mepdc.dtd: MEP extensions.dtd file (data-capture form) 4 >

These need to be fitted into the element class system.
< 56 Highlighting class (data capture form) > ≡
<!ENTITY % x.highlights '' >
<!ENTITY % m.highlights '%x.highlights; uLine | ital | bold
| super | subscr | sCap' >
<!ENTITY % x.hqphrase '%m.highlights; |' >

This code is used in < mepdc.ent: MEP extensions.ent file 2 >
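For example, a transcriber might key the first form below, which is equivalent to the second. The rend value shown for <hi> is an assumption; this document does not specify which attribute value each synonym stands for.

```xml
<!-- data-capture shorthand -->
<p> ... on board the <ital>Hypothetical</ital> ... </p>

<!-- a TEI equivalent; the rend value is an assumption -->
<p> ... on board the <hi rend="italic">Hypothetical</hi> ... </p>
```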

6.3. Other

Special roles (sender, addressee, etc.) should always be tagged in the heading of a document; in this version of this document, this is done by using <sender> etc. as phrase-level elements and tagging the names where they occur, within a <head> element.
<sender>
the sender of a letter. [It is not clear whether this should be a chunk-level element within the editorial front matter, or a phrase-level element within a <head> element. Ditto for <addressee> and <auth>.]
<addressee>
the intended recipient of a letter or document
<auth>
(i.e. authority) contains phrases like “by order of ...”
The formal declarations are these:
< 57 Phrase-level elements for sender, etc. (archival, data capture) > ≡
<![ %L3; [
<!ELEMENT sender        %om.RR;  %paraContent;                  >
<!ATTLIST sender                 %a.global;                     >
<!ELEMENT addressee     %om.RR;  %paraContent;                  >
<!ATTLIST addressee              %a.global;                     >
<!ELEMENT auth          %om.RR;  %paraContent;                  >
<!ATTLIST auth                   %a.global;                     >
]]>

This code is used in < mep.dtd: MEP extensions.dtd file (archival form) 3 > < mepdc.dtd: MEP extensions.dtd file (data-capture form) 4 >

Experience shows that the <pb> element is needed at the beginning of a document, to show where that document begins in the source edition, even if the document does not begin at the top of a page (and therefore even if there is, strictly speaking, no page break at the beginning of the document). David Chesnutt suggests adding a <pagenum> or even a <page> element to carry the information. This would be cleaner, but it would complicate the operations necessary to decide what page of the edition something is on: one could not just scan left for the nearest preceding <pb>, but would have to look also for <pagenum> elements.
So we simply (mis)use <pb> for this purpose. In principle, we will add rend="nobreak" to indicate the <pb> elements which are unclean in this way. It may be a dirty hack, but it will get the job done.
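A document that begins partway down a page might therefore start as in the sketch below; the page numbers are hypothetical.

```xml
<body>
  <!-- the document begins on page 123, but not at a true page break -->
  <pb n="123" rend="nobreak"/>
  <p>First paragraph ... </p>
  <!-- a genuine page break later in the same document -->
  <pb n="124"/>
  <p>Later paragraph ... </p>
</body>
```

With this convention, finding the page for any point in the text still requires only a scan back to the nearest preceding <pb>.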
An even dirtier hack is apparently needed to make page breaks point to targets with images of the page. We add a <plink> element which is completely synonymous with <pb> but uses a different style, solely to get around the fact that DynaText and DynaWeb handle any element with a script property as a hyperlink (even if the actual action is conditional on some attribute having an appropriate value). This element is declared only in the data-capture DTD; it has no business in the archival DTD. (Nor does it have much business in the data-capture DTD: it belongs in a separate delivery DTD.)
< 58 Define page-link element > ≡
<![ %L3; [
<!ELEMENT plink         %om.RO;  EMPTY                          >
<!ATTLIST plink                  %a.global;
          target                 IDREF               #IMPLIED   >
]]>

This code is used in < mepdc.dtd: MEP extensions.dtd file (data-capture form) 4 >

7. Housekeeping

This section contains some material that has no good place to go, including fixes to problems in the current TEI DTD.

7.1. Bibliographic References

We need to allow quotations inside bibliographic references. Actually, bibliographic references ought to allow all phrase-level elements to occur inside them. For various reasons (mostly involving <note>), this doesn't work as simply as it ought. For now, we use a quick and ugly hack. We need to allow quotations, so we just add <q> to the content model. [Bleagh.]
< 59 Bibliographic references (archival, data capture) > ≡
<![ %L3; [
<!ELEMENT %n.bibl;      %om.RO;  (#PCDATA | %m.phrase; | q |
                                 %m.biblPart; | %m.Incl;)*       >
<!ATTLIST %n.bibl;               %a.global;
                                 %a.declarable;
          TEIform                CDATA               'bibl'     >
]]>

This code is used in < mep.dtd: MEP extensions.dtd file (archival form) 3 > < mepdc.dtd: MEP extensions.dtd file (data-capture form) 4 >
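With <q> admitted, a reference that quotes its source can be tagged directly, as in this sketch (author, title, and quotation are all hypothetical):

```xml
<bibl>Hypothetical Author, <title>Hypothetical Title</title>, which
describes the letter as <q>hastily written</q>.</bibl>
```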

7.2. Text as Inter-level Element

A bug in the TEI DTD (not yet reported, as far as I know) causes the class common not to include everything it should. Among the missing is <text>.

7.3. Selection of TEI Elements

The selection of elements in this version of the MEP DTD is based on that in TEI Lite, though some changes have been made, and others may be made in future. This means that the documentation for TEI Lite may be usefully consulted by users of the MEP DTD.
The selection of elements is probably best considered one tag set at a time. We distinguish elements we are suppressing entirely (for which we define a parameter entity with the value IGNORE) from those we are suppressing merely in order that we can provide a new definition. For the latter, we also define a parameter entity which expands to the keyword IGNORE, but instead of the literal string IGNORE we use a reference to the parameter entity REDEFINE. From SGML's point of view, it's the same thing.
Here goes.
< 60 Preliminaries for TEI.extensions.ent file (archival and data capture forms) > ≡
<!ENTITY % REDEFINE 'IGNORE' >

This code is used in < mep.ent: MEP extensions.ent file (archival form) 1 > < mepdc.ent: MEP extensions.ent file 2 >
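To make the distinction concrete, the two kinds of suppression might be written as in the sketch below. It assumes the TEI convention that each element's declaration is wrapped in a marked section controlled by a parameter entity named after the element. The elements shown (<teiCorpus.2> suppressed outright, <p> and <seg> redeclared by MEP) are taken from elsewhere in this document, but the declarations themselves are illustrative rather than copied from the MEP files.

```xml
<!ENTITY % REDEFINE 'IGNORE' >

<!-- Suppressed entirely: MEP does not use this element. -->
<!ENTITY % teiCorpus.2 'IGNORE' >

<!-- Suppressed only so that MEP can supply its own declaration. -->
<!ENTITY % p   '%REDEFINE;' >
<!ENTITY % seg '%REDEFINE;' >
```

Both forms expand to IGNORE, so a parser treats them identically; the REDEFINE spelling merely records for human readers that a replacement declaration exists elsewhere.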

< 61 Select TEI elements (archival and data capture forms) > ≡
{Select tags from TEI driver file (archival and data-capture) 62}


<!-- ******************************************************** -->
<!-- I.  Core tag sets.                                       -->
<!-- ******************************************************** -->

<!-- Chapter 5:  TEI Header ********************************* -->
{Select tags from TEI header (A & DC) 63}

<!-- Chapter 6:  Elements Available in All TEI Documents **** -->
{Select tags from TEI core tag set (A & DC) 64}

<!-- Chapter 7:  Default Text Structure ********************* -->
{Select tags from default text structure (A & DC) 65}


<!-- ******************************************************** -->
<!-- II.  Base tag sets.                                      -->
<!-- II.A.  DTD files                                         -->
<!-- ******************************************************** -->

<!-- Chapter 8:  Prose * (included) ************************* -->
<!-- File:  TEIPROS2.DTD (no tags) ************************** -->
<!-- Chapter 9:  Verse * (excluded) ************************* -->
<!-- Chapter 10:  Drama * (excluded) ************************ -->
<!-- Chapter 11:  Transcriptions of Speech * (excluded) ***** -->
<!-- Chapter 12:  Print Dictionaries * (excluded) *********** -->
<!-- Chapter 13:  Terminological Data * (excluded) ********** -->
<!-- * Mixed Bases * (excluded) ***************************** -->

<!-- ******************************************************** -->
<!-- III.  Additional tag sets.                               -->
<!-- ******************************************************** -->

<!-- Chapter 14:  Linking, Segmentation, and Alignment ****** -->
{Select tags from tag set for linking and alignment (A & DC) 66}

<!-- Chapter 15:  Simple Analytic Mechanisms **************** -->
{Select tags from tag set for simple analysis (A & DC) 67}

<!-- Chapter 16:  Feature Structures * (excluded) *********** -->
<!-- Chapter 17:  Certainty and Responsibility * (excluded) * -->
<!-- Chapter 18:  Transcription of Primary Sources ********** -->
{Select tags from tag set for transcription of MSS (A & DC) 68}

<!-- Chapter 19:  Critical Apparatus * (excluded) *********** -->
<!-- Chapter 20:  Names and Dates * (excluded) ************** -->
<!-- Chapter 21:  Graphs, Networks, and Trees * (excluded) ** -->
<!-- Chapter 22:  Tables, Formulae, and Graphics ************ -->
{Select tags from tag set for tables and figures (A & DC) 70}

<!-- Chapter 23:  Language Corpora * (excluded) ************* -->

This code is used in < mep.ent: MEP extensions.ent file (archival form) 1 > < mepdc.ent: MEP extensions.ent file 2 >

In the main TEI driver file, we select only the <TEI.2> element, suppressing <teiCorpus.2>:
< 62 Select tags from TEI driver file (archival and data-capture) > ≡
<!-- FILE:  TEI2.DTD -->
<!ENTITY % TEI.2                  '%L3;'   >
<!ENTITY % teiCorpus.2                 'IGNORE' >

This code is used in < Select TEI elements (archival and data capture forms) 61 >

In the header, the selection is as follows:
< 63 Select tags from TEI header (A & DC) > ≡
<!-- File:  TEIHDR2.DTD -->
<!ENTITY % teiHeader              '%L3;' >
<!ENTITY % fileDesc               '%L3;' >
<!ENTITY % titleStmt              '%L3;' >
<!ENTITY % sponsor                '%L3;' > <!-- ? -->
<!ENTITY % funder                 '%L3;' > <!-- ? -->
<!ENTITY % principal              '%L3;' > <!-- ? -->
<!ENTITY % editionStmt            '%L3;' > <!-- ? -->
<!ENTITY % edition                '%L3;' > <!-- ? -->
<!ENTITY % extent                 '%L1;' >
<!ENTITY % publicationStmt        '%L3;' >
<!ENTITY % distributor            '%L3;' >
<!ENTITY % authority              '%L3;' >
<!ENTITY % idno         '%L1;' >
<!ENTITY % availability           '%L3;' > <!-- ? -->
<!ENTITY % seriesStmt             '%L3;' >
<!ENTITY % notesStmt              '%L3;' >
<!ENTITY % sourceDesc   '%L1;' >
<!ENTITY % scriptStmt                  'IGNORE' >
<!ENTITY % recordingStmt               'IGNORE' >
<!ENTITY % recording                   'IGNORE' >
<!ENTITY % equipment                   'IGNORE' >
<!ENTITY % broadcast                   'IGNORE' >
<!ENTITY % encodingDesc           '%L3;' >
<!ENTITY % projectDesc            '%L3;' >
<!ENTITY % samplingDecl           '%L3;' >
<!ENTITY % editorialDecl          '%L3;' >
<!ENTITY % correction             '%L3;' > <!-- ? -->
<!ENTITY % normalization          '%L3;' > <!-- ? -->
<!ENTITY % quotation              '%L3;' > <!-- ? -->
<!ENTITY % hyphenation            '%L3;' > <!-- ? -->
<!ENTITY % segmentation           '%L3;' > <!-- ? -->
<!ENTITY % stdVals                '%L3;' > <!-- ? -->
<!ENTITY % interpretation         '%L3;' > <!-- ? -->
<!ENTITY % tagsDecl               '%L3;' >
<!ENTITY % tagUsage               '%L3;' >
<!ENTITY % rendition              '%L3;' >
<!ENTITY % refsDecl               '%L3;' >
<!ENTITY % step                        'IGNORE' > <!-- ? -->
<!ENTITY % state                       'IGNORE' >
<!ENTITY % classDecl              '%L3;' >
<!ENTITY % taxonomy               '%L3;' >
<!ENTITY % category               '%L3;' >
<!ENTITY % catDesc                '%L3;' >
<!ENTITY % fsdDecl                     'IGNORE' >
<!ENTITY % metDecl                     'IGNORE' >
<!ENTITY % symbol                      'IGNORE' >
<!ENTITY % variantEncoding             'IGNORE' >
<!ENTITY % profileDesc            '%L3;' >
<!ENTITY % creation               '%L3;' >
<!ENTITY % langUsage              '%L3;' >
<!ENTITY % language               '%L3;' >
<!ENTITY % textClass              '%L3;' >
<!ENTITY % keywords               '%L3;' >
<!ENTITY % classCode              '%L3;' >
<!ENTITY % catRef                 '%L3;' >
<!ENTITY % revisionDesc           '%L3;' >
<!ENTITY % change                 '%L3;' >

This code is used in < Select TEI elements (archival and data capture forms) 61 >

In the TEI core, the selection is as follows:
< 64 Select tags from TEI core tag set (A & DC) > ≡
<!-- File:  TEICORE2.DTD -->
<!ENTITY % p                                       '%REDEFINE;' >
<!ENTITY % foreign                '%L3;' >
<!ENTITY % emph                   '%L3;' >
<!ENTITY % hi           '%L1;' >
<!ENTITY % distinct               '%L3;' >
<!ENTITY % q            '%L1;' >
<!ENTITY % quote        '%L1;' >
<!ENTITY % cit                    '%L3;' >
<!ENTITY % soCalled               '%L3;' >
<!ENTITY % term                   '%L3;' >
<!ENTITY % mentioned              '%L3;' >
<!ENTITY % gloss                  '%L3;' >
<!ENTITY % name                   '%L3;' >
<!ENTITY % rs                     '%L3;' >
<!ENTITY % num                    '%L3;' >
<!ENTITY % measure                     'IGNORE' >
<!ENTITY % date                   '%L1;' >
<!ENTITY % dateRange              '%L1;' >
<!ENTITY % time                   '%L3;' >
<!ENTITY % timeRange                   'IGNORE' >
<!ENTITY % abbr                   '%L3;' >
<!ENTITY % expan                       'IGNORE' > <!-- ? -->
<!ENTITY % sic                    '%L3;' >
<!ENTITY % corr                   '%L3;' >
<!ENTITY % reg                    '%L3;' > <!-- ? -->
<!ENTITY % orig                   '%L3;' > <!-- ? -->
<!ENTITY % gap          '%L1;' >
<!ENTITY % add          '%L1;' > <!-- not L3 19990605 -->
<!ENTITY % del          '%L1;' > <!-- not L3 19990605 -->
<!ENTITY % unclear                '%L3;' >
<!ENTITY % address      '%L1;' >
<!ENTITY % addrLine     '%L1;' >
<!ENTITY % street                      'IGNORE' >
<!ENTITY % postCode                    'IGNORE' >
<!ENTITY % postBox                     'IGNORE' >
<!ENTITY % ptr               '%L2;' >
<!ENTITY % ref               '%L2;' >
<!ENTITY % list         '%L1;' >
<!ENTITY % item         '%L1;' >
<!ENTITY % label        '%L1;' >
<!ENTITY % head         '%L1;' >
<!ENTITY % headLabel                   'IGNORE' >
<!ENTITY % headItem                    'IGNORE' >
<!ENTITY % note         '%L1;' >
<!ENTITY % index                  '%L3;' >
<!ENTITY % divGen                 '%L3;' >
<!ENTITY % milestone    '%L1;' >
<!ENTITY % pb           '%L1;' >
<!ENTITY % lb           '%L1;' >
<!ENTITY % cb                          'IGNORE' >
<!ENTITY % bibl                                    '%REDEFINE;'>
<!ENTITY % biblStruct                  'IGNORE' >
<!ENTITY % biblFull               '%L3;' >
<!ENTITY % listBibl               '%L3;' >
<!ENTITY % analytic                    'IGNORE' >
<!ENTITY % monogr                      'IGNORE' >
<!ENTITY % series                      'IGNORE' >
<!ENTITY % author                 '%L3;' >
<!ENTITY % editor                 '%L3;' >
<!ENTITY % respStmt               '%L3;' >
<!ENTITY % resp                   '%L3;' >
<!ENTITY % title                  '%L1;' >
<!ENTITY % meeting                     'IGNORE' > <!-- ? -->
<!ENTITY % imprint                '%L3;' >
<!ENTITY % publisher              '%L3;' >
<!ENTITY % biblScope              '%L3;' >
<!ENTITY % pubPlace               '%L3;' >
<!ENTITY % l            '%L1;' >
<!ENTITY % lg           '%L1;' >
<!ENTITY % sp                     '%L3;' >
<!ENTITY % speaker                '%L3;' >
<!ENTITY % stage                  '%L3;' >

This code is used in < Select TEI elements (archival and data capture forms) 61 >

< 65 Select tags from default text structure (A & DC) > ≡
<!-- File:  TEISTR2.DTD -->
<!ENTITY % text         '%L1;' >
<!ENTITY % body         '%L1;' >
<!ENTITY % group                                   '%REDEFINE;' >
<!ENTITY % div          '%L1;' >
<!ENTITY % div0         '%L1;' >
<!ENTITY % div1         '%L1;' >
<!ENTITY % div2         '%L1;' >
<!ENTITY % div3         '%L1;' >
<!ENTITY % div4         '%L1;' >
<!ENTITY % div5         '%L1;' >
<!ENTITY % div6         '%L1;' >
<!ENTITY % div7         '%L1;' >
<!ENTITY % trailer                '%L3;' >
<!ENTITY % byline                 '%L3;' >
<!ENTITY % dateline                                '%REDEFINE;' >
<!ENTITY % argument               '%L3;' >
<!ENTITY % epigraph               '%L3;' >
<!ENTITY % opener                 '%L3;' >
<!ENTITY % closer                 '%L3;' >
<!ENTITY % salute       '%L1;' >
<!ENTITY % signed       '%L1;' >

<!-- File:  TEIFRON2.DTD -->
<!ENTITY % front                  '%L3;' >
<!ENTITY % titlePage              '%L3;' >
<!ENTITY % docTitle               '%L3;' >
<!ENTITY % titlePart              '%L3;' >
<!ENTITY % docAuthor              '%L3;' >
<!ENTITY % imprimatur                  'IGNORE' > <!-- ? -->
<!ENTITY % docEdition             '%L3;' >
<!ENTITY % docImprint             '%L3;' >
<!ENTITY % docDate                '%L3;' >

<!-- File:  TEIBACK2.DTD -->
<!ENTITY % back         '%L1;' >


This code is used in < Select TEI elements (archival and data capture forms) 61 >

< 66 Select tags from tag set for linking and alignment (A & DC) > ≡
<!-- File:  TEILINK2.ENT -->
<!-- File:  TEILINK2.DTD -->
<!ENTITY % link                        'IGNORE' > <!-- ? -->
<!ENTITY % linkGrp                     'IGNORE' > <!-- ? -->
<!ENTITY % xref              '%L2;' >
<!ENTITY % xptr              '%L2;' >
<!ENTITY % seg                                     '%REDEFINE;' >
<!ENTITY % anchor            '%L2;' >
<!ENTITY % when                        'IGNORE' >
<!ENTITY % timeline                    'IGNORE' >
<!ENTITY % join                        'IGNORE' >
<!ENTITY % joinGrp                     'IGNORE' >
<!ENTITY % alt                         'IGNORE' >
<!ENTITY % altGrp                      'IGNORE' >

This code is used in < Select TEI elements (archival and data capture forms) 61 >

< 67 Select tags from tag set for simple analysis (A & DC) > ≡
<!-- File:  TEIANA2.ENT -->
<!-- File:  TEIANA2.DTD -->
<!ENTITY % span                        'IGNORE' > <!-- ? -->
<!ENTITY % spanGrp                     'IGNORE' >
<!ENTITY % interp                      'IGNORE' >
<!ENTITY % interpGrp                   'IGNORE' >
<!ENTITY % s                      '%L3;' >
<!ENTITY % cl                          'IGNORE' >
<!ENTITY % phr                         'IGNORE' >
<!ENTITY % w                           'IGNORE' >
<!ENTITY % m                           'IGNORE' >
<!ENTITY % c                           'IGNORE' >

This code is used in < Select TEI elements (archival and data capture forms) 61 >

In the tag set for transcription, some items are level 1 and some are level 3. For the <restore> element, we need to supply a definition for the a.readings parameter entity.[6]
< 68 Select tags from tag set for transcription of MSS (A & DC) > ≡
<!-- default:  include all tags from transcription tag set -->
<!ENTITY % addSpan      '%L1;' >
<!ENTITY % delSpan      '%L1;' >
<!ENTITY % restore                '%L3;' >
<!ENTITY % supplied               '%L3;' >
<!ENTITY % hand                   '%L3;' >
<!ENTITY % handShift              '%L3;' >
<!ENTITY % handList               '%L3;' >
<!ENTITY % damage       '%L1;' >
<!ENTITY % space                  '%L3;' >
<!ENTITY % fw                     '%L3;' >

<!ENTITY % INHERITED '#IMPLIED'>
<!ENTITY % a.readings '
      wit CDATA #IMPLIED
      type CDATA #IMPLIED
      cause CDATA #IMPLIED
      varSeq CDATA #IMPLIED
      resp CDATA %INHERITED;
      hand IDREF %INHERITED;'> 

This code is used in < Select TEI elements (archival and data capture forms) 61 >
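
Why the definition is needed can be seen in a reduced sketch. It assumes, as the text above implies, that the attribute list for <restore> refers to the a.readings parameter entity; the content model and the shortened entity value are simplified placeholders, not the TEI declarations.

```xml
<!-- Supplied by the extensions file (shortened here) -->
<!ENTITY % INHERITED '#IMPLIED'>
<!ENTITY % a.readings '
      resp CDATA %INHERITED;
      hand IDREF %INHERITED;'>

<!-- Simplified stand-in for the declaration in the transcription
     tag set; the reference below resolves only because a.readings
     has been declared above. -->
<!ELEMENT restore (#PCDATA) >
<!ATTLIST restore %a.readings; >
```

If the entity declaration were removed, the reference to a.readings in the attribute list would be a reference to an undeclared entity, and the DTD would fail to parse.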

The tag set for text-critical apparatus turns out not to be as useful as originally thought, so it has (as of version 2.0 of the DTD) been suppressed.
< 69 Select tags from tag set for text-critical apparatus (A & DC) > ≡
<!-- default:  include all tags from text-critical tag set -->
<!ENTITY % app          'INCLUDE' >
<!ENTITY % lem          'INCLUDE' >
<!ENTITY % rdg          'INCLUDE' >
<!ENTITY % rdgGrp       'INCLUDE' >
<!ENTITY % witDetail    'INCLUDE' >
<!ENTITY % wit          'INCLUDE' >
<!ENTITY % witList      'INCLUDE' >
<!ENTITY % witness      'INCLUDE' >
<!ENTITY % witStart     'INCLUDE' >
<!ENTITY % witEnd       'INCLUDE' >
<!ENTITY % lacunaStart  'INCLUDE' >
<!ENTITY % lacunaEnd    'INCLUDE' >

This code is not used elsewhere.

< 70 Select tags from tag set for tables and figures (A & DC) > ≡
<!-- File:  TEIFIG2.ENT -->
<!ENTITY % formulaNotations 'CDATA'                             >
<!ENTITY % formulaContent '(#PCDATA)'                           >

<!-- File:  TEIFIG2.DTD -->
<!ENTITY % table        '%L1;' >
<!ENTITY % row          '%L1;' >
<!ENTITY % cell         '%L1;' >
<!ENTITY % formula      '%L1;' >
<!ENTITY % figure       '%L1;' >
<!ENTITY % figDesc      '%L1;' >

This code is used in < Select TEI elements (archival and data capture forms) 61 >


A. Orphaned declarations

These declarations must belong somewhere; I just haven't found the place yet.

B. An Extended Example with Multiple Enclosures

As an example of a relatively complex document, we encode here a letter from Henry Laurens to his son John Laurens, enclosing copies of a correspondence Henry had had with Governor Lord William Campbell and with Campbell's secretary, Alexander Innes. The document appears on pp. 319-335 of volume 10 of the Papers of Henry Laurens. [David - is this the right volume number?]
The overall structure of the document is the usual:

<document>
<teiHeader> ... </teiHeader>
<edFront> ... </edFront>
<text> ... </text>
<edBack> ... </edBack>
</document>

or, in the data-capture format:

<document>
<teiHeader> ... </teiHeader>
[miscellaneous editorial front matter ... ]
<docBody> ... </docBody>
[miscellaneous editorial back matter ... ]
</document>

In what follows, only the data-capture form will be given.
The editorial front matter is simple:

<head>To <addressee>John Laurens</addressee></head>
<headnote>In the letter below, HL mentioned ...
During the intervening years the original missive
to John disappeared but the letter has been preserved
in an HL letterbook.  The enclosure, detached
from the letter, was divided into two sections.
</headnote>
<dateline>Charles Town, August 20, 1775</dateline>

[complete this example.]

C. Open Questions

On a number of questions, this version of this document reflects arbitrary decisions made in order to have a functioning DTD for use in experimentation, which need to be considered with more care. This section lists some of these questions. [It should list all of them, but until we have a working DTD it will be hard to walk through the markup scheme slowly and systematically enough to be sure we have got all the questions.]
Open questions include:
  • What elements are needed as possible content for the editorial front and back matter, and for the beginning, middle, and end of letters?
  • Should senders and addressees be identified by chunk-level elements within editorial front matter, or by phrase-level elements within headings? (discussion below)
  • When information left implicit in an edition is made explicit in the electronic form (e.g. when the sender is explicitly identified as Henry Laurens), how should the information be tagged?
  • Should the MEP DTD select the additional tag set for detailed markup of names and dates?
Some of these are discussed in the following sections.
To-do list:
  • ensure that all elements have a proper TEIform attribute

Senders and Addressees

How should sender (for letters), document-author (for non-letters), addressee, and authority be treated? Should they be free-standing components within editorial front matter? Or should they be phrase-level elements within <head> elements?
Concretely, the first approach (free-standing components) would result in editorial front matter like this:

<docAuthor>James Madison</docAuthor>
<docTitle>Advice on Executing the Residence Act</docTitle>


<sender>The President</sender>
<addressee>The Secretary of State</addressee>


<addressee>John Laurens</addressee>

and the rendering engine would be responsible for displaying these with the proper connective prepositions, as

<head>James Madison's Advice on Executing the Residence Act</head>
 ...
<head>The President to the Secretary of State</head>
 ...
<head rend='caps'>To John Laurens</head>

This approach has the advantage of simplifying the task of search engines which must find documents written by certain individuals, or addressed to them, because it makes (in this respect) the editorial front matter more like a simple database record with standard fields. It has the disadvantage of requiring the rendering software to perform rather sophisticated transformations on the information, such as supplying the apostrophe-plus-s in the first example, or providing the ‘to’ and lowercasing the second ‘the’ in the second example. (The third example is less strenuous, because the Laurens Papers tend to use simpler headings.)
The second approach attempts to preserve the details of existing headings, and give editors complete control over the formulation of headings in new editions, by making the encoding less database-like and more text-like. Retrieval is complicated by the increased complexity of the markup, but individuals are (or can be) identified as senders, addressees, etc. within the free text of the heading:

<head><docAuthor>James Madison</docAuthor>'s
Advice on Executing the Residence Act</head>


<head>
<sender>The President</sender>
to <addressee>the Secretary of State</addressee>
</head>


<head>To <addressee>John Laurens</addressee></head>

The chief disadvantage of this second approach is that it requires more complicated tagging. For purposes of data entry, therefore, it might be preferable to omit the identification of individual roles, and tag the headings more simply:

<head>James Madison's Advice on Executing the Residence Act</head>
 ...
<head>The President to the Secretary of State</head>
 ...
<head rend='caps'>To John Laurens</head>

Ad hoc heading-analysis software could be written to ‘parse’ headings like these into their component parts, though it would be wise to check all results carefully, and manual intervention would be necessary on at least some headings.
[On balance, it seems that the second approach is preferable for both legacy data and the preparation of new materials, and the DTD in this version of this document is defined to use this type of tagging.]
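
As a rough illustration of what this choice implies for the DTD, the sketch below declares <head> with mixed content, so that role elements may appear within the free text of a heading. The content models are reduced, hypothetical stand-ins chosen for brevity; they are not the MEP declarations themselves.

```xml
<?xml version="1.0"?>
<!-- Hypothetical, reduced declarations: not the MEP content models -->
<!DOCTYPE head [
<!ELEMENT head      (#PCDATA | sender | addressee | docAuthor)* >
<!ATTLIST head      rend CDATA #IMPLIED >
<!ELEMENT sender    (#PCDATA) >
<!ELEMENT addressee (#PCDATA) >
<!ELEMENT docAuthor (#PCDATA) >
]>
<head><sender>The President</sender>
to <addressee>the Secretary of State</addressee></head>
```

Because the content is mixed, the more simply tagged data-entry form, such as <head rend='caps'>To John Laurens</head>, is valid against the same declaration.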

Normalized and Implicit Information

In the Laurens Papers, the heading of a document gives both sender and addressee only when neither is Laurens himself. This poses no serious problems in print, and only small problems for retrieval in an electronic Laurens Papers searched in isolation. Searches for documents written by Henry Laurens simply become searches for documents without <sender> or <docAuthor> elements. When several editions are to be searched at once, however, this useful concision leads to ambiguity.
To avoid the ambiguity and allow easier cross-edition searching, it would be useful to supply explicitly information like

<sender>Henry Laurens</sender>

which is implicit in the edition as it stands. To preserve the normal display of the edition, however, it is necessary to indicate that this <sender> element is not to be displayed.
How is this best accomplished?

Notes

[1] SGML tag omission could be used, but yacc parsers are able to recognize structure more simply and more reliably than SGML parsers relying on tag omission, except perhaps in the case of certain virtuosos of SGML tag omission, who may find omittag and shortref more useful than the current authors.
[2] The Papers of Thomas Jefferson, Volume 19: 24 January to 31 March 1791, ed. Julian P. Boyd, Ruth W. Lester, Assistant Editor (Princeton: Princeton University Press, 1974). Further citations to the Jefferson Papers are all to this volume. The actual ‘table of contents’ to the volume is an alphabetical list of correspondents and topics which resembles an abbreviated index.
[3] The Papers of Henry Laurens, Volume 11: Jan. 5, 1776 - Nov. 1, 1777, ed. David R. Chesnutt et al. (Columbia, S. C.: for the South Carolina Historical Society by the University of South Carolina Press, 1988).
[4] The example is drawn from Mark Twain's Letters, Vol. 5, 1872-1873, ed. Lin Salamo and Harriet Elinor Smith (Berkeley: University of California Press, 1997), p. 887.
[5] These modifications were specified in relation to the P3 DTDs; they are still necessary with the P4 DTDs.
[6] This appears to be a glitch in the P4 beta TEI driver file; after it's fixed, the definition of a.readings can go away.