# Data Entry Specifications for XML texts
# Current version
The current version, `DESpecs-XML.tex`, contains the specs for creating the body of XML documents according to the TEI P5 Guidelines. The TEI header and the facsimile part are left out, since they can be attached during postprocessing and filled with the appropriate metadata. See the `body2tei` script in the `postprocessing` directory.

The data entry company uses the attached DTD file (in the `schema` directory) for validation. For internal purposes, there is also a RelaxNG file (converted from the DTD). An ODD file is not yet available.
# Schema
## Data entry
The schema is maintained in the DTD file `schema/mpiwg-erfassung.dtd`. A DTD is used only because the data entry company requires it. Internally, the RelaxNG versions are preferred.

The file `mpiwg-erfassung.rnc` is a direct conversion of the DTD file, produced by running

    trang mpiwg-erfassung.dtd mpiwg-erfassung.rnc

With this file, the texts returned by the data entry company can be checked before the postprocessing steps are performed (details below).
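
As a minimal sketch of such a check, the loop below validates each returned file against the RNC file. It assumes that the `jing` validator is installed (its `-c` option selects the RelaxNG compact syntax). The function name `check_returned_texts` and the `JING` override variable are hypothetical and not part of this repository.

```shell
#!/bin/sh
# Sketch only: validate returned XML files against a RelaxNG Compact schema.
# Assumption: jing is on the PATH; set JING to override the command,
# for example JING="java -jar jing.jar".

check_returned_texts() {
    schema=$1
    shift
    failed=0
    for file in "$@"; do
        # jing prints its own error messages for invalid files.
        if ${JING:-jing} -c "$schema" "$file"; then
            echo "valid:   $file"
        else
            echo "INVALID: $file"
            failed=$((failed + 1))
        fi
    done
    # Return a non-zero status if at least one file failed validation.
    [ "$failed" -eq 0 ]
}
```

A call could look like `check_returned_texts schema/mpiwg-erfassung.rnc returned/*.xml`, where `returned/` is a placeholder for wherever the delivered files are stored.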
## Full TEI schema
The file `schema/mpiwg_tei_schema.rnc` uses the RNC file for data entry as a module. However, slight changes have to be made in order to successfully validate postprocessed files:

1. The `div` structure has been introduced; that is, the `body` element has only `div` children.
2. The `pb` element gets the attribute `facs` for linking purposes.
3. The `start` statement has to be removed, since the file is called as a submodule.
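
The third point is easy to forget after a fresh conversion. The following sketch reports any `start` definition that is still present in a given RNC file. The function name `has_no_start_statement` is hypothetical, and which file it should be run on is an assumption left to the caller.

```shell
#!/bin/sh
# Sketch only: succeed if the given RNC file defines no start pattern.
# Matches "start =", "start |=" and "start &=" at the beginning of a line.

has_no_start_statement() {
    module=$1
    # grep -n prints the offending lines with their line numbers.
    if grep -nE '^[[:space:]]*start[[:space:]]*[|&]?=' "$module"; then
        echo "start statement still present in $module" >&2
        return 1
    fi
    return 0
}
```

The check is purely textual, so it only finds the line on which a `start` definition begins. The definition itself may span several lines and still has to be removed by hand.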
## Incorporate changes made in the DTD
If changes to the main DTD are necessary, the workflow documented in the sections above has to be re-applied.
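
The first step of that workflow, the conversion with `trang`, could be wrapped as in the sketch below. The function name `regenerate_data_entry_rnc` and the `TRANG` override variable are hypothetical. The manual changes for the full TEI schema are not automated here and still have to be made afterwards.

```shell
#!/bin/sh
# Sketch only: re-run the DTD to RNC conversion after the DTD has changed.
# Assumption: trang is on the PATH; set TRANG to override the command.

regenerate_data_entry_rnc() {
    dtd=$1
    # Derive the output name by swapping the .dtd suffix for .rnc.
    rnc="${dtd%.dtd}.rnc"
    if [ ! -f "$dtd" ]; then
        echo "DTD not found: $dtd" >&2
        return 1
    fi
    ${TRANG:-trang} "$dtd" "$rnc" || return 1
    echo "regenerated $rnc"
    echo "next: re-apply the manual changes for the full TEI schema"
}
```

Run from the repository root, this would be `regenerate_data_entry_rnc schema/mpiwg-erfassung.dtd`.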
# Scripts
The following scripts exist for postprocessing.
## `body2tei`
Converts text bodies typed according to the conventions of the XML DE Specs into TEI documents. Specifically, it adds the header and the facsimile part (using data from `index.meta`) and introduces a `div` structure in a brute-force manner.
## `echotei2latex.xsl`
Converts a TEI document to LaTeX, producing a side-by-side view of facsimile and transcription.
# Previous version
The previous version, `DESpecs.tex`, contains specs for creating an XML-like markup, which was later converted to proper XML through a workflow. The use of these instructions is discouraged; the file is kept here only for archival purposes. There is also a legacy version for Chinese texts, `DESpecs_chinese.tex`, which would have to be translated to a TEI version if needed.