# EOASkripts
This is a set of scripts which forms the central part of the
conversion workflow in the Edition Open Access.
We currently accept and support manuscripts in two different formats:
LaTeX and DocX (as used in Microsoft Word).
![The EOA workflow](doc/eoa_intermediate_workflow.png)
## The LaTeX workflow
The LaTeX workflow is based on a reduced set of LaTeX commands which
are defined in a preamble and help keep the book production
workflow consistent. A sample project is found at
<https://github.molgen.mpg.de/EditionOpenAccess/eoa_sample_project>.
The PDF version is created directly with `xelatex`.
For the creation of the other formats,
[tralics](https://www-sop.inria.fr/marelle/tralics/) is used to
convert the TeX source to XML. The original DocBook output is enriched
by various EOA-specific elements.
This intermediate XML file is subsequently used by three additional
programs which turn it into TEI-XML, EPUB and Django-XML,
respectively. The Django-XML format is ingested into the database of
the EOA site where it will show up as an online publication.
The EPUB files can be put together to form an ebook. The script
[data/misc/epub.sh](https://github.molgen.mpg.de/EditionOpenAccess/EOASkripts/blob/master/data/misc/epub.sh) performs the required steps.
The conversion to TEI is still work in progress.
## The DocX workflow
This workflow is based on Microsoft Word documents which are created
following the guidelines of a template found at
<http://edition-open-access.de/media/support/files/EOA_Word_Template.docx>.
Currently, the webservice at <http://www.tei-c.org/oxgarage/#> is used
to convert it into TEI P5.
As in the LaTeX workflow, we require the authors to hand in their
bibliographic references in a database format, such as BibTeX. The
Word template explains in detail how citations should be entered.
The script [fix_tei](https://github.molgen.mpg.de/EditionOpenAccess/EOASkripts/blob/master/fix_tei.py) corrects some artifacts of the oxgarage
conversion and expands the shorthand codes for references and figures
to XML tags.
After that, a PDF document can be obtained by using an XSL script to
create a LaTeX file, or the TEI file can be converted into the
customized DocBook format from the LaTeX workflow above so that the
existing tools can be used.
See [doc/XSL.md](https://github.molgen.mpg.de/EditionOpenAccess/EOASkripts/blob/master/doc/XSL.md) for documentation of the XSL workflow.
## Example workflow
To install the whole toolchain, clone at least this repository as well
as the 'advanced' branch of
[EOA sample project](https://github.molgen.mpg.de/EditionOpenAccess/eoa_sample_project).
Follow the installation instructions in [INSTALL.md](https://github.molgen.mpg.de/EditionOpenAccess/EOASkripts/blob/master/INSTALL.md).
In `eoa_sample_project`, run `xelatex`, then `biber` (the version
included in your TeX distribution), and then `xelatex` two more times.
This will give you the PDF version of the document.
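
The PDF build sequence above can be sketched in Python as follows. This is only an illustration, not part of EOASkripts: the helper names `pdf_build_commands` and `build_pdf` are hypothetical, and the base name `EOASample` is assumed from the `EOASample.tex` file mentioned in this README. No extra command-line flags are assumed.

```python
import subprocess


def pdf_build_commands(basename="EOASample"):
    """Return the command sequence: xelatex, biber, then xelatex twice more."""
    latex = ["xelatex", basename + ".tex"]
    # biber is called with the base name of the document
    return [list(latex), ["biber", basename], list(latex), list(latex)]


def build_pdf(project_dir, basename="EOASample"):
    """Run the four commands in order inside project_dir, stopping on the first failure."""
    for command in pdf_build_commands(basename):
        subprocess.run(command, cwd=project_dir, check=True)
```

Calling `build_pdf("eoa_sample_project")` would then run the four steps in the cloned sample project, assuming `xelatex` and `biber` are on the `PATH`.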
Next, comment out line 9 in `EOASample.tex` (the EOA preamble),
uncomment line 10 (the XML preamble) and run `eoaconvert.py`:

    eoaconvert.py -f EOASample

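
The preamble switch described above is normally done by hand in an editor. If it needs to be scripted, a minimal sketch could look like this. The function name `switch_preamble` is hypothetical, and the sketch assumes that the two lines are toggled with a leading `%` (the LaTeX comment character) and that the line numbers 9 and 10 still match your copy of `EOASample.tex`.

```python
def switch_preamble(lines, comment_out=9, uncomment=10):
    """Comment out one line and uncomment another (1-based line numbers)."""
    result = list(lines)

    # Comment out the EOA preamble line unless it is already commented.
    i = comment_out - 1
    if not result[i].lstrip().startswith("%"):
        result[i] = "%" + result[i]

    # Uncomment the XML preamble line by dropping its first '%'.
    j = uncomment - 1
    if result[j].lstrip().startswith("%"):
        result[j] = result[j].replace("%", "", 1)

    return result
```

The function works on a list of lines, so reading and writing the file is left to the caller. Check the result before running `eoaconvert.py`, since the line numbers may differ between versions of the sample project.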
If everything went well, you can also run

    tralics2django.py
    tralics2epub.py
    tralics2tei.py

These scripts don't take any arguments and will produce output in the
`CONVERT` directory.
## External dependencies
See [INSTALL.md](https://github.molgen.mpg.de/EditionOpenAccess/EOASkripts/blob/master/INSTALL.md) for details.
- lxml (<https://pypi.org/project/lxml/>)
- bibtexparser (<https://pypi.org/project/bibtexparser/>)
- BeautifulSoup (<https://pypi.org/project/bs4/>)
- pandoc (<https://pandoc.org/>)
- pandoc-citeproc (<https://hackage.haskell.org/package/pandoc-citeproc>)