After you have generated a new Process file, you should check that the document
adheres to the DEEP Process specification, which is defined in the XML schema file template/deep_process_schema.xsd.
To achieve that with little to no extra effort, you can run the small Python validation script mdvalid.py.
The script itself requires no installation or configuration and works with Python 2.7+ or Python 3.2+. It uses the lxml package, which is not part of the Python standard library, so please install lxml in whatever way suits your software environment. For your convenience, we provide two YAML files that install a complete environment for Python 2 or Python 3 (whichever you prefer) using the Conda package manager.
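Before running the script, it can help to confirm that lxml is visible to the interpreter you intend to use. The following is a minimal sketch, not part of mdvalid.py; the helper name module_available is hypothetical, and importlib.util.find_spec requires Python 3.4 or newer, so on older interpreters a plain "import lxml" in a try/except block serves the same purpose.

```python
import importlib.util


def module_available(name):
    """Return True if the top-level module `name` can be found by this interpreter.

    Hypothetical helper for illustration; it only locates the module
    and does not import it.
    """
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    if module_available("lxml"):
        print("lxml found")
    else:
        print("lxml is missing; install it first, for example: pip install lxml")
```

Run it with the same Python interpreter (or inside the same Conda environment) that you will use for mdvalid.py, since each environment has its own set of installed packages.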
Note from 2016-12-29:
Due to dependency issues between the lxml package and Conda, lxml currently has to be installed
with pip from within the activated Conda environment. That is why, in the Conda YAML files,
the line - lxml=3.*
is commented out; see also Github issue 4093.
In case you want to use the Conda package manager, you can create the environment as follows:
$ conda env create -f conda_py3_mdvalid.yml
## wait for setup to complete
$ source activate py3valid
## next line: workaround solution - 2016-12-29
$ pip install lxml
## wait for installation to complete
## run the script; see below
$ source deactivate
The script can be executed on a command line as follows:
$ python mdvalid.py [Options]
Please read the help for details:
$ python mdvalid.py --help
The Python code has been developed on a Debian Linux system:
- x86_64 Linux (Debian 7.5 "Wheezy")
with Python versions
- Python 2.7.3 (lxml.etree 2.3.2)
- Python 3.2.3 (lxml.etree 2.3.2)
Note that this testing setup refers to the released version. lxml 3.6.4, the current version at the time of writing, works as well.
Validate a process XML:
$ python mdvalid.py --schema schema_file.xsd --process process_file.xml
Validate a process XML and a corresponding analysis metadata file:
$ python mdvalid.py --schema schema_file.xsd --process process_file.xml --analysis analysis_file.amd.tsv
Validate many analyses of the same type:
$ python mdvalid.py --schema schema_file.xsd --process process_file.xml --analysis file1.amd.tsv file2.amd.tsv file3.amd.tsv
In case of errors, you can get more information by running the script in debug mode:
$ python mdvalid.py --debug [Options]
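If you validate many files regularly, the command lines shown above can be assembled from a small wrapper. This is a sketch under stated assumptions: the function name build_mdvalid_command is hypothetical, only the options documented above (--schema, --process, --analysis, --debug) are used, and mdvalid.py is assumed to be in the current working directory. It only builds and prints the command and makes no assumption about the script's output or exit code.

```python
import sys


def build_mdvalid_command(schema, process, analyses=(), debug=False,
                          script="mdvalid.py"):
    """Build the argument list for one mdvalid.py call.

    Hypothetical helper: it uses only the options documented in this
    README and assumes `script` points to mdvalid.py.
    """
    cmd = [sys.executable, script]
    if debug:
        # Debug mode, as in: python mdvalid.py --debug [Options]
        cmd.append("--debug")
    cmd += ["--schema", schema, "--process", process]
    if analyses:
        # All analysis metadata files follow a single --analysis option
        cmd.append("--analysis")
        cmd.extend(analyses)
    return cmd


if __name__ == "__main__":
    command = build_mdvalid_command(
        "schema_file.xsd",
        "process_file.xml",
        analyses=["file1.amd.tsv", "file2.amd.tsv", "file3.amd.tsv"],
    )
    # Print the command; pass the list to subprocess.call() to execute it
    print(" ".join(command))
```

Returning a list rather than a single string avoids shell quoting problems when file names contain spaces.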