# Validation of Metadata Documents
## Purpose
After generating a new *Process* file, you should check that the document
adheres to the DEEP *Process* specification that is defined in the XML schema file `template/deep_process_schema.xsd`.
To achieve that with little to no extra effort, you can run the small Python validation script `mdvalid.py`.
## Installation
The script itself requires no installation or configuration and works with Python 2.7+ or Python 3.2+.
It uses the *lxml* package, which is not part of the Python standard library.
Please install [lxml](http://lxml.de/installation.html) according to your software environment.
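To confirm that *lxml* is importable from the interpreter you plan to use, a quick check along the following lines may help. This is only a sketch: `lxml_version` is a hypothetical helper name and is not part of `mdvalid.py`.

```python
def lxml_version():
    """Return the installed lxml version as a string, or None if lxml is missing.

    Hypothetical helper for illustration; not part of mdvalid.py.
    """
    try:
        from lxml import etree
    except ImportError:
        return None
    # LXML_VERSION is a tuple of integers, e.g. (3, 6, 4, 0)
    return ".".join(str(part) for part in etree.LXML_VERSION)


version = lxml_version()
if version is None:
    print("lxml is not installed in this environment")
else:
    print("lxml version: " + version)
```

Run it with the same `python` executable that will later run `mdvalid.py`, since each environment has its own set of installed packages.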
For your convenience, we provide two YAML files to install a complete
environment for Python2 or Python3 (whichever you prefer)
using the [Conda](http://conda.pydata.org/miniconda.html) package manager.
---
#### Attention
Note from: 2016-12-29
Due to dependency issues between the *lxml* package and Conda, *lxml* currently has to be installed with `pip`
from within the activated Conda environment. That is why the line `- lxml=3.*` is commented out
in the Conda YAML files; see also [GitHub issue 4093](https://github.com/conda/conda/issues/4093).
---
In case you want to use the Conda package manager, you can create the environment as follows:
```bash
$ conda env create -f conda_py3_mdvalid.yml
## wait for setup to complete
$ source activate py3valid
## next line: workaround solution - 2016-12-29
$ pip install lxml
## wait for installation to complete
## run the script; see below
$ source deactivate
```
## Execution
The script can be executed from the command line as follows:
```bash
$ python mdvalid.py [Options]
```
Please read the help for details:
```bash
$ python mdvalid.py --help
```
The Python code has been developed on a Debian Linux system:

* x86_64 Linux (Debian 7.5 "Wheezy")

with Python versions

* Python 2.7.3 (lxml.etree 2.3.2)
* Python 3.2.3 (lxml.etree 2.3.2)

Note that this test setup refers to the released version. The current version of *lxml* (3.6.4) works as well.
## Examples
Validate a process XML:
```bash
$ python mdvalid.py --schema schema_file.xsd --process process_file.xml
```
Validate a process XML and a corresponding analysis metadata file:
```bash
$ python mdvalid.py --schema schema_file.xsd --process process_file.xml --analysis analysis_file.amd.tsv
```
Validate many analyses of the same type:
```bash
$ python mdvalid.py --schema schema_file.xsd --process process_file.xml --analysis file1.amd.tsv file2.amd.tsv file3.amd.tsv
```
In case of errors, you can get more information by running the script in debug mode:
```bash
$ python mdvalid.py --debug [Options]
```
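When the list of analysis files is long, the calls shown above can also be assembled from a Python script. The following is a minimal sketch that uses only the options that appear in the examples (`--schema`, `--process`, `--analysis`, `--debug`). `build_mdvalid_command` is a hypothetical helper, not part of `mdvalid.py`, and the sketch assumes that `mdvalid.py` is in the current working directory.

```python
import sys


def build_mdvalid_command(schema, process, analyses=(), debug=False):
    """Assemble the argument list for one mdvalid.py call.

    Hypothetical helper for illustration; not part of mdvalid.py.
    """
    # Use the interpreter that runs this script; mdvalid.py is assumed
    # to be in the current working directory.
    command = [sys.executable, "mdvalid.py"]
    if debug:
        command.append("--debug")
    command += ["--schema", schema, "--process", process]
    if analyses:
        # All analysis files follow a single --analysis option,
        # as in the "many analyses" example.
        command.append("--analysis")
        command.extend(analyses)
    return command


command = build_mdvalid_command(
    "schema_file.xsd",
    "process_file.xml",
    ["file1.amd.tsv", "file2.amd.tsv", "file3.amd.tsv"],
)
print(" ".join(command[1:]))
```

The returned list can be passed to `subprocess.call` to actually run the validation; building it as a list avoids quoting problems with file names that contain spaces.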