# Validation of Metadata Documents
## Purpose
After you have generated a new *Process* file, it is advisable to check that the document
adheres to the DEEP *Process* specification defined in the XML schema file `template/deep_process_schema.xsd`.
To do that with little to no extra effort, you can run the small Python validation script `mdvalid.py`.
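
For orientation, checking an XML document against an XSD with the third-party *lxml* package (see Installation) takes only a few calls. The following is a minimal sketch and not the code of `mdvalid.py`: the function name `validate_xml` and the toy schema `TOY_XSD` are made up for illustration and have nothing to do with the DEEP schema.

```python
import io

from lxml import etree  # third-party package, not part of the standard library

# Toy schema for illustration only; this is NOT the DEEP process schema.
TOY_XSD = b"""<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="sample">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="name" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""


def validate_xml(xml_bytes, xsd_bytes):
    """Check an XML document against an XSD; return (is_valid, messages)."""
    schema = etree.XMLSchema(etree.parse(io.BytesIO(xsd_bytes)))
    document = etree.parse(io.BytesIO(xml_bytes))
    is_valid = schema.validate(document)
    # The error log holds one entry per violation found during validation.
    messages = [entry.message for entry in schema.error_log]
    return is_valid, messages
```

Calling `validate_xml(b"<sample/>", TOY_XSD)` reports the document as invalid, because the required `name` child is missing; the returned messages describe the violation.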
## Installation
The script itself requires no installation or configuration and works with Python 2.7+ or Python 3.2+.
It uses the *lxml* package, which is not part of the Python standard library.
Please install [lxml](http://lxml.de/installation.html) according to your software environment.
For your convenience, we provide two YAML files to install a complete
environment for Python 2 or Python 3 (whichever you prefer)
using the [Conda](http://conda.pydata.org/miniconda.html) package manager.

---
#### Attention
Note from: 2016-12-29

Due to some dependency issues concerning the *lxml* package and Conda, *lxml* currently has to be installed
with `pip` from within the activated Conda environment. That is why, in the Conda YAML files,
the line `- lxml=3.*` is commented out; see also [GitHub issue 4093](https://github.com/conda/conda/issues/4093).

---
In case you want to use the Conda package manager, you can create the environment as follows:
```bash
$ conda env create -f conda_py3_mdvalid.yml
## wait for setup to complete
$ source activate py3valid
## next line: workaround solution - 2016-12-29
$ pip install lxml
## wait for installation to complete
## run the script; see below
$ source deactivate
```
## Execution
The script can be executed on a command line as follows:
```bash
$ python mdvalid.py [Options]
```
Please read the help for details:
```bash
$ python mdvalid.py --help
```
The Python code has been developed on a Debian Linux system:

* x86_64 Linux (Debian 7.5 "Wheezy")

with Python versions

* Python 2.7.3 (lxml.etree 2.3.2)
* Python 3.2.3 (lxml.etree 2.3.2)

Note that this testing setup refers to the released version. The current version of *lxml* (3.6.4) works as well.
## Examples:
Validate a process XML:
```bash
$ python mdvalid.py --schema schema_file.xsd --process process_file.xml
```
Validate a process XML and a corresponding analysis metadata file:
```bash
$ python mdvalid.py --schema schema_file.xsd --process process_file.xml --analysis analysis_file.amd.tsv
```
Validate many analyses of the same type:
```bash
$ python mdvalid.py --schema schema_file.xsd --process process_file.xml --analysis file1.amd.tsv file2.amd.tsv file3.amd.tsv
```
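
If many analysis files sit in one directory, the argument list for the call above can be assembled programmatically. This is a minimal sketch, assuming the files use the `.amd.tsv` suffix shown in the examples; the helper names `find_analysis_files` and `build_mdvalid_command` are hypothetical, and only the options shown in this README are used.

```python
import glob
import os
import sys


def find_analysis_files(directory):
    """Return all *.amd.tsv files in a directory, sorted by name."""
    return sorted(glob.glob(os.path.join(directory, "*.amd.tsv")))


def build_mdvalid_command(schema, process, analyses=(), script="mdvalid.py"):
    """Assemble the argument list for one mdvalid.py call."""
    command = [sys.executable, script, "--schema", schema, "--process", process]
    if analyses:
        # --analysis accepts several files, as in the example above
        command += ["--analysis"] + list(analyses)
    return command
```

The returned list can be passed to `subprocess.call`, for example `subprocess.call(build_mdvalid_command("schema_file.xsd", "process_file.xml", find_analysis_files(".")))`.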
In case of errors, you can get more information by running the script in debug mode:
```bash
$ python mdvalid.py --debug [Options]
```