Analysis of SRRM2, SON, PRPF8, SRRM1, RBM25, Pinin and Coilin orthologues
data - general data files
genbank - primary sequence dataset in GenBank format
genbank_final - final sequence dataset in GenBank format
mobiDB - FASTA sequence datasets and MobiDB-Lite results
NCBI_ortholog - vertebrate orthologue lists in .csv from the NCBI ortholog resource
orthofinder - list of accession numbers for additional species obtained through OrthoFinder2 analysis
results - results in tabular format
fig - figures
Notebooks:
- Data_preparation.ipynb: dataset preparation and length comparison. Vertebrate SRRM2, SON, PRPF8, SRRM1, RBM25, Pinin and Coilin orthologous protein datasets were downloaded from NCBI’s ortholog resource and supplemented with orthologues predicted for invertebrate species using OrthoFinder. To visualize the results, the protein lengths in each dataset were plotted as a swarm plot.
- Disorder_prediction.ipynb: disorder prediction using IUPred2A and MobiDB-Lite. The results are visualized as Matplotlib heatmaps in two variants: sorted by phylogeny (TimeTree) and sorted by protein length.
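The length comparison described for Data_preparation.ipynb can be illustrated with a minimal sketch that is not taken from the notebook itself. It assumes the sequences are available as FASTA text; the function names `read_fasta_lengths` and `summarize` and the toy records are hypothetical.

```python
from statistics import median


def read_fasta_lengths(lines):
    """Return {record_id: sequence_length} from an iterable of FASTA lines."""
    lengths = {}
    current = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the ID;
            # fall back to a generated name if the header is empty.
            header = line[1:].strip()
            current = header.split()[0] if header else f"record_{len(lengths) + 1}"
            lengths[current] = 0
        elif current is not None:
            # Sequence lines may be wrapped, so lengths are accumulated.
            lengths[current] += len(line)
    return lengths


def summarize(lengths):
    """Basic length statistics for one protein dataset."""
    values = sorted(lengths.values())
    return {
        "n": len(values),
        "min": values[0],
        "median": median(values),
        "max": values[-1],
    }


# Hypothetical toy records; real input would be one FASTA file per protein.
example = """>seq_a toy record
MSTPRRSRSR
SRSP
>seq_b toy record
MAAGK
""".splitlines()

lengths = read_fasta_lengths(example)
print(lengths)
print(summarize(lengths))
```

Per-protein length lists obtained this way could then be passed to a swarm plot function (for example seaborn's `swarmplot`, which is an assumption here since the plotting library is not named above).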
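The heatmap layout described for Disorder_prediction.ipynb can be sketched as follows. This is an illustration under stated assumptions, not the notebook's code: it assumes one list of per-residue disorder scores between 0 and 1 per species, pads shorter proteins with NaN so all rows have equal width, and draws the matrix with Matplotlib's `imshow`. The names `build_heatmap_rows` and `plot_heatmap`, the species labels and the scores are all hypothetical.

```python
def build_heatmap_rows(scores, order=None):
    """Pad per-residue score lists to equal width for a heatmap.

    scores: {name: [float, ...]} per-residue disorder scores.
    order:  optional list of names giving the row order (for example a
            phylogenetic order). If omitted, rows are sorted by protein
            length, longest first.
    """
    if order is None:
        order = sorted(scores, key=lambda name: len(scores[name]), reverse=True)
    width = max(len(scores[name]) for name in order)
    nan = float("nan")
    matrix = [
        list(scores[name]) + [nan] * (width - len(scores[name]))
        for name in order
    ]
    return order, matrix


def plot_heatmap(order, matrix, path="disorder_heatmap.png"):
    """Draw the padded matrix; imports are local so the helper above
    works without Matplotlib or NumPy installed."""
    import matplotlib
    matplotlib.use("Agg")  # render to file, no display needed
    import matplotlib.pyplot as plt
    import numpy as np

    fig, ax = plt.subplots(figsize=(8, 0.4 * len(order) + 1))
    image = ax.imshow(
        np.array(matrix, dtype=float),
        aspect="auto",
        vmin=0.0,
        vmax=1.0,
        cmap="viridis",
        interpolation="nearest",
    )
    ax.set_yticks(range(len(order)))
    ax.set_yticklabels(order)
    ax.set_xlabel("Residue position")
    fig.colorbar(image, ax=ax, label="Disorder score")
    fig.savefig(path, dpi=150, bbox_inches="tight")
    plt.close(fig)


# Hypothetical toy scores, not real predictions.
scores = {
    "species_a": [0.9, 0.8, 0.2],
    "species_b": [0.7, 0.6, 0.5, 0.4, 0.1],
    "species_c": [0.3, 0.95],
}

order, matrix = build_heatmap_rows(scores)
print(order)
```

Passing an explicit `order` list gives the phylogeny-sorted variant, while omitting it gives the length-sorted variant; `plot_heatmap(order, matrix)` then writes the figure to disk.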