Changed feature counts to process files in alphanumerical order and r…

…un again.
MPIBR-Bioinformatics · Jan 18, 2017 · 5ae4cd8 · 5ae4cd8
1 parent a8b2ba6
commit 5ae4cd8
Showing 1 changed file with 8 additions and 6 deletions.
diff --git a/README.md b/README.md
@@ -131,22 +131,24 @@ featureCounts parameters:
 -Q 255
 -T 6
 
-Command run:
+Change first to directory containing the sorted BAMs:
 ```bash
-cd /storage/schu/ProjectGroups/RNA/Data/RNASequencing/Novogene/NHHW162910/alignments/bams/bamsSortedByName
+cd /storage/schu/ProjectGroups/RNA/Data/RNASequencing/Novogene/NHHW162910/alignments/bamsSortedByName
 ```
+
+Run featureCounts on every BAM file. The files will be read in alphanumerical order (1,2,3...10,11,12...):
 ```bash
-featureCounts -Q 255 -T 6 -p -a /storage/scic/Data/External/Bioinformatics/Rat/Genomes/rn5_Mar2012/annotation/annRefSeq_rn5_20Apr2016.gtf -o /storage/schu/ProjectGroups/RNA/Data/RNASequencing/Novogene/NHHW162910/gene_counts/gene_counts_130117.txt $(echo *.bam)
+featureCounts -Q 255 -T 6 -p -a /storage/scic/Data/External/Bioinformatics/Rat/Genomes/rn5_Mar2012/annotation/annRefSeq_rn5_20Apr2016.gtf -o /storage/schu/ProjectGroups/RNA/Data/RNASequencing/Novogene/NHHW162910/gene_counts/gene_counts_180117.txt $(ls *.bam | sort -n -k1.2 | paste -s -d ' ' -)
 ```
 #### Extract mapped features only
 Once the counting of the features/genes is done, we extract only those which have at least one read mapped to them:
 
 ```bash
-awk -F"\t" 'OFS="\t"{if(($7+$8+$9+$10+$11+$12+$13+$14+$15+$16+$17+$18+$19+$20+$21+$22)>0){print $1,$7,$8,$9,$10,$11,$12,$13,$14,$15,$16,$17,$18,$19,$20,$21,$22}}' gene_counts_130117.txt > gene_counts_130117_only_mapped.txt
+awk -F"\t" 'OFS="\t"{if(($7+$8+$9+$10+$11+$12+$13+$14+$15+$16+$17+$18+$19+$20+$21+$22)>0){print $1,$6,$7,$8,$9,$10,$11,$12,$13,$14,$15,$16,$17,$18,$19,$20,$21,$22}}' gene_counts_180117.txt > gene_counts_180117_only_mapped.txt
 ```
 
-Then we add a header to that file with the sample names:
+Then we add a header to that file with labels and sample names:
 
 ```bash
-paste -d '\t' <(echo "gene_name") <(head -1 gene_counts_130117.txt.summary | cut -f2-) | cat - gene_counts_130117_only_mapped.txt > gene_counts_130117_only_mapped_with_header.txt
+paste -d '\t' <(echo "GeneId") <(echo "Length") <(head -1 gene_counts_180117.txt.summary | cut -f2- | tr -s '\t' '\n' | awk 'OFS="\t"{split($1, a, "_"); print a[1]}' | paste -s -d '\t' -) | cat - gene_counts_180117_only_mapped.txt > gene_counts_180117_only_mapped_with_header.txt
 ```