Skip to content

Commit

Permalink
Changed feature counts to process files in alphanumerical order and r…
Browse files Browse the repository at this point in the history
…un again.
  • Loading branch information
MPIBR-unzued committed Jan 18, 2017
1 parent a8b2ba6 commit 5ae4cd8
Showing 1 changed file with 8 additions and 6 deletions.
14 changes: 8 additions & 6 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -131,22 +131,24 @@ featureCounts parameters:
-Q 255
-T 6

Command run:
Change first to directory containing the sorted BAMs:
```bash
cd /storage/schu/ProjectGroups/RNA/Data/RNASequencing/Novogene/NHHW162910/alignments/bams/bamsSortedByName
cd /storage/schu/ProjectGroups/RNA/Data/RNASequencing/Novogene/NHHW162910/alignments/bamsSortedByName
```

Run featureCounts on every BAM file. The files will be read in alphanumerical order (1,2,3...10,11,12...):
```bash
featureCounts -Q 255 -T 6 -p -a /storage/scic/Data/External/Bioinformatics/Rat/Genomes/rn5_Mar2012/annotation/annRefSeq_rn5_20Apr2016.gtf -o /storage/schu/ProjectGroups/RNA/Data/RNASequencing/Novogene/NHHW162910/gene_counts/gene_counts_130117.txt $(echo *.bam)
featureCounts -Q 255 -T 6 -p -a /storage/scic/Data/External/Bioinformatics/Rat/Genomes/rn5_Mar2012/annotation/annRefSeq_rn5_20Apr2016.gtf -o /storage/schu/ProjectGroups/RNA/Data/RNASequencing/Novogene/NHHW162910/gene_counts/gene_counts_180117.txt $(ls *.bam | sort -n -k1.2 | paste -s -d ' ' -)
```
#### Extract mapped features only
Once the counting of the features/genes is done, we extract only those which have at least one read mapped to them:

```bash
awk -F"\t" 'OFS="\t"{if(($7+$8+$9+$10+$11+$12+$13+$14+$15+$16+$17+$18+$19+$20+$21+$22)>0){print $1,$7,$8,$9,$10,$11,$12,$13,$14,$15,$16,$17,$18,$19,$20,$21,$22}}' gene_counts_130117.txt > gene_counts_130117_only_mapped.txt
awk -F"\t" 'OFS="\t"{if(($7+$8+$9+$10+$11+$12+$13+$14+$15+$16+$17+$18+$19+$20+$21+$22)>0){print $1,$6,$7,$8,$9,$10,$11,$12,$13,$14,$15,$16,$17,$18,$19,$20,$21,$22}}' gene_counts_180117.txt > gene_counts_180117_only_mapped.txt
```

Then we add a header to that file with the sample names:
Then we add a header to that file with labels and sample names:

```bash
paste -d '\t' <(echo "gene_name") <(head -1 gene_counts_130117.txt.summary | cut -f2-) | cat - gene_counts_130117_only_mapped.txt > gene_counts_130117_only_mapped_with_header.txt
paste -d '\t' <(echo "GeneId") <(echo "Length") <(head -1 gene_counts_180117.txt.summary | cut -f2- | tr -s '\t' '\n' | awk 'OFS="\t"{split($1, a, "_"); print a[1]}' | paste -s -d '\t' -) | cat - gene_counts_180117_only_mapped.txt > gene_counts_180117_only_mapped_with_header.txt
```

0 comments on commit 5ae4cd8

Please sign in to comment.