Skip to content
Navigation Menu
Toggle navigation
Sign in
In this repository
All GitHub Enterprise
↵
Jump to
↵
No suggested jump to results
In this repository
All GitHub Enterprise
↵
Jump to
↵
In this user
All GitHub Enterprise
↵
Jump to
↵
In this repository
All GitHub Enterprise
↵
Jump to
↵
Sign in
Reseting focus
You signed in with another tab or window.
Reload
to refresh your session.
You signed out in another tab or window.
Reload
to refresh your session.
You switched accounts on another tab or window.
Reload
to refresh your session.
Dismiss alert
{{ message }}
proost
/
LSTrAP
Public
Notifications
You must be signed in to change notification settings
Fork
0
Star
0
Code
Issues
4
Pull requests
0
Actions
Projects
0
Security
Insights
Additional navigation options
Code
Issues
Pull requests
Actions
Projects
Security
Insights
Files
4bf7c89
cluster
docs
helper
data
parsers
fasta_to_gtf.py
get_sra_ip.py
htseq_count_stats.py
pca_powerlaw.py
sra_to_fastq.py
tophat_stats.py
pipeline
scripts
utils
.gitignore
LICENSE.md
README.md
config.template.ini
data.template.ini
run.py
Breadcrumbs
LSTrAP
/
helper
/
htseq_count_stats.py
Copy path
Blame
Blame
Latest commit
Sebastian Proost
avoid division by zero
Aug 29, 2016
4bf7c89
·
Aug 29, 2016
History
History
40 lines (30 loc) · 1.29 KB
Breadcrumbs
LSTrAP
/
helper
/
htseq_count_stats.py
Top
File metadata and controls
Code
Blame
40 lines (30 loc) · 1.29 KB
Raw
#!/usr/bin/env python3 """ Script to iterate over HTSEQ_count output and grab quality statistics """ import os import re from sys import argv from collections import defaultdict htseq_path = argv[1] values = defaultdict(list) quality_fields = ['__no_feature', '__ambiguous', '__too_low_aQual', '__not_aligned', '__alignment_not_unique' ] # Get all directories in this path for f in os.listdir(path=htseq_path): if os.path.isfile(os.path.join(htseq_path, f)) and f.endswith('.htseq'): sample = f.replace('.htseq', '') with open(os.path.join(htseq_path, f)) as fin: mapped_reads = 0 for line in fin: gene, value = line.strip().split() if gene not in quality_fields: mapped_reads += int(value) else: values[gene].append(int(value)) values['samples'].append(sample) values['mapped_reads'].append(mapped_reads) print('sample', 'mapped_reads', 'no_feature', 'ambiguous', '% mapped', '% no_feature', '% ambiguous', sep='\t') for s, m, n, a in zip(values['samples'], values['mapped_reads'], values['__no_feature'], values['__ambiguous']): total = sum([m, n, a]) if total > 0: print(s, m, n, a, m*100/total, n*100/total, a*100/total, sep='\t')
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
You can’t perform that action at this time.