Basic Statistics
Summary
The Basic Statistics module generates some simple composition statistics for the file analysed.
- File Name: The original filename of the file which was analysed.
- File type: Says whether the file appeared to contain actual base calls or colorspace data which had to be converted to base calls.
- Phred Encoding: Says which ASCII encoding of quality values was found in this file.
- Total Reads: A count of the total number of sequences processed.
- Total Bases: A count of the total number of bases in all sequences processed.
- Total T Bases: A count of the total number of base 'T' in all sequences processed.
- Total C Bases: A count of the total number of base 'C' in all sequences processed.
- Total G Bases: A count of the total number of base 'G' in all sequences processed.
- Total A Bases: A count of the total number of base 'A' in all sequences processed.
- Total N Bases: A count of the total number of base 'N' in all sequences processed, where 'N' marks a position whose base could not be determined by the sequencer.
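- %GC: The overall GC content of all bases in all sequences. In the example below it is reported as a fraction between 0 and 1 rather than as a percentage.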
- Min Length: The length of the shortest sequence in the set.
- Max Length: The length of the longest sequence in the set.
- Lowest Char: The lowest quality character found in all sequences processed, reported as its ASCII code.
- Highest Char: The highest quality character found in all sequences processed, reported as its ASCII code.
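As a rough illustration of how the fields above relate to the raw reads, the following Python sketch computes them from a list of (sequence, quality string) pairs. It is not the module's actual implementation: the function name `basic_stats` and the input shape are assumptions made for this example, and the key names are borrowed from the example output. The GC fraction is taken over all bases, including 'N', which is consistent with the figures in the example.

```python
from collections import Counter


def basic_stats(records):
    """Hypothetical helper: summarise (sequence, quality) string pairs."""
    base_counts = Counter()
    lengths = []
    quality_codes = set()

    for sequence, quality in records:
        base_counts.update(sequence.upper())           # per-base tallies
        lengths.append(len(sequence))                  # one entry per read
        quality_codes.update(ord(ch) for ch in quality)  # ASCII codes seen

    total_bases = sum(lengths)
    gc_bases = base_counts["G"] + base_counts["C"]

    return {
        "total_reads": len(lengths),
        "total_bases": total_bases,
        "t_count": base_counts["T"],
        "c_count": base_counts["C"],
        "g_count": base_counts["G"],
        "a_count": base_counts["A"],
        "n_count": base_counts["N"],
        # Assumption: fraction of all bases (N included), not a 0-100 value.
        "gc_percentage": gc_bases / total_bases if total_bases else 0.0,
        "min_length": min(lengths, default=0),
        "max_length": max(lengths, default=0),
        "lowest_char": min(quality_codes, default=0),
        "highest_char": max(quality_codes, default=0),
    }


# Two toy reads: '#' is ASCII 35 and 'F' is ASCII 70.
example = basic_stats([("ACGTN", "#FFFF"), ("GGCC", "FF#F")])
print(example)
```

Parsing the FASTQ file itself is left out here; any reader that yields the sequence and quality lines of each record would fit.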
Example
"basic_stats":{
"file_name":"test.fastq.gz",
"file_type":"",
"phred":{
"name":"Sanger / Illumina 1.9",
"offset":33
},
"total_reads":250000,
"total_bases":37500000,
"t_count":8473918,
"c_count":10050719,
"g_count":9888266,
"a_count":9086549,
"n_count":548,
"gc_percentage":0.5317062666666666,
"min_length":150,
"max_length":150,
"lowest_char":35,
"highest_char":70
},
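The values in this example can be cross-checked against each other. The Python sketch below wraps the fragment in braces so that it parses as JSON, then derives the Phred score range and re-computes the GC fraction. It assumes that `lowest_char` and `highest_char` are ASCII codes and that subtracting `offset` gives the Phred score; the variable names are illustrative only.

```python
import json

# The fragment above, wrapped in braces so it is a complete JSON document.
report = json.loads("""
{
  "basic_stats": {
    "file_name": "test.fastq.gz",
    "file_type": "",
    "phred": {"name": "Sanger / Illumina 1.9", "offset": 33},
    "total_reads": 250000,
    "total_bases": 37500000,
    "t_count": 8473918,
    "c_count": 10050719,
    "g_count": 9888266,
    "a_count": 9086549,
    "n_count": 548,
    "gc_percentage": 0.5317062666666666,
    "min_length": 150,
    "max_length": 150,
    "lowest_char": 35,
    "highest_char": 70
  }
}
""")
stats = report["basic_stats"]

# Assumption: quality characters are ASCII codes; score = code - offset.
offset = stats["phred"]["offset"]
min_quality = stats["lowest_char"] - offset
max_quality = stats["highest_char"] - offset

# The five per-base counts should add up to the total number of bases.
counted_bases = sum(
    stats[key] for key in ("t_count", "c_count", "g_count", "a_count", "n_count")
)

# GC fraction over all bases, N included.
gc_fraction = (stats["g_count"] + stats["c_count"]) / stats["total_bases"]

print(min_quality, max_quality)               # Phred score range
print(counted_bases == stats["total_bases"])  # counts are consistent
print(gc_fraction)
```

Under these assumptions the quality characters '#' (35) and 'F' (70) correspond to Phred scores 2 and 37, and every read is 150 bases long, since 250000 × 150 = 37500000.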