What it does
KmerFreq is one of two programs used to correct sequencing errors based on the kmer frequency spectrum (KFS). It assumes that most low-frequency kmers are generated by sequencing errors, so the key to its error correction is separating low-frequency kmers from high-frequency ones. Larger kmer sizes give better results but require more computing resources. To produce a more accurate result, the trimmed length and deletion ratio are balanced against the accuracy level. A practical kmer size should be chosen based on the characteristics of the genome.
Note that 30X data is preferred for calculating the kmer frequency spectrum.
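To make the idea of a kmer frequency spectrum concrete, here is a minimal Python sketch. It is not KmerFreq's implementation: the function names, the toy reads, and the cutoff of 2 are all hypothetical, and it counts kmers on the forward strand only. It tallies how often each kmer occurs, then tallies how many distinct kmers share each frequency.

```python
from collections import Counter

def kmer_counts(reads, k):
    """Count every kmer of length k in the reads (forward strand only)."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts

def frequency_spectrum(counts):
    """Map each frequency to the number of distinct kmers with that frequency."""
    return Counter(counts.values())

# Toy reads (hypothetical): the second read differs in its last base,
# standing in for a sequencing error.
reads = ["ACGTACGT", "ACGTACGA", "ACGTACGT"]
counts = kmer_counts(reads, 4)
spectrum = frequency_spectrum(counts)

# Illustrative cutoff, not a KmerFreq default: kmers seen fewer than
# 2 times are treated as likely errors.
low_frequency = sorted(kmer for kmer, n in counts.items() if n < 2)

for freq in sorted(spectrum):
    print(freq, spectrum[freq])
print("low-frequency kmers:", low_frequency)
```

In this toy input the only kmer below the cutoff is the one created by the altered base, which is the behaviour the low-frequency assumption relies on. With real data the cutoff would be chosen from the shape of the spectrum rather than fixed in advance.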
When the kmer size is 17 bp or less, KmerFreq_AR and Corrector_AR should be used because they are faster than this HA version. Memory usage for KFS construction is also at most 16GB (15mer, 1G; 16mer, 4G; 17mer, 16G). In addition, KmerFreq_AR supports space-kmer in KFS construction, and Corrector_AR supports Duo-kmer (consecutive and space kmer) in the correction process.
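The memory figures quoted for the AR version grow by a factor of four per extra base. That matches a table with one entry for each of the 4^k possible kmers at one byte per entry. This is an inference from the numbers, not a documented description of KmerFreq_AR's data structure, and the function name and `bytes_per_kmer` parameter below are hypothetical. The sketch reproduces the arithmetic under that assumption, reading "G" as 2^30 bytes.

```python
def direct_table_bytes(k, bytes_per_kmer=1):
    """Bytes needed for a table with one slot per possible kmer of length k.

    Assumes a 4-letter alphabet and a fixed-size counter per kmer;
    both are assumptions made for illustration.
    """
    return (4 ** k) * bytes_per_kmer

GIB = 2 ** 30

for k in (15, 16, 17):
    print(f"{k}mer: {direct_table_bytes(k) // GIB}G")
```

Under the same assumption an 18mer table would need 64G, which is consistent with the advice to switch to the HA version for larger kmer sizes.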
For kmer sizes larger than 17 bp, the HA versions of KmerFreq and Corrector should be used, since they require less memory for KFS construction.
Outputs
Two output files are generated by KmerFreq: