Run the ea-utils program fastq-join to join overlapping mate pairs.
This App joins overlapping mate pairs into single reads using the fastq-join program from the ea-utils package. Reads must be provided in interleaved format, where each reverse read immediately follows its corresponding forward read. fastq-join decides whether to join a pair based on the percent similarity and the length of the overlap between the two reads. Ideally, the overlap should be at least ten base pairs; with less, there is insufficient information to join the pair reliably. Note: depending on how the libraries were prepared, there may be little to no expected overlap. In rare instances, the reverse reads may not be available, in which case joining serves no purpose.
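To make the overlap criteria above concrete, here is a minimal Python sketch of overlap-based joining. It is a simplified illustration, not the fastq-join implementation: the function names, the `min_overlap` and `max_mismatch_pct` parameters, their default values, and the longest-overlap-first search are all assumptions made for this example, and base qualities are ignored.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq))


def join_pair(forward, reverse, min_overlap=10, max_mismatch_pct=8.0):
    """Join one read pair if the 3' end of the forward read overlaps the
    reverse-complemented reverse read.

    Hypothetical parameters (not taken from fastq-join):
      min_overlap       smallest overlap length, in bases, that is accepted
      max_mismatch_pct  largest percent of mismatching bases allowed

    Returns the joined sequence, or None if no acceptable overlap exists.
    """
    rc = reverse_complement(reverse)
    longest = min(len(forward), len(rc))
    # Try the longest candidate overlap first and shrink toward min_overlap.
    for overlap in range(longest, min_overlap - 1, -1):
        tail = forward[-overlap:]   # end of the forward read
        head = rc[:overlap]         # start of the reverse-complemented mate
        mismatches = sum(1 for a, b in zip(tail, head) if a != b)
        if 100.0 * mismatches / overlap <= max_mismatch_pct:
            # Keep the forward read and append the non-overlapping remainder.
            return forward + rc[overlap:]
    return None


# Example: a 32-base fragment sequenced from both ends with 22-base reads,
# so the two reads overlap by 12 bases.
fragment = "ACGTTGCAAGGCTTACCGATGGATCCATTGCA"
forward_read = fragment[:22]
reverse_read = reverse_complement(fragment[10:])
joined = join_pair(forward_read, reverse_read)
```

In this sketch a joined pair would go to the joined-reads output, while a pair for which `join_pair` returns `None` would stay in the unjoined paired-end output.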
If successful, this App will produce a SingleEndLibrary object (joined reads) useful for downstream applications, such as assembly. In addition, this App will produce a PairedEndLibrary object (unjoined reads). Summary statistics are also provided, and include the total number of input reads, total number of joined reads, average length of the joined region, standard deviation of the length of the joined region, and the ea-utils version used.
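The summary statistics listed above could be computed along these lines. This is a hedged sketch only: the function name, the dictionary keys, and the choice of the sample standard deviation are assumptions for illustration, and the App's report may compute these values differently.

```python
import statistics


def summarize_join(total_reads, joined_lengths):
    """Summarize a joining run.

    total_reads     number of input reads (hypothetical argument name)
    joined_lengths  list with one length per joined read
    """
    count = len(joined_lengths)
    return {
        "total_reads": total_reads,
        "joined_reads": count,
        # Mean is undefined for an empty list, so report 0.0 in that case.
        "mean_length": statistics.mean(joined_lengths) if count else 0.0,
        # Sample standard deviation needs at least two values.
        "stdev_length": statistics.stdev(joined_lengths) if count > 1 else 0.0,
    }


summary = summarize_join(total_reads=8, joined_lengths=[10, 12, 14])
```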
Team members who developed and deployed the algorithm in KBase: Dylan Chivian. For questions, please contact us.
- Arkin AP, Cottingham RW, Henry CS, Harris NL, Stevens RL, Maslov S, et al. KBase: The United States Department of Energy Systems Biology Knowledgebase. Nature Biotechnology. 2018;36: 566. doi: 10.1038/nbt.4163, https://www.nature.com/articles/nbt.4163
- EA Utils package source: https://expressionanalysis.github.io/ea-utils/
Module Commit: c486d497f6f1faf34645f4de0fc001060088f142