Benchmarking Universal Single-Copy Orthologs (BUSCO) Analysis

NCBI RefSeq genome annotations produced by the Eukaryotic Genome Annotation Pipeline (EGAP) undergo quality assessment that includes evaluation with BUSCO, which measures annotation completeness by searching for a set of highly conserved single-copy orthologs. NCBI RefSeq runs BUSCO in “protein” mode: the BUSCO models for the most appropriate lineage, chosen according to NCBI taxonomy, are compared against the longest protein sequence of each annotated coding gene. The results are presented in standard BUSCO notation:

  • C: Complete
    • S: Single-copy
    • D: Duplicated
  • F: Fragmented
  • M: Missing
  • n: Number of BUSCO genes (ortholog groups) analyzed
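
As an illustration of how this notation can be read programmatically, the following is a minimal Python sketch. It assumes the classic one-line BUSCO summary form, C:..%[S:..%,D:..%],F:..%,M:..%,n:.., with no extra fields or whitespace. The function name parse_busco_summary and the example values are hypothetical and do not come from any NCBI report.

```python
import re

# Assumed layout of a one-line BUSCO summary, for example:
#   C:98.5%[S:97.0%,D:1.5%],F:0.5%,M:1.0%,n:3640
BUSCO_PATTERN = re.compile(
    r"C:(?P<C>[\d.]+)%\[S:(?P<S>[\d.]+)%,D:(?P<D>[\d.]+)%\],"
    r"F:(?P<F>[\d.]+)%,M:(?P<M>[\d.]+)%,n:(?P<n>\d+)"
)


def parse_busco_summary(summary):
    """Return the percentages (C, S, D, F, M) as floats and n as an int."""
    match = BUSCO_PATTERN.search(summary)
    if match is None:
        raise ValueError("not a recognized BUSCO summary: %r" % summary)
    fields = match.groupdict()
    result = {key: float(fields[key]) for key in "CSDFM"}
    result["n"] = int(fields["n"])
    return result


# Made-up values, used only to show the shape of the parsed result.
example = "C:98.5%[S:97.0%,D:1.5%],F:0.5%,M:1.0%,n:3640"
parsed = parse_busco_summary(example)
print(parsed)
```

In this notation the complete percentage (C) is the sum of the single-copy (S) and duplicated (D) percentages, and C, F, and M together account for all n BUSCO genes, so a parsed summary can be sanity-checked against those two relationships, allowing for rounding.
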
Generated May 30, 2024