Supplemental Material for Sless, Searle, and Danforth, 2022
Dataset posted on 2022-06-22, 13:37, authored by Trevor J. L. Sless, Jeremy B. Searle, Bryan N. Danforth
"1 Assembly.zip" contains the contig-level genome assembly of the brood parasitic bee Holcopasites calliopsidis. Also included are computed statistics from QUAST and the results of a BUSCO analysis to assess the completeness of the assembly.
"2 Repetitive Elements.zip" contains results files from RepeatMasker and RepeatModeler runs used to identify repetitive DNA sequences within the genome. The first of the two RepeatMasker runs used a public library of canonical repetitive sequences (Dfam). The output of this run was then fed into the second run using a custom species-specific repeat library generated by RepeatModeler.
"3 Annotation.zip" contains data from annotation pipelines run on the genome assembly. BRAKER2 was first used to identify putative genes using both Augustus and GeneMark. Proteins and transcripts of annotated genes were extracted using EvidenceModeler and gffread. These loci were then functionally annotated with parallel runs of two software pipelines InterProScan and Blast2GO.
"4 Orthology.zip" contains the results from orthology detection analyses of the putative coding regions from the H. calliopsidis genome. OrthoFinder was used to identify orthogroups based on this species and thirteen additional hymenopteran taxa, with Drosophila melanogaster as an outgroup. The program CAFE was then used to identify rapidly evolving gene families among these orthogroups. Since the species tree produced by OrthoFinder was not in agreement with taxonomic consensus, a custom tree was created as input for CAFE using dates from Peters et al. 2017.