Supplemental Material for Dyson et al., 2022
Dataset posted on 2022-10-14, 19:36, authored by Carl J. Dyson, Aaron Pfennig, Daniel Ariano-Sánchez, Joseph Lachance, Joseph R. Mendelson III, Michael A. D. Goodisman
Supporting data for "Genome of the endangered Guatemalan Beaded Lizard, Heloderma charlesbogerti, reveals evolutionary relationships of squamates and declines in effective population sizes"
Carl J. Dyson; Aaron Pfennig; Daniel Ariano-Sánchez; Joseph Lachance; Joseph R. Mendelson III; Michael A. D. Goodisman
- We sequenced the genome of a male Guatemalan Beaded Lizard from Zoo Atlanta using the PacBio Sequel II SMRT Cell sequencing platform. Insert libraries of 15-20 kb and 30+ kb sequences were prepared and underwent on-site quality assessment and cleanup prior to sequencing. We obtained 232 Gbp of raw sequencing data (86x depth). The raw reads were assembled into a draft genome with a total length of 2.31 Gb. The assembly comprised 3,551 contigs, 83% of which were 50 kb or larger. The contig N50 of the draft assembly was 1,358,783 bp. The overall GC content of the genome was estimated to be 45.05%. Approximately 57.54% of the genome consisted of identifiable repetitive DNA. We predicted 31,411 protein-coding genes and 32,205 distinct mRNAs. The draft genome of this critically endangered species can contribute to conservation efforts by providing information on historical changes in population size, and can further our understanding of an understudied taxon within the venomous lizards.
- Heloderma_charlesbogerti_genomic.fasta.gz: Genome sequences of Heloderma charlesbogerti, assembled using Flye.
- Heloderma_charlesbogerti_genomic.gff.gz: Gene features in GFF format. Protein-coding genes were predicted by BRAKER2 and tRNA genes by tRNAscan-SE2.
- Heloderma_charlesbogerti_cds_from_genomic.fna.gz: Nucleotide sequences of predicted protein-coding regions.
- Heloderma_charlesbogerti_protein.faa.gz: Amino acid sequences of predicted protein-coding regions.
- Heloderma_charlesbogerti_protein.interproscan.gff.gz: InterProScan annotations of predicted proteins in GFF format.
- Heloderma_charlesbogerti_protein.interproscan.tsv.gz: InterProScan annotations of predicted proteins in tab-separated format.
- Heloderma_charlesbogerti_RepeatMasker.gff.gz: Repeat annotation in GFF format. Repeat annotation was done de novo using RepeatModeler and RepeatMasker.
- Heloderma_charlesbogerti_RepeatMasker.out.gz: RepeatMasker annotation.
- Heloderma_charlesbogerti_RepeatMasker.tbl.gz: Summary of RepeatMasker annotation.
- Heloderma_charlesbogerti_TRF.gff.gz: Long tandem repeat annotation using TRF.
- Heloderma_charlesbogerti_genomic_softmasked.fasta.gz: Softmasked genome sequences of Heloderma charlesbogerti. The combined annotations from RepeatMasker and TRF were used for softmasking.
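
The summary statistics quoted in the description (contig count, total length, contig N50, GC content) can be recomputed from the FASTA files listed above. The following is a minimal sketch in Python, not the authors' pipeline. The function names (`read_fasta`, `n50`, `assembly_stats`, `stats_from_file`) are hypothetical. It assumes that GC content is measured over unambiguous A/C/G/T bases only and that softmasked bases are the lowercase ones, so results may differ slightly from the reported values if the authors used other conventions.

```python
import gzip
from typing import Dict, Iterable, Iterator, List, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) pairs from FASTA-formatted text lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Keep only the first whitespace-delimited token of the header.
            name, chunks = (line[1:].split() or [""])[0], []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def n50(lengths: List[int]) -> int:
    """Length L such that contigs of length >= L hold at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0


def assembly_stats(records: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Summarise contig count, total length, N50, GC and softmasked fractions."""
    lengths: List[int] = []
    gc = acgt = lower = 0
    for _, seq in records:
        lengths.append(len(seq))
        upper = seq.upper()
        g_c = upper.count("G") + upper.count("C")
        a_t = upper.count("A") + upper.count("T")
        gc += g_c
        acgt += g_c + a_t  # assumption: N and other ambiguity codes are excluded
        # Assumption: softmasked bases are written in lowercase.
        lower += sum(seq.count(ch) for ch in "acgtn")
    total = sum(lengths)
    return {
        "contigs": len(lengths),
        "total_length": total,
        "n50": n50(lengths),
        "gc_fraction": gc / acgt if acgt else 0.0,
        "softmasked_fraction": lower / total if total else 0.0,
    }


def stats_from_file(path: str) -> Dict[str, float]:
    """Compute the statistics for a gzip-compressed FASTA file."""
    with gzip.open(path, "rt") as handle:
        return assembly_stats(read_fasta(handle))
```

For example, `stats_from_file("Heloderma_charlesbogerti_genomic_softmasked.fasta.gz")` would process the softmasked assembly once it has been downloaded to the working directory. The softmasked fraction need not match the reported repeat content exactly, because the softmasking combines RepeatMasker and TRF annotations.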