Supplemental Material for Pflug et al., 2020
Figures S1-S4. Genome size estimates (in Mb) from GenomeScope and CovEST at different values of K for S1. Bembidion sp. nr. transversale, S2. Chlaenius sericeus, S3. Lionepha tuulukwa, and S4. Pterostichus melanarius.
Figure S5. Average per-base read coverage for the first 200 bases in loci of the Regier set for four specimens.
Figure S6. Relative red fluorescence and the number of nuclei counted at each fluorescence level of representative Chlaenius sericeus, Lionepha tuulukwa, and Pterostichus melanarius compared to a Drosophila virilis standard.
Figure S7. Percent of reads inferred by RepeatExplorer to contain repetitive elements, from a sample of 500,000 read pairs for each of eight beetle species, with reads classified to major group of repetitive elements.
Figure S8. Smudgeplot results for Bembidion sp. nr. transversale.
The supplemental tables are:
Table S1. Information on genomic specimens sequenced.
Table S2. Flow cytometry values for specimens examined.
Table S3. Information on transcriptomic specimens sequenced.
Table S4. Additional details for methods used on transcriptomic sequencing specimens.
Table S5. Additional details for methods used on genomic sequencing specimens.
Table S6. Results of BUSCO analysis on eight genome assemblies using the 2442 gene Endopterygota odb9 reference set.
Table S7. Results of BUSCO analysis on six transcriptome assemblies using the 2442 gene Endopterygota odb9 reference set.
Table S8. GenomeScope results.
Table S9. Summary of read mapping genome size estimates for the Regier and OrthoDB gene sets using three different read filtering methods.
Table S10. CovEST genome size (in Mb) and coverage estimates for two models, Basic and Repeat, performed using a k value of 21.
Table S11. Regier read mapping mean coverages before and after removing outliers using the 3*IQR rule.
Table S12. OrthoDB read mapping mean coverages before (“Untrimmed”) and after (“IQR Trim”) removing outliers more than three interquartile ranges from the median.
Table S13. Comparison of total repetitive DNA sequence (in Mb) estimated by RepeatExplorer and GenomeScope.
Table S14. Genome size estimates of model species (Arabidopsis thaliana, Caenorhabditis elegans, and Drosophila melanogaster) using sequence-based methods.
Table S15. Summary statistics of model organism genome size results using sequence-based methods.
Supplemental File 1. Detailed ANOVA results for different k-mer sizes.
Supplemental File 2. Reference sequences for Regier loci in FASTA format.
Supplemental File 3. Reference sequences for OrthoDB loci in FASTA format.
Supplemental File 4. Identification numbers of OrthoDB v9.1 genes selected for the OrthoDB reference set.
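
The outlier trimming named in the captions of Tables S11 and S12 can be illustrated with a minimal sketch. It assumes the wording of Table S12, where a per-locus coverage value is dropped when it lies more than three interquartile ranges from the median; the authors' exact fences may differ (a common alternative measures 3*IQR from the first and third quartiles instead). The function name iqr_trim and the coverage values are hypothetical and are not taken from the supplemental tables.

```python
import statistics

def iqr_trim(coverages, k=3.0):
    """Keep values within k * IQR of the median (assumed reading of the 3*IQR rule)."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _, q3 = statistics.quantiles(coverages, n=4)
    iqr = q3 - q1
    med = statistics.median(coverages)
    return [c for c in coverages if abs(c - med) <= k * iqr]

# Made-up per-locus mean coverages; one locus is inflated, e.g. by a repeat.
coverages = [28.0, 30.0, 31.0, 29.0, 32.0, 30.5, 250.0]

trimmed = iqr_trim(coverages)
untrimmed_mean = statistics.mean(coverages)  # the "Untrimmed" column
trimmed_mean = statistics.mean(trimmed)      # the "IQR Trim" column

print(f"Untrimmed mean coverage: {untrimmed_mean:.2f}")
print(f"IQR-trimmed mean coverage: {trimmed_mean:.2f}")
```

In this toy example the single inflated locus is removed and the mean coverage falls accordingly, which is the kind of before-and-after comparison those two tables report.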