G3-2021-402773_SuppMaterials.zip (4.63 MB)
Supplemental Materials for Lewald et al., 2021
posted on 2021-09-22, 16:44 authored by Kyle M. Lewald, Antoine Abrieux, Derek A. Wilson, Yoosook Lee, William R. Conner, Felipe Andreazza, Elizabeth H. Beers, Hannah J. Burrack, Kent M. Daane, Lauren Diepenbrock, Francis A. Drummond, Philip D. Fanning, Michael T. Gaffney, Stephen P. Hesler, Claudio Ioriatti, Rufus Isaacs, Brian A. Little, Gregory M. Loeb, Betsey Miller, Dori E. Nava, Dalila Rendon, Ashfaq A. Sial, Cherre S. Bezerra da Silva, Dara G. Stockton, Steven Van Timmeren, Anna Wallingford, Vaughn M. Walton, Xingeng Wang, Bo Zhao, Frank G. Zalom, Joanna C. Chiu
Final Supplemental Materials (8 supplemental figures and 5 supplemental tables) for accepted manuscript G3-2021-402773.
Figure S1: Linkage disequilibrium decay in the largest contigs for all samples. Linkage disequilibrium, measured by r², was calculated pairwise for all SNPs within 100 kb of each other (0-30 kb shown in plot). Shaded areas show 95% confidence intervals based on 100 bootstrap replicates. A 1% subsample of the r² values was used to fit the model of decay. (A-E) display results from contigs 1, 2, 3, 4, and 5-6, respectively, which span chromosomes X, 2, and 3, based on homology with D. melanogaster.
Figure S2: Admixture proportions estimated for each region individually. Samples are labeled by location code, followed by U.S. state abbreviation or country. Brazil not plotted as only one location was sampled. (A) Eastern U.S. samples, 182,786 sites used. (B) Western U.S. samples (including Hawaii), 136,929 sites used. (C) Asian samples, 97,134 sites used. (D) European samples, 77,645 sites used.
Figure S3: PCA calculated for each region individually. The percent variance captured by each principal component is indicated in the axis labels. (A) Eastern U.S. samples, 183,243 sites used. (B) Western U.S. samples (including Hawaii), 139,075 sites used. (C) European samples, 90,624 sites used. (D) Asian samples, 103,312 sites used.
Figure S4: Phylogenetic tree using COX2 gene. Maximum likelihood tree of COX2 rooted on D. melanogaster (Dmel01) using the Tamura 3-parameter model + G, with bootstrap fractions greater than 0.5 from 500 replicate runs displayed next to branch points. Branch lengths measure number of substitutions per site. 70 variable sites were analyzed from a total of 720 positions in the alignment.
Figure S5: Admixture proportions estimated from subsampled clusters. (A-C) Five random individuals were sampled from each population cluster for each analysis. Samples are labeled by name, followed by population cluster. For the fourth subsample, see Figure 2C. 144,661-152,424 sites used.
Figure S6: Admixture proportions estimated from all samples combined. Up to 10 clusters (k) were used. Samples are labeled by state if applicable, followed by population cluster. 206,093 sites used.
Figure S7: Trees inferred by treemix with 0 to 10 migration edges (ordered from A-K). Migration edges have not been plotted for readability; please see Table S5 for migration edge information. Each branch is labeled with its bootstrap support from 100 bootstrapped runs. The x-axis measures genetic drift. The fraction of variance in the data captured by the model is indicated in the bottom right of each plot. Labels “D.sub” and “D.bia” stand for D. subpulchrella and D. biarmipes, respectively.
Figure S8: Plots of the standard error of residuals between populations based on models generated by treemix, from 0 to 10 migration edges. High residuals indicate that the model underestimates the covariance observed in the data, which could be a sign that more migration edges are needed. Low residuals may indicate that the two populations are placed too close together in the graph due to unmodeled migration elsewhere.
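As a point of reference for the r² statistic plotted in Figure S1, the following is a minimal sketch of the textbook definition for a single pair of biallelic SNPs. It assumes phased haplotypes coded as 0/1, which is an illustrative simplification; it is not the pipeline used to produce the figure, and the function name and example data are hypothetical.

```python
def r_squared(haplotypes):
    """Textbook r^2 between two biallelic SNPs.

    `haplotypes` is a list of (allele_at_snp1, allele_at_snp2) pairs,
    each allele coded 0 or 1. Phased input is an assumption of this sketch.
    """
    n = len(haplotypes)
    p_a = sum(h[0] for h in haplotypes) / n           # frequency of allele 1 at SNP 1
    p_b = sum(h[1] for h in haplotypes) / n           # frequency of allele 1 at SNP 2
    p_ab = sum(1 for h in haplotypes if h[0] == 1 and h[1] == 1) / n
    d = p_ab - p_a * p_b                              # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return float("nan")                           # undefined for a monomorphic site
    return d * d / denom


# Hypothetical data: perfectly linked alleles versus independent alleles.
linked = [(0, 0), (0, 0), (1, 1), (1, 1)]
independent = [(0, 0), (0, 1), (1, 0), (1, 1)]
r2_linked = r_squared(linked)
r2_independent = r_squared(independent)
```

A decay curve like the one in Figure S1 is obtained by computing such values for many SNP pairs and relating them to the physical distance between the SNPs in each pair.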
SUPPLEMENTAL TABLE LEGENDS
Table S1: Details for all Drosophila suzukii samples sequenced, indicating collection time and location, as well as sequencing lane number, inbreeding coefficient, and average coverage of sequencing data.
Table S2: Details of sequences used in the COX2 phylogenetic analysis. Haplotypes were named using an abbreviation of the species presumed prior to analysis. Note that Dsuz01, Dsuz03, Dsuz04, and Dsuz05 are likely not D. suzukii.
Table S3: F3 statistics estimated by treemix between all combinations of populations. Significantly negative values for the F3 test (A;B,C) indicate that population A experienced gene flow from populations B and C.
Table S4: F4 statistics estimated by treemix between all combinations of populations. Significantly positive values for the F4 test (A,B;C,D) suggest gene flow between A and C or B and D, while negative values indicate gene flow between A and D or B and C.
Table S5: Migrations inferred by treemix when different numbers of migration edges (“m”) are allowed, from 1 to 10. Migrations are taken from the maximum likelihood admixture graph for each value of “m”.
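To illustrate the sign conventions described for Tables S3 and S4, the sketch below computes naive F3 and F4 statistics from per-site allele frequencies. This is an assumption-laden illustration: it omits the finite-sample bias correction and block-based standard errors that dedicated software applies, the function names are hypothetical, and the allele frequencies are invented for the example rather than taken from the study.

```python
def f3(a, b, c):
    """Naive F3(A; B, C): mean over sites of (a - b) * (a - c)."""
    return sum((x - y) * (x - z) for x, y, z in zip(a, b, c)) / len(a)


def f4(a, b, c, d):
    """Naive F4(A, B; C, D): mean over sites of (a - b) * (c - d)."""
    return sum((w - x) * (y - z) for w, x, y, z in zip(a, b, c, d)) / len(a)


# Hypothetical allele frequencies at four sites.
pop_b = [0.1, 0.8, 0.3, 0.6]
pop_c = [0.9, 0.2, 0.7, 0.4]
# Population A is constructed as an equal mixture of B and C.
pop_a = [(x + y) / 2 for x, y in zip(pop_b, pop_c)]

# A mixture of B and C lies between them at every site, so F3(A; B, C) is negative.
f3_value = f3(pop_a, pop_b, pop_c)

# Hypothetical frequencies where (a - b) and (c - d) vary together across sites.
q_a, q_b = [0.2, 0.6], [0.1, 0.8]
q_c, q_d = [0.3, 0.5], [0.2, 0.7]
f4_value = f4(q_a, q_b, q_c, q_d)          # positive: A pairs with C, B with D
f4_swapped = f4(q_a, q_b, q_d, q_c)        # swapping C and D flips the sign
```

In practice, significance is judged from a Z-score (the statistic divided by its standard error), not from the sign of the raw value alone.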