Supplemental Material for Lin, Lazarus, and Rhee, 2020

2020-05-18T16:50:07Z (GMT) by Fan Lin, Elena Z. Lazarus, Seung Rhee

Table S1. Known causal genes and their orthologs in major model plants and crop species.

Table S2. The average value of features used for the Arabidopsis model.

Table S3. The average value of features used for the rice model.

Table S4. Function annotation, gene expression data, and protein sequence difference of thirteen candidates prioritized by QTG-Finder2 and SD1.

Table S5. Putative transcription factor binding sites identified in the promoter of Sevir.5G394900.


Figure S1 Whole-genome synteny map between Setaria viridis and Setaria italica by SynMap

Figure S2 Parameter tuning for the Setaria viridis model based on cross-validation AUC-ROC. Error bars represent standard deviation, N=3.

Figure S3 Parameter tuning for the Sorghum bicolor model based on cross-validation AUC-ROC

Figure S4 Causal-gene orthologs in 12 major crops and model species.

Figure S5 Models trained with different groups of orthologs performed similarly according to external validation.

Figure S6 Feature importance of the newly added features to the Arabidopsis and rice models.

Figure S7 Multiple sequence alignment for a RIO2 protein across grass species, yeast and human using Clustal Omega

Figure S8 Multiple sequence alignment for a RIO2 protein across grass species using Clustal Omega

Figure S9 Multiple sequence alignment of a candidate gene SD1 across grass species using Clustal Omega

Figure S10 A candidate gene encoding a ribosomal protein in the L1P family has higher expression in Setaria italica (Seita.5G389700) than in Setaria viridis (Sevir.5G394900).

Figure S11 Pairwise sequence alignment shows polymorphisms in the putative promoters of an ortholog pair of genes encoding L1P ribosomal proteins.