Supplemental Material for Kono et al., 2018
Dataset posted on 20.07.2018 by Thomas Kono, Alex B. Brohammer, Suzanne E. McGaugh, Candice N. Hirsch
Figure S1. Weighted pairwise similarity distribution for adjacent genes in B73 and PH207. Solid lines show all pairs of adjacent genes in the genome; dashed lines show pairs of adjacent genes defined as tandem duplicates from raw CoGe output. The green line at 0.3 marks the threshold used to define tandem duplicate genes for downstream analysis.
Figure S2. Tandem duplicate gene cassette identification. The similarity heatmap on the right shows an example tandem duplicate gene cassette in which the off-diagonal heat (yellow) indicates high similarity among genes within the cassette.
Figure S3. Distribution of orthologous group sizes as defined by OrthoFinder. Grey box (size 10 to 75) indicates orthogroups that were used in downstream PAML analysis.
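The size filter described for Figure S3 can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `filter_orthogroups`, the dictionary input format, and the assumption that both bounds (10 and 75) are inclusive are hypothetical and not stated in the caption.

```python
def filter_orthogroups(orthogroups, min_size=10, max_size=75):
    """Keep orthogroups whose gene count lies within [min_size, max_size].

    orthogroups: dict mapping an orthogroup ID to a list of gene IDs
    (hypothetical input format). Inclusive bounds are an assumption.
    """
    return {
        group_id: genes
        for group_id, genes in orthogroups.items()
        if min_size <= len(genes) <= max_size
    }


# Toy data with made-up orthogroup and gene IDs
example = {
    "OG_small": [f"gene{i}" for i in range(3)],
    "OG_kept": [f"gene{i}" for i in range(12)],
    "OG_large": [f"gene{i}" for i in range(80)],
}
kept = filter_orthogroups(example)
print(sorted(kept))
```

With the toy data above, only the 12-member group falls inside the assumed size window.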
Figure S4. Example marked input tree for PAML relative rates analysis. Orthologous groups were defined using OrthoFinder.
Figure S5. Number of intervening genes in tandem duplicate gene clusters. Genes that are directly adjacent have an intervening gene number of zero.
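The intervening gene count in Figure S5 can be illustrated with a short sketch. It assumes genes are given as an ordered list along one chromosome; the function name `intervening_genes` and the gene IDs are hypothetical, and the authors' actual computation may differ.

```python
def intervening_genes(gene_order, gene_a, gene_b):
    """Count the genes located between gene_a and gene_b.

    gene_order: list of gene IDs sorted by position on one chromosome
    (hypothetical input format). Directly adjacent genes return 0.
    """
    if gene_a == gene_b:
        raise ValueError("gene_a and gene_b must be different genes")
    position = {gene: index for index, gene in enumerate(gene_order)}
    return abs(position[gene_a] - position[gene_b]) - 1


# Toy chromosome with made-up gene IDs
chromosome = ["g1", "g2", "g3", "g4", "g5"]
print(intervening_genes(chromosome, "g1", "g2"))
print(intervening_genes(chromosome, "g5", "g2"))
```

The first call compares directly adjacent genes; the second compares genes separated by `g3` and `g4`.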
Figure S6. Genomic locations of maize tandem duplicates for all chromosomes in B73. Purple ticks show tandem duplications. The black line shows gene density, the dark grey line shows RNA transposable element density, and the light grey line shows DNA transposable elements per Mb. Subgenome 1 is shown in green shading and subgenome 2 is shown in blue shading.
Figure S7. Distance to nearest transposable element (upstream or downstream) for B73 tandem duplicate genes.
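One way to compute a distance like the one in Figure S7 is sketched below. The caption does not say how distance was measured, so this sketch assumes edge-to-edge distance in base pairs on a single chromosome, with overlapping features counted as distance 0. The function name `nearest_te_distance` and the coordinates are hypothetical.

```python
def nearest_te_distance(gene, transposable_elements):
    """Return the distance from a gene to the closest TE on either side.

    gene: (start, end) tuple; transposable_elements: list of (start, end)
    tuples on the same chromosome (hypothetical input format).
    Overlap counts as 0 (an assumption). Returns None if no TEs are given.
    """
    gene_start, gene_end = gene
    best = None
    for te_start, te_end in transposable_elements:
        if te_end < gene_start:
            distance = gene_start - te_end   # TE lies upstream of the gene
        elif te_start > gene_end:
            distance = te_start - gene_end   # TE lies downstream of the gene
        else:
            distance = 0                     # TE overlaps the gene
        if best is None or distance < best:
            best = distance
    return best


# Toy coordinates, not taken from the dataset
print(nearest_te_distance((100, 200), [(10, 50), (260, 300)]))
```

The linear scan keeps the example short; for genome-scale data, sorted intervals or an interval-aware tool would be a more practical choice.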
Table S1. Species, assembly versions, annotation versions, and data sources for the grass species used for orthologue identification.
Table S2. Summary of maize tandem duplicates in B73 and PH207. Cluster number is a generic number given to each tandem duplicate cluster. Duplicates in B73 are from the version 4 assembly and duplicates in PH207 are from the version 1 assembly. Shared duplicates are contained in the syntenic portion of both B73 and PH207, private duplicates are in the syntenic portion of only one genome, and non-syntenic duplicates are in non co-orthologous blocks relative to rice and/or sorghum. The estimated date of each tandem duplicate was determined based on substitution rates relative to sorghum.