Supplementary Material for Lantican et al., 2019

Figure S1 shows the basic statistics and per-base sequence analysis results of the pre-processed Illumina MiSeq short-read sequences. Figure S2 depicts the genome size estimation of the CATD coconut variety based on the generated homozygous k-mer peak. Figure S3 shows the BUSCO analysis of the constructed genome assembly based on 1440 plant-specific genes in the OrthoDB database. Figure S4 shows the distribution and characterization of the repeat elements identified in the coconut ‘CATD’ draft genome assembly. Figure S5 presents the molecular phylogenetic relationships of the core LTR-RT in the coconut ‘Catigan Green Dwarf’ (CATD). Figure S6 shows the predicted gene ontology (GO) distribution of the genes in the coconut genome. Figure S7 presents the genome-wide identification and characterization of resistance gene analogs (RGA) in the coconut genome. Figure S8 shows the distribution of the drought-response gene homologs in coconut, classified by characterized biological function. Figure S9 presents the proportions of SSR motifs found in the current assembly of the ‘CATD’ genome. Figure S10 depicts the genome-wide occurrence of the top paired motifs in coconut (‘CATD’) SSRs.

Table S1 compares the assembly and quality statistics of the ‘CATD’ coconut genome with those of the HAT coconut genome and other closely related sequenced genomes. Table S2 compares the quality of the genome annotation of the assembled ‘CATD’ coconut genome with that of the annotated HAT genome and other closely related sequenced genomes. Table S3 lists the genome-wide transcription factors and other transcriptional regulators identified in the predicted gene models of the ‘CATD’ genome. Table S4 presents the BLASTn output of the alignment of oil biosynthesis cDNA sequences to the coconut ‘CATD’ genome. Table S5 lists the developed SSR markers physically linked to economically important traits in coconut.

Data S1 lists the LTR-RT found in regions of the dwarf coconut genome, with estimated insertion dates. Data S2 contains the DAGchainer output file of the ‘CATD’-date palm alignment. Data S3 contains the FASTA sequences of the core LTR-RT in the coconut genome. Data S4 contains the BLASTp alignment results of the coconut predicted gene models against DroughtDB proteins. Data S5 lists the genome-wide SSR markers designed in coconut.