Supplemental Material for Singleton and Eisen, 2023

Posted on 2023-09-21, 14:02. Authored by Marc Singleton and Michael Eisen.

S1: Orthologous groups of proteins identified in 33 annotated Drosophila genomes using 4-clique percolation of the hit network derived from reciprocal pairwise BLAST searches.

S2: The initial alignments of all sequences in each orthologous group.

S3: The refined alignments of the representative sequences in each single-copy orthologous group.

S4: The insertion phylo-HMM model parameters and the alignments of the single-copy orthologous groups after trimming.

S5: The missing data phylo-HMM model parameters and the missing segments identified for each sequence in each single-copy orthologous group. Missing segments are given as Python slices in the following form: start0-stop0,start1-stop1,...,startn-stopn. If the slices field is empty, then no missing segments were identified in that sequence.

S6: The consensus phylogenetic trees derived from meta-alignments and fit using the LG, GTR, and GTR2 substitution models.
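
The slices field described for S5 can be read with a few lines of Python. The following is a minimal sketch, not code from the authors: the function name parse_slices is hypothetical, and it assumes each start and stop is a non-negative integer that follows the usual Python slice convention (0-based, stop exclusive). How the field is laid out within each S5 file is not specified here, so the sketch takes the field as a plain string.

```python
def parse_slices(field):
    """Parse a slices field such as '3-10,25-40' into a list of slice objects.

    Hypothetical helper: assumes non-negative integer bounds and
    Python slice semantics (0-based start, exclusive stop).
    """
    field = field.strip()
    if not field:
        # An empty field means no missing segments were identified.
        return []
    slices = []
    for part in field.split(','):
        start, stop = part.split('-')
        slices.append(slice(int(start), int(stop)))
    return slices


# Example: extract the missing segments from a toy sequence.
sequence = 'ABCDEFGHIJKLMNOP'
segments = [sequence[s] for s in parse_slices('0-3,10-12')]
```

Because the result is a list of ordinary slice objects, each one can be applied directly to a sequence string or to a row of an alignment array.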


Article title

Leveraging genomic redundancy to improve inference and alignment of orthologous proteins

    G3: Genes|Genomes|Genetics