Posted on 2024-11-14, 17:07, authored by Dan Vitale, Mathew J. Koretsky, Nicole Kuznetsov, Samantha Hong, Jessica Martin, Mikayla James, Mary B. Makarious, Hampton Leonard, Hirotaka Iwaki, Faraz Faghri, Cornelis Blauwendraat, Andrew B. Singleton, Yeajin Song, Kristin Levine, Ashwin Ashok Kumar Sreelatha, Zih-Hua Fang, Mike Nalls
nba_v1.zip, neurochip_v1.zip, and wgs_v1.zip are pre-trained models specific to the NeuroBooster Array, NeuroChip Array, and 1000 Genomes WGS, respectively. The models are serialized in Pickle format, and each is accompanied by a file listing the SNPs that overlap between its sequencing/genotyping platform and the reference panel.
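Since the models are distributed as zipped Pickle files, a minimal sketch of loading them with the Python standard library might look like the following. The helper name `load_pickled_members` and the `.pkl`/`.pickle` suffixes are assumptions for illustration; the member names inside each archive are not stated here, so the function discovers them by extension rather than hard-coding a path.

```python
import pickle
import zipfile


def load_pickled_members(zip_path, suffixes=(".pkl", ".pickle")):
    """Unpickle every archive member whose name ends with one of `suffixes`.

    Hypothetical helper: the file extensions are an assumption, not a
    documented property of the archives. Returns {member_name: object}.
    """
    loaded = {}
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            # str.endswith accepts a tuple of candidate suffixes.
            if name.lower().endswith(suffixes):
                with archive.open(name) as handle:
                    loaded[name] = pickle.load(handle)
    return loaded
```

Unpickling executes code embedded in the file, so this should only be run on archives from a trusted source. The libraries the models were built with must also be importable, in compatible versions, for the load to succeed.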
1kg_30x_hgdp_ashk_ref_panel.zip contains the reference panel, in PLINK bed/bim/fam format, used for ancestry model training in GenoTools. It is accompanied by a "..._labels.txt" file that lists each sample in the dataset and its respective ancestry label for the training/testing pipeline.
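As a quick sanity check after unzipping the panel, the sample list and the label distribution can be inspected with plain Python. This is a sketch under stated assumptions: the .fam layout (whitespace-delimited, sample ID in the second column) is the standard PLINK format, but the layout of the labels file is not described here, so `count_labels` assumes a delimited text file with the ancestry label in the last column and an optional header row. Both function names are hypothetical.

```python
from collections import Counter
from pathlib import Path


def read_fam_sample_ids(fam_path):
    """Return the within-family sample IDs (column 2) of a PLINK .fam file."""
    ids = []
    for line in Path(fam_path).read_text().splitlines():
        fields = line.split()
        if len(fields) >= 2:  # skip blank or malformed lines
            ids.append(fields[1])
    return ids


def count_labels(labels_path, label_column=-1, has_header=True):
    """Tally ancestry labels in a whitespace-delimited labels file.

    ASSUMPTION: the label sits in `label_column` (last column by default)
    and the first row is a header when `has_header` is True. Adjust both
    to match the actual file.
    """
    lines = Path(labels_path).read_text().splitlines()
    if has_header:
        lines = lines[1:]
    counts = Counter()
    for line in lines:
        fields = line.split()
        if fields:  # ignore empty lines
            counts[fields[label_column]] += 1
    return counts
```

Comparing `len(read_fam_sample_ids(...))` with `sum(count_labels(...).values())` is a simple way to confirm that every sample in the panel has a label.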
Article title
GenoTools: An Open-Source Python Package for Efficient Genotype Data Quality Control and Analysis