-
sushie.io.read_vcf(path: str) → Tuple[DataFrame, DataFrame, Array]
- Read in genotype data in VCF format.
The cyvcf2 package is used to read the VCF file.
gt_types are used to build the genotype matrix. If a genotype is UNKNOWN, it is coded as NA.
- Parameters:
- path: str
The path to the VCF genotype file (full file name). Genotypes are coded as counts of the REF allele.
- Returns:
- A tuple of SNP information (bim; pd.DataFrame), participant information (fam; pd.DataFrame), and the genotype matrix (bed; Array).
- Return type:
Tuple[pd.DataFrame, pd.DataFrame, Array]
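The genotype coding described above can be illustrated with a minimal sketch. It is not the sushie implementation: it assumes cyvcf2's gts012=True code scheme (0 = HOM_REF, 1 = HET, 2 = HOM_ALT, 3 = UNKNOWN), and the helper name ref_allele_counts is hypothetical. It only shows how genotype codes could map to REF allele counts, with UNKNOWN coded as NA (NaN here).

```python
import math

# Assumed cyvcf2 genotype codes when a VCF is opened with gts012=True.
HOM_REF, HET, HOM_ALT, UNKNOWN = 0, 1, 2, 3


def ref_allele_counts(gt_types):
    """Hypothetical helper: map genotype codes to REF allele counts.

    HOM_REF -> 2.0, HET -> 1.0, HOM_ALT -> 0.0, UNKNOWN -> NaN (missing).
    """
    return [math.nan if g == UNKNOWN else float(2 - g) for g in gt_types]


# One variant observed in four samples, one of them with a missing call.
codes = [HOM_REF, HET, HOM_ALT, UNKNOWN]
counts = ref_allele_counts(codes)
print(counts)
```

In actual use, the function is called as bim, fam, bed = sushie.io.read_vcf(path), and bed holds one such row of counts per variant or per sample, depending on the orientation sushie uses, which this page does not state.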
Last update:
Oct 27, 2024