sushie.infer.make_cs(alpha: Array | ndarray | bool | number | bool | int | float | complex, ns: Array | ndarray | bool | number | bool | int | float | complex, Xs: Array | ndarray | bool | number | bool | int | float | complex | None = None, lds: Array | ndarray | bool | number | bool | int | float | complex | None = None, threshold: float = 0.9, purity: float = 0.5, purity_method: str = 'weighted', max_select: int = 500, seed: int = 12345) → Tuple[DataFrame, DataFrame, Array, Array]

Compute credible sets from the posterior probabilities in alpha and prune them for purity.

Parameters:
alpha: Array | ndarray | bool | number | bool | int | float | complex

\(L \times p\) matrix of posterior probabilities that each SNP is causal (i.e., \(\alpha\) in Model Description).

Xs: Array | ndarray | bool | number | bool | int | float | complex | None = None

Genotype data for multiple ancestries. It cannot be None if lds is None.

lds: Array | ndarray | bool | number | bool | int | float | complex | None = None

LD matrix for multiple ancestries. It cannot be None if Xs is None.

ns: Array | ndarray | bool | number | bool | int | float | complex

Sample size for each ancestry.

threshold: float = 0.9

The credible set threshold.
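
As a rough illustration of what a threshold of this kind usually means in SuSiE-style fine-mapping, the sketch below builds a credible set from a single row of alpha by adding SNPs in decreasing order of posterior probability until the cumulative mass reaches the threshold. The helper name `credible_set` and the toy values are assumptions made for this example; this is not sushie's own code, and its exact tie-breaking or inequality may differ.

```python
import numpy as np

def credible_set(alpha_row, threshold=0.9):
    """Hypothetical helper: smallest set of SNP indices whose
    cumulative posterior probability reaches `threshold`."""
    alpha_row = np.asarray(alpha_row, dtype=float)
    order = np.argsort(-alpha_row)        # SNP indices, largest probability first
    cum = np.cumsum(alpha_row[order])     # running posterior mass
    # Position of the first cumulative value that is >= threshold.
    k = int(np.searchsorted(cum, threshold, side="left")) + 1
    return order[:k]                      # all SNPs if the threshold is never reached

# Toy row of alpha for five SNPs (illustrative values only).
alpha_row = np.array([0.05, 0.55, 0.02, 0.36, 0.02])
cs = credible_set(alpha_row, threshold=0.9)
print(sorted(cs.tolist()))
```

A larger threshold gives a larger (more conservative) set; a smaller one gives a tighter set that is less likely to contain the causal SNP.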

purity: float = 0.5

The minimum pairwise correlation among the SNPs in a credible set required for the set to be included in the output.
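
To make the purity filter concrete, here is a minimal single-ancestry sketch that takes an LD (correlation) matrix and the SNP indices of a credible set and returns the smallest absolute pairwise correlation among them. The function name `min_abs_corr`, the use of absolute values, and the toy LD matrix are assumptions for illustration; how sushie combines purity across ancestries (see purity_method) is not shown here.

```python
import numpy as np

def min_abs_corr(ld, idx):
    """Hypothetical helper: minimum absolute pairwise correlation
    among the SNPs in `idx`, read from the LD matrix `ld`."""
    idx = np.asarray(idx)
    if idx.size < 2:
        return 1.0                            # a single SNP is trivially "pure"
    sub = np.abs(ld[np.ix_(idx, idx)])        # LD sub-matrix for the set
    upper = np.triu_indices(idx.size, k=1)    # off-diagonal pairs only
    return float(sub[upper].min())

# Toy 3-SNP LD matrix (illustrative values only).
ld = np.array([
    [1.0,  0.8,  0.1],
    [0.8,  1.0, -0.3],
    [0.1, -0.3,  1.0],
])

purity = 0.5
tight = min_abs_corr(ld, [0, 1])       # highly correlated pair
loose = min_abs_corr(ld, [0, 1, 2])    # includes a weakly correlated SNP
print(tight >= purity, loose >= purity)
```

Under this reading, the first set would be kept and the second would be pruned at the default purity of 0.5.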

purity_method: str = 'weighted'

The method to compute purity across ancestries.

max_select: int = 500

The maximum number of SNPs selected from a credible set to compute purity.

seed: int = 12345

The randomization seed for selecting SNPs in the credible set to compute purity.
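
The interplay between max_select and seed can be pictured as follows: when a credible set holds more SNPs than max_select, a reproducible random subset is drawn before pairwise correlations are computed, which bounds the cost of the purity check. This sketch uses NumPy's random generator purely for illustration; the helper `subsample_snps` is hypothetical, and sushie's own sampling routine may differ and would not produce the same draws.

```python
import numpy as np

def subsample_snps(idx, max_select=500, seed=12345):
    """Hypothetical helper: return `idx` unchanged if it is small enough,
    otherwise a seeded random subset of size `max_select`."""
    idx = np.asarray(idx)
    if idx.size <= max_select:
        return idx
    rng = np.random.default_rng(seed)                        # fixed seed gives a reproducible draw
    chosen = rng.choice(idx, size=max_select, replace=False)
    return np.sort(chosen)

big_set = np.arange(1000)      # a credible set with 1000 SNPs
small_set = np.arange(10)      # a credible set below the cap

sub = subsample_snps(big_set, max_select=500, seed=12345)
print(sub.size, subsample_snps(small_set).size)
```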

Returns:

A tuple of
  1. credible set (pd.DataFrame) after pruning for purity,

  2. full credible set (pd.DataFrame) before pruning for purity,

  3. PIPs (Array) across \(L\) credible sets,

  4. PIPs (Array) across credible sets that are not pruned. An array of zeros if all credible sets are pruned.

Return type:

Tuple[pd.DataFrame, pd.DataFrame, Array, Array]
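
For readers unfamiliar with how PIPs relate to alpha, the sketch below uses the standard SuSiE-style formula \(\text{PIP}_j = 1 - \prod_{l}(1 - \alpha_{lj})\), applied once over all \(L\) rows and once over only the rows whose credible sets survive pruning. The helper `pips_from_alpha` and the `kept` index list are assumptions for this example; that sushie computes the two returned PIP arrays exactly this way is assumed, not confirmed by this page.

```python
import numpy as np

def pips_from_alpha(alpha, rows=None):
    """Hypothetical helper: PIP_j = 1 - prod_l (1 - alpha[l, j]),
    optionally restricted to the effect rows listed in `rows`."""
    alpha = np.asarray(alpha, dtype=float)
    if rows is not None:
        alpha = alpha[np.asarray(rows, dtype=int)]
    # The product over zero rows is 1, so an empty selection yields all zeros.
    return 1.0 - np.prod(1.0 - alpha, axis=0)

# Toy alpha with L = 2 effects and p = 2 SNPs (illustrative values only).
alpha = np.array([
    [0.5, 0.5],
    [0.2, 0.8],
])

pip_all = pips_from_alpha(alpha)              # across all L effects
pip_kept = pips_from_alpha(alpha, rows=[1])   # only effect 1 survived pruning
pip_none = pips_from_alpha(alpha, rows=[])    # every credible set pruned
print(pip_all, pip_kept, pip_none)
```

The empty-selection case matches the documented behavior of returning an array of zeros when all credible sets are pruned.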


Last update: Oct 27, 2024