tirank.TrainPre.Reject_With_GMM_Bio
- tirank.TrainPre.Reject_With_GMM_Bio(pred_bulk, pred_sc, tolerance, min_components, max_components)
Performs Gaussian mixture model (GMM) based rejection for Classification and Cox modes.
This function identifies phenotype-associated clusters by fitting one GMM to the bulk scores (to find the target means, 0 and 1) and another GMM to the sc/st scores. It then selects the sc/st clusters whose means lie within the given tolerance of a bulk target mean.
- Parameters:
pred_bulk (np.ndarray) – Predicted scores from the bulk data, shape (n_samples, 1).
pred_sc (np.ndarray) – Predicted scores from the sc/st data, shape (n_cells, 1).
tolerance (float) – The maximum distance a sc/st cluster mean can be from a bulk target mean to be considered aligned.
min_components (int) – The minimum number of GMM components to try.
max_components (int) – The maximum number of GMM components to try.
- Returns:
- A binary mask of shape (n_cells, 1), where 1 indicates a cell to be rejected (phenotype-independent) and 0 indicates a cell to be kept.
- Return type:
np.ndarray
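Example (illustrative sketch only):
The alignment logic described above can be approximated as follows. This is not the TiRank implementation: the helper names fit_best_gmm and reject_with_gmm_sketch are hypothetical, the use of scikit-learn's GaussianMixture is an assumption, fitting exactly two components on the bulk side is an assumption, and choosing the number of sc/st components by lowest BIC is an assumption (the source does not state the selection criterion).

```python
import numpy as np
from sklearn.mixture import GaussianMixture


def fit_best_gmm(scores, min_components, max_components, seed=0):
    """Fit GMMs over a range of component counts and keep the lowest-BIC one.

    BIC-based selection is an assumption of this sketch.
    """
    best, best_bic = None, np.inf
    for k in range(min_components, max_components + 1):
        gmm = GaussianMixture(n_components=k, random_state=seed).fit(scores)
        bic = gmm.bic(scores)
        if bic < best_bic:
            best, best_bic = gmm, bic
    return best


def reject_with_gmm_sketch(pred_bulk, pred_sc, tolerance,
                           min_components, max_components):
    """Hypothetical re-implementation of the documented alignment idea."""
    pred_bulk = np.asarray(pred_bulk, dtype=float).reshape(-1, 1)
    pred_sc = np.asarray(pred_sc, dtype=float).reshape(-1, 1)

    # Bulk side: two components (assumed); their means act as the targets.
    bulk_gmm = GaussianMixture(n_components=2, random_state=0).fit(pred_bulk)
    targets = bulk_gmm.means_.ravel()

    # sc/st side: pick the component count within the requested range.
    sc_gmm = fit_best_gmm(pred_sc, min_components, max_components)
    sc_means = sc_gmm.means_.ravel()
    labels = sc_gmm.predict(pred_sc)

    # A cluster is aligned if its mean is within tolerance of any target.
    dist = np.abs(sc_means[:, None] - targets[None, :])
    aligned = (dist <= tolerance).any(axis=1)

    # 1 = reject (cell belongs to an unaligned cluster), 0 = keep.
    return (~aligned[labels]).astype(int).reshape(-1, 1)


# Synthetic scores: bulk clusters near 0 and 1; sc/st clusters near 0, 0.5, 1.
rng = np.random.default_rng(0)
pred_bulk = np.concatenate([rng.normal(0.05, 0.02, 50),
                            rng.normal(0.95, 0.02, 50)]).reshape(-1, 1)
pred_sc = np.concatenate([rng.normal(0.05, 0.02, 100),
                          rng.normal(0.50, 0.02, 100),
                          rng.normal(0.95, 0.02, 100)]).reshape(-1, 1)

mask = reject_with_gmm_sketch(pred_bulk, pred_sc, tolerance=0.1,
                              min_components=2, max_components=4)
print(mask.shape, int(mask.sum()))
```

In this synthetic setup the middle group of cells (scores near 0.5) sits far from both bulk targets, so it is the group the sketch marks for rejection. Real scores are rarely this cleanly separated, so the result is sensitive to tolerance.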