tirank.SCSTpreprocess.perform_sampling_on_RNAseq

tirank.SCSTpreprocess.perform_sampling_on_RNAseq(savePath, mode='SMOTE', threshold=0.5)

Resamples the bulk training data by over- or under-sampling.

This function is used to correct for class imbalance in ‘Classification’ mode. It loads the training data, applies the specified sampling method, and overwrites the training files with the resampled data.

Parameters:
  • savePath (str) – The main project directory path.

  • mode (str, optional) – The sampling method to use. One of ‘SMOTE’, ‘downsample’ (RandomUnderSampler), ‘upsample’ (RandomOverSampler), or ‘tomeklinks’ (TomekLinks). Defaults to ‘SMOTE’.

  • threshold (float, optional) – The imbalance threshold. Sampling is only performed if the minority class proportion is below this value. Defaults to 0.5.

Returns:

None
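
Example:

The threshold rule described under Parameters can be illustrated with a minimal sketch. The helper names minority_proportion and should_resample are hypothetical and are not part of TiRank, and the way TiRank computes the minority proportion internally is an assumption here. The mapping lists the imbalanced-learn classes that the documented mode names refer to; it is given as strings so the sketch runs without that library installed.

```python
from collections import Counter

# imbalanced-learn classes that the documented `mode` values refer to
# (listed as dotted paths only; nothing is imported here).
MODE_TO_SAMPLER = {
    "SMOTE": "imblearn.over_sampling.SMOTE",
    "downsample": "imblearn.under_sampling.RandomUnderSampler",
    "upsample": "imblearn.over_sampling.RandomOverSampler",
    "tomeklinks": "imblearn.under_sampling.TomekLinks",
}


def minority_proportion(labels):
    """Return the fraction of samples that belong to the rarest class."""
    counts = Counter(labels)
    return min(counts.values()) / len(labels)


def should_resample(labels, threshold=0.5):
    """Hypothetical check: resample only if the minority share is below threshold."""
    return minority_proportion(labels) < threshold


# Eight samples of class 0 and two of class 1: the minority share is 0.2.
imbalanced = [0] * 8 + [1] * 2
# Five samples of each class: the minority share is exactly 0.5.
balanced = [0] * 5 + [1] * 5

print(should_resample(imbalanced))  # 0.2 < 0.5, so sampling would run
print(should_resample(balanced))    # 0.5 is not below 0.5, so it would be skipped

# The documented call itself would then look like this
# ("./my_project" is a placeholder path):
# perform_sampling_on_RNAseq("./my_project", mode="SMOTE", threshold=0.5)
```

Because the function overwrites the training files, it may be worth keeping a copy of the original files before calling it.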