tirank.Visualization
- tirank.Visualization.create_tensor(data_matrix)[source]
Converts a numpy array or list-like object to a float32 PyTorch tensor.
- Parameters:
data_matrix (np.ndarray or list) – The input data matrix.
- Returns:
The data converted to a float32 PyTorch tensor.
- Return type:
torch.Tensor
- tirank.Visualization.plot_loss(train_loss_dict, savePath='./loss_on_epoch.png')[source]
Plots and saves the training loss curves over epochs.
This function takes a dictionary of loss values recorded at each epoch, plots the trends for each loss type on a single graph, and saves the plot to a file.
- Parameters:
train_loss_dict (dict) – A dictionary where keys are epoch identifiers (e.g., ‘Epoch_1’) and values are dictionaries mapping loss names (e.g., ‘total_loss’) to their numerical values.
savePath (str, optional) – The file path to save the resulting loss plot. Defaults to “./loss_on_epoch.png”.
- Returns:
None
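The nested layout expected for train_loss_dict can be easier to read as code. The sketch below is not the library's implementation. It only shows one way to pivot such a dictionary into one series per loss name, which is the shape a line plot needs. The helper name `losses_by_name` and the loss name `reg_loss` are made up for illustration.

```python
def losses_by_name(train_loss_dict):
    """Pivot {'Epoch_1': {'total_loss': 1.2, ...}, ...} into {'total_loss': [1.2, ...], ...}."""
    series = {}
    # Dictionaries keep insertion order, so epochs are visited in recorded order.
    for epoch_key in train_loss_dict:
        for loss_name, value in train_loss_dict[epoch_key].items():
            series.setdefault(loss_name, []).append(float(value))
    return series


# Illustrative values only; 'reg_loss' is a hypothetical loss name.
train_loss_dict = {
    "Epoch_1": {"total_loss": 1.20, "reg_loss": 0.30},
    "Epoch_2": {"total_loss": 0.90, "reg_loss": 0.25},
    "Epoch_3": {"total_loss": 0.70, "reg_loss": 0.20},
}
curves = losses_by_name(train_loss_dict)
```

Each list in `curves` can then be drawn as one line against the epoch index.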
- tirank.Visualization.model_predict(model, data_tensor, mode)[source]
Generates predictions from a trained model based on the specified mode.
- Parameters:
model (torch.nn.Module) – The trained PyTorch model to use for prediction.
data_tensor (torch.Tensor) – The input data as a PyTorch tensor.
mode (str) – The operational mode, determining how to interpret the model’s output. Expected values are “Cox”, “Classification”, or “Regression”.
- Returns:
- A tuple containing:
pred_label (np.ndarray): Predicted labels. For “Classification”, these are the class indices. For “Regression” and “Cox”, this is the same as pred_prob.
pred_prob (np.ndarray): Predicted probability scores. For “Classification”, this is the probability of class 1.
- Return type:
tuple
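The relationship between mode and the returned pair can be sketched without PyTorch. This is a simplified illustration under stated assumptions, not the function's actual code: plain lists stand in for tensors, classification outputs are assumed to already be per-class probabilities, and the helper name `interpret_output` is hypothetical.

```python
def interpret_output(outputs, mode):
    """Hypothetical helper: split per-sample outputs into (pred_label, pred_prob)."""
    if mode == "Classification":
        # Assumption: each row holds class probabilities, e.g. [p_class0, p_class1].
        pred_prob = [row[1] for row in outputs]
        pred_label = [max(range(len(row)), key=lambda i: row[i]) for row in outputs]
    elif mode in ("Regression", "Cox"):
        # Assumption: each row holds a single score; the labels simply mirror it.
        pred_prob = [row[0] for row in outputs]
        pred_label = list(pred_prob)
    else:
        # Assumption: the documentation does not say how unknown modes are handled.
        raise ValueError(f"Unsupported mode: {mode!r}")
    return pred_label, pred_prob


cls_labels, cls_probs = interpret_output([[0.8, 0.2], [0.1, 0.9]], "Classification")
cox_labels, cox_probs = interpret_output([[0.5], [1.5]], "Cox")
```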
- tirank.Visualization.plot_score_distribution(savePath)[source]
Plots the density distribution of prediction scores for bulk and single-cell data.
This function loads prediction dataframes for bulk and single-cell experiments from pickled files, plots their “Pred_score” distributions on a single density plot, and saves the figure.
- Parameters:
savePath (str) – The root directory containing the ‘3_Analysis’ subfolder, which must hold ‘saveDF_bulk.pkl’ and ‘saveDF_sc.pkl’.
- Returns:
None
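Because this function reads fixed file names from a fixed subfolder, it can help to check the layout before calling it. The following is a minimal sketch of such a check. The helper `missing_score_files` is hypothetical and not part of tirank; only the folder and file names come from the description above.

```python
import tempfile
from pathlib import Path

# File names taken from the parameter description of plot_score_distribution.
REQUIRED_FILES = ("saveDF_bulk.pkl", "saveDF_sc.pkl")


def missing_score_files(savePath):
    """Return the required file names that are absent from <savePath>/3_Analysis."""
    analysis_dir = Path(savePath) / "3_Analysis"
    return [name for name in REQUIRED_FILES if not (analysis_dir / name).is_file()]


# Demonstration in a throwaway directory that contains only the bulk file.
with tempfile.TemporaryDirectory() as tmp:
    analysis = Path(tmp) / "3_Analysis"
    analysis.mkdir()
    (analysis / "saveDF_bulk.pkl").touch()
    missing = missing_score_files(tmp)
```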
- tirank.Visualization.plot_score_umap(savePath, infer_mode)[source]
Visualizes TiRank prediction scores and labels on UMAP and spatial plots.
This function loads an AnnData object and the corresponding prediction scores, then generates and saves plots according to the inference mode:
- For “SC” (single-cell) mode, it saves UMAP plots colored by score and label.
- For “ST” (spatial) mode, it saves UMAP and spatial plots colored by score and label.
- Parameters:
savePath (str) – The base directory containing ‘2_preprocessing’ and ‘3_Analysis’ subdirectories.
infer_mode (str) – The type of data being plotted, either “SC” or “ST”.
- Returns:
None
- Raises:
ValueError – If infer_mode is not “SC” or “ST”.
- tirank.Visualization.plot_label_distribution_among_conditions(savePath, group)[source]
Plots the proportional distribution of TiRank labels within different groups.
This function loads prediction scores and calculates the frequency and proportion of each ‘Rank_Label’ (‘Rank+’, ‘Rank-’, ‘Background’) within the categories of a specified ‘group’ column (e.g., cell type, cluster). It then saves a bar plot of these proportions.
- Parameters:
savePath (str) – The base directory containing the ‘3_Analysis’ subfolder.
group (str) – The column name in ‘spot_predict_score.csv’ to use for grouping the data.
- Returns:
None
- Raises:
ValueError – If the specified group column is not found in the loaded DataFrame.
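The frequency-and-proportion step described above can be sketched in plain Python. This is an illustration only: a list of dictionaries stands in for the rows of ‘spot_predict_score.csv’, the column name `cluster` is an example, and `label_proportions` is a hypothetical helper rather than the library's code.

```python
from collections import Counter, defaultdict


def label_proportions(rows, group):
    """Return {group_value: {rank_label: proportion}} for the given grouping column."""
    if any(group not in row for row in rows):
        # Mirrors the documented ValueError for a missing group column.
        raise ValueError(f"Column {group!r} not found")
    counts = defaultdict(Counter)
    for row in rows:
        counts[row[group]][row["Rank_Label"]] += 1
    return {
        value: {label: n / sum(counter.values()) for label, n in counter.items()}
        for value, counter in counts.items()
    }


# Illustrative rows; real data would come from the prediction score table.
rows = [
    {"cluster": "A", "Rank_Label": "Rank+"},
    {"cluster": "A", "Rank_Label": "Rank+"},
    {"cluster": "A", "Rank_Label": "Rank-"},
    {"cluster": "A", "Rank_Label": "Background"},
    {"cluster": "B", "Rank_Label": "Rank-"},
    {"cluster": "B", "Rank_Label": "Rank-"},
]
proportions = label_proportions(rows, "cluster")
```

The proportions within each group sum to 1, which is what a proportional bar plot per group displays.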
- tirank.Visualization.plot_STmap(savePath, group)[source]
Generates a composite spatial map for ST data showing cluster hubs.
This function is for Spatial Transcriptomics (ST) data. It loads prediction scores, cluster-to-rank mappings from a JSON file, and the AnnData object. It creates a new ‘new_Rank_Label’ based on the hub classification (‘Rank+’, ‘Rank-’, ‘Background’) of each spot’s group. It then saves a figure with three subplots:
1. Spatial plot colored by the original group.
2. The H&E image alone.
3. Spatial plot colored by ‘new_Rank_Label’ overlaid on the H&E image.
- Parameters:
savePath (str) – The base directory containing ‘2_preprocessing’ and ‘3_Analysis’.
group (str) – The column name used for grouping (e.g., ‘cluster’) which corresponds to the JSON file (f"{group}_category_dict.json").
- Returns:
None
- Raises:
ValueError – If the specified group column is not found in the loaded DataFrame.
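The relabeling step, in which each spot inherits the hub category of its group, might look like the sketch below. The structure of the JSON file is not documented here, so the flat mapping from group value to category is an assumption, as are the helper name `relabel_spots` and the fallback to ‘Background’ for groups missing from the mapping.

```python
import json

# Assumed file content: a flat mapping from group value to hub category.
category_json = '{"0": "Rank+", "1": "Rank-", "2": "Background"}'
category_dict = json.loads(category_json)


def relabel_spots(spot_groups, category_dict, default="Background"):
    """Map each spot's group to its hub category (JSON keys are always strings)."""
    # The default for unmapped groups is an illustrative choice, not documented behavior.
    return [category_dict.get(str(group), default) for group in spot_groups]


new_rank_label = relabel_spots([0, 1, 1, 2, 0], category_dict)
```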
- tirank.Visualization.DEG_analysis(savePath, fc_threshold=2, Pvalue_threshold=0.05, do_p_adjust=True)[source]
Performs and saves Differential Gene Expression (DEG) analysis.
This function loads a finalized AnnData object, computes DEGs between ‘Rank+’ and ‘Rank-’ groups using ‘wilcoxon’, saves all results, and then saves a filtered list of DEGs based on log-fold-change and p-value thresholds.
- Parameters:
savePath (str) – The base directory containing the ‘3_Analysis’ subfolder, where ‘final_anndata.h5ad’ is located and results will be saved.
fc_threshold (float, optional) – The fold-change threshold for filtering. Defaults to 2.
Pvalue_threshold (float, optional) – The p-value threshold for filtering. Defaults to 0.05.
do_p_adjust (bool, optional) – If True, use adjusted p-values for filtering. If False, use raw p-values. Defaults to True.
- Returns:
None
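The filtering step can be sketched as follows. Two things here are assumptions rather than documented behavior: that fc_threshold is a linear fold change compared on the log2 scale (so the default of 2 means |log2FC| >= 1), and that the result table uses the column names produced by scanpy's rank_genes_groups output (`names`, `logfoldchanges`, `pvals`, `pvals_adj`). The helper `filter_degs` is hypothetical.

```python
import math


def filter_degs(rows, fc_threshold=2, Pvalue_threshold=0.05, do_p_adjust=True):
    """Keep genes passing both the fold-change and the p-value threshold."""
    p_key = "pvals_adj" if do_p_adjust else "pvals"
    # Assumption: the threshold is applied to the absolute log2 fold change.
    lfc_cut = math.log2(fc_threshold)
    return [
        row for row in rows
        if abs(row["logfoldchanges"]) >= lfc_cut and row[p_key] < Pvalue_threshold
    ]


# Illustrative values only.
deg_rows = [
    {"names": "G1", "logfoldchanges": 2.0, "pvals": 0.001, "pvals_adj": 0.01},
    {"names": "G2", "logfoldchanges": 0.5, "pvals": 0.0001, "pvals_adj": 0.001},
    {"names": "G3", "logfoldchanges": -1.5, "pvals": 0.01, "pvals_adj": 0.2},
]
kept_adjusted = [row["names"] for row in filter_degs(deg_rows)]
kept_raw = [row["names"] for row in filter_degs(deg_rows, do_p_adjust=False)]
```

G3 passes only when raw p-values are used, which shows how do_p_adjust can change the filtered list.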
- tirank.Visualization.DEG_volcano(savePath, fc_threshold=2, Pvalue_threshold=0.05, do_p_adjust=True, top_n=5)[source]
Generates and saves a volcano plot for DEG results.
This function loads the ‘All DEGs dataframe.csv’ file, creates a volcano plot (Log2(FoldChange) vs -Log10(P-value)), colors genes based on significance thresholds, and annotates the top N most significant up- and down-regulated genes.
- Parameters:
savePath (str) – The base directory containing the ‘3_Analysis’ subfolder.
fc_threshold (float, optional) – Fold-change threshold for coloring and vertical lines. Defaults to 2.
Pvalue_threshold (float, optional) – P-value threshold for coloring and the horizontal line. Defaults to 0.05.
do_p_adjust (bool, optional) – If True, use adjusted p-values for the Y-axis and filtering. If False, use raw p-values. Defaults to True.
top_n (int, optional) – The number of top up- and down-regulated genes to annotate. Defaults to 5.
- Returns:
None
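The coordinates and coloring of a volcano plot follow a simple rule that can be written down directly. The sketch below is not the library's plotting code. It assumes the fold-change threshold is compared on the log2 scale and that "most significant" means the smallest p-value. The helper names are hypothetical.

```python
import math


def volcano_points(rows, fc_threshold=2, Pvalue_threshold=0.05):
    """Return (gene, x, y, status) tuples with x = log2FC and y = -log10(p)."""
    lfc_cut = math.log2(fc_threshold)  # assumed log2-scale comparison
    points = []
    for gene, lfc, p in rows:
        y = -math.log10(max(p, 1e-300))  # guard against p == 0
        if p < Pvalue_threshold and lfc >= lfc_cut:
            status = "up"
        elif p < Pvalue_threshold and lfc <= -lfc_cut:
            status = "down"
        else:
            status = "ns"  # not significant
        points.append((gene, lfc, y, status))
    return points


def top_genes(points, status, top_n=5):
    """Pick the top_n genes of one status, ranked by -log10(p)."""
    hits = [pt for pt in points if pt[3] == status]
    return [pt[0] for pt in sorted(hits, key=lambda pt: pt[2], reverse=True)[:top_n]]


# Illustrative (gene, log2FC, p-value) triples.
rows = [("A", 2.0, 1e-6), ("B", 1.5, 1e-3), ("C", -3.0, 1e-4), ("D", 0.2, 1e-8), ("E", 2.5, 0.5)]
points = volcano_points(rows)
top_up = top_genes(points, "up", top_n=1)
top_down = top_genes(points, "down", top_n=1)
```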
- tirank.Visualization.Pathway_Enrichment(savePath, database='KEGG_2016')[source]
Performs and plots pathway enrichment analysis on DEGs.
This function loads the filtered ‘Differentially expressed genes data frame.csv’, separates genes into up-regulated and down-regulated lists, and runs ‘gseapy.enrichr’ on the up, down, and all DEG lists using the specified database. It saves the enrichment tables and dot plots.
- Parameters:
savePath (str) – The base directory containing the ‘3_Analysis’ subfolder.
database (str or list, optional) – The gene set library or libraries to use for enrichment (e.g., “KEGG_2016”, [“GO_Biological_Process_2021”]). Defaults to “KEGG_2016”.
- Returns:
None
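Separating the DEG table into the three gene lists can be sketched as below. The split by the sign of the log fold change and the column names (`names`, `logfoldchanges`) are assumptions, and `split_by_direction` is a hypothetical helper, not the library's code.

```python
def split_by_direction(rows):
    """Return the up-regulated, down-regulated and complete gene lists."""
    up = [row["names"] for row in rows if row["logfoldchanges"] > 0]
    down = [row["names"] for row in rows if row["logfoldchanges"] < 0]
    return {"up": up, "down": down, "all": [row["names"] for row in rows]}


# Illustrative rows standing in for the filtered DEG table.
deg_rows = [
    {"names": "G1", "logfoldchanges": 2.0},
    {"names": "G3", "logfoldchanges": -1.5},
    {"names": "G4", "logfoldchanges": 1.2},
]
gene_lists = split_by_direction(deg_rows)
```

Each list would then serve as the `gene_list` argument of `gseapy.enrichr`, with the chosen database passed as `gene_sets`. The call is left out here because it queries the Enrichr web service.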
- tirank.Visualization.evaluate_on_test_data(model, test_set, data_path, save_path, bulk_gene_pairs_mat)[source]
Evaluates the model on external bulk RNA-seq test datasets.
This function iterates through a list of test dataset IDs. For each dataset, it loads the expression and clinical metadata, transforms the expression data into the gene-pair format, predicts labels using the model, and saves the predictions along with a confusion matrix plot.
- Parameters:
model (torch.nn.Module) – The trained classification model.
test_set (list of str) – A list of dataset identifiers (e.g., [‘GSE_ID1’]) to be loaded from data_path.
data_path (str) – The directory containing the test data files, which should be named like ‘{data_id}_meta.csv’ and ‘{data_id}_exp.csv’.
save_path (str) – The root directory where results will be saved. A ‘bulk_test’ subdirectory will be created here.
bulk_gene_pairs_mat (pd.DataFrame) – The gene-pair matrix used as a template to transform the test expression data.
- Returns:
None
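Two pieces of this workflow are easy to illustrate in isolation: the expected file names per dataset, and the counts behind a confusion matrix. The sketch below assumes binary labels coded 0 and 1, with rows for true labels and columns for predicted labels. Both helpers are hypothetical and do not reproduce the library's implementation.

```python
from pathlib import Path


def dataset_files(data_path, data_id):
    """Build the metadata and expression file paths for one test dataset."""
    base = Path(data_path)
    return base / f"{data_id}_meta.csv", base / f"{data_id}_exp.csv"


def confusion_matrix(true_labels, pred_labels, n_classes=2):
    """Count samples per (true, predicted) pair; rows are true labels."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for true, pred in zip(true_labels, pred_labels):
        matrix[true][pred] += 1
    return matrix


meta_file, exp_file = dataset_files("data", "GSE_ID1")
matrix = confusion_matrix([0, 0, 1, 1, 1], [0, 1, 1, 1, 0])
```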
- tirank.Visualization.create_boxplot(data, title, ax, group_column='True Label', score_column='Predicted Score')[source]
Creates a boxplot on a given axis with a Mann-Whitney U test.
- Parameters:
data (pd.DataFrame) – DataFrame containing the plot data.
title (str) – Title for the subplot.
ax (matplotlib.axes.Axes) – The matplotlib axis to plot on.
group_column (str, optional) – The column for the x-axis groups (must contain two groups, 0 and 1). Defaults to “True Label”.
score_column (str, optional) – The column for the y-axis numerical values. Defaults to “Predicted Score”.
- Returns:
None
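For readers unfamiliar with the test named above, the Mann-Whitney U statistic can be computed by direct pair counting. This sketch covers only the statistic, not the p-value or the plotting, and it is not how the function itself is implemented. The helper `mann_whitney_u` is hypothetical, and the data use the default column names.

```python
def mann_whitney_u(sample_a, sample_b):
    """U statistic for sample_a: each pair with a > b counts 1, each tie counts 0.5."""
    u = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


# Illustrative data using the default column names of create_boxplot.
data = [
    {"True Label": 0, "Predicted Score": 0.1},
    {"True Label": 0, "Predicted Score": 0.2},
    {"True Label": 0, "Predicted Score": 0.3},
    {"True Label": 1, "Predicted Score": 0.4},
    {"True Label": 1, "Predicted Score": 0.5},
    {"True Label": 1, "Predicted Score": 0.6},
]
scores_0 = [row["Predicted Score"] for row in data if row["True Label"] == 0]
scores_1 = [row["Predicted Score"] for row in data if row["True Label"] == 1]
u_1 = mann_whitney_u(scores_1, scores_0)
u_0 = mann_whitney_u(scores_0, scores_1)
```

The two statistics always sum to the number of pairs, here 3 × 3 = 9. Perfect separation of the groups puts all of that on one side.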
- tirank.Visualization.create_density_plot(data, label, ax, title)[source]
Creates a single density (KDE) plot on a given axis.
- Parameters:
data (pd.Series or np.ndarray) – The data to plot.
label (str) – The label for the data series in the legend.
ax (matplotlib.axes.Axes) – The matplotlib axis to plot on.
title (str) – Title for the subplot.
- Returns:
None
- tirank.Visualization.create_hist_plot(data, ax, title)[source]
Creates a histogram with a KDE overlay on a given axis.
- Parameters:
data (pd.Series or np.ndarray) – The data to plot.
ax (matplotlib.axes.Axes) – The matplotlib axis to plot on.
title (str) – Title for the subplot.
- Returns:
None
- tirank.Visualization.create_comparison_density_plot(data1, label1, data2, label2, ax, title)[source]
Creates a density plot comparing two distributions on a given axis.
- Parameters:
data1 (pd.Series or np.ndarray) – The first data series.
label1 (str) – Label for the first data series.
data2 (pd.Series or np.ndarray) – The second data series.
label2 (str) – Label for the second data series.
ax (matplotlib.axes.Axes) – The matplotlib axis to plot on.
title (str) – Title for the subplot.
- Returns:
None
- tirank.Visualization.plot_genepair(df, data_type, savePath=None)[source]
Plots and saves a clustered heatmap of a gene-pair matrix.
If the input DataFrame has more rows than columns, it is sampled to be square. Hierarchical clustering (‘average’ linkage, ‘euclidean’ metric) is then applied to both rows and columns, and the resulting reordered DataFrame is plotted as a heatmap.
- Parameters:
df (pd.DataFrame) – The gene-pair DataFrame (e.g., samples vs. gene-pairs).
data_type (str) – A string identifier (e.g., “bulk”, “sc”) used to name the output file.
savePath (str, optional) – The root directory containing the ‘2_preprocessing’ subfolder where the plot will be saved. Defaults to None.
- Returns:
None
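The row-sampling step that makes a tall matrix square can be sketched as below. How the library samples rows is not documented here, so random sampling without replacement with a fixed seed is an assumption, and `sample_to_square` is a hypothetical helper operating on a list of rows instead of a DataFrame. The clustering and heatmap steps are not shown.

```python
import random


def sample_to_square(rows, seed=0):
    """If there are more rows than columns, keep a random subset of rows."""
    n_cols = len(rows[0])
    if len(rows) <= n_cols:
        return rows  # already square or wide: nothing to sample
    rng = random.Random(seed)
    # Sort the sampled indices so the kept rows preserve their original order.
    keep = sorted(rng.sample(range(len(rows)), n_cols))
    return [rows[i] for i in keep]


# Illustrative 5 x 3 matrix (samples x gene pairs).
tall = [[i, i + 1, i + 2] for i in range(5)]
square = sample_to_square(tall)
wide = sample_to_square([[1, 2, 3], [4, 5, 6]])
```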
Functions
DEG_analysis | Performs and saves Differential Gene Expression (DEG) analysis.
DEG_volcano | Generates and saves a volcano plot for DEG results.
Pathway_Enrichment | Performs and plots pathway enrichment analysis on DEGs.
create_boxplot | Creates a boxplot on a given axis with a Mann-Whitney U test.
create_comparison_density_plot | Creates a density plot comparing two distributions on a given axis.
create_density_plot | Creates a single density (KDE) plot on a given axis.
create_hist_plot | Creates a histogram with a KDE overlay on a given axis.
create_tensor | Converts a numpy array or list-like object to a float32 PyTorch tensor.
evaluate_on_test_data | Evaluates the model on external bulk RNA-seq test datasets.
model_predict | Generates predictions from a trained model based on the specified mode.
plot_STmap | Generates a composite spatial map for ST data showing cluster hubs.
plot_genepair | Plots and saves a clustered heatmap of a gene-pair matrix.
plot_label_distribution_among_conditions | Plots the proportional distribution of TiRank labels within different groups.
plot_loss | Plots and saves the training loss curves over epochs.
plot_score_distribution | Plots the density distribution of prediction scores for bulk and single-cell data.
plot_score_umap | Visualizes TiRank prediction scores and labels on UMAP and spatial plots.