sc_reconstruction.metrics.compute_biological_metrics

sc_reconstruction.metrics.compute_biological_metrics(adata_true, adata_pred, *, s_genes=None, g2m_genes=None, geneset_dict=None, progeny_model=None, pathway_dict=None, cytokine_dict=None, deg_refs=None, min_cells=20)

Run every biological metric whose required resources are available.

A metric whose required resource is missing is skipped and reported as NaN, and one warning is emitted per skipped metric. The function is therefore safe to call with whichever inputs are available.
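The skip-and-report behaviour described above can be illustrated with a minimal sketch. This is not the library's implementation: `run_available_metrics` and the toy metric are hypothetical names introduced only to show the pattern of warning once per missing resource and recording NaN.

```python
import math
import warnings
from typing import Callable, Mapping, Optional


def run_available_metrics(
    metrics: Mapping[str, tuple[Optional[object], Callable[[object], float]]],
) -> dict[str, float]:
    """Hypothetical stand-in: run each metric whose resource is present, else report NaN."""
    results: dict[str, float] = {}
    for name, (resource, fn) in metrics.items():
        if resource is None:
            # Missing resource: warn once for this metric and record NaN.
            warnings.warn(f"Skipping {name}: required resource is missing", stacklevel=2)
            results[name] = math.nan
        else:
            results[name] = fn(resource)
    return results


# Toy metric (illustration only): the number of gene sets supplied.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    scores = run_available_metrics(
        {
            "coexpression": ({"SET_A": ["G1", "G2"]}, lambda d: float(len(d))),
            "cytokine": (None, lambda d: float(len(d))),
        }
    )
```

Here `scores` always contains one entry per requested metric, so downstream code can rely on a fixed set of keys and check for NaN instead of handling missing keys.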

Parameters:
  • s_genes (Sequence[str] | None) – Required for cellcycle_*.

  • g2m_genes (Sequence[str] | None) – Required for cellcycle_*.

  • geneset_dict (Mapping[str, Sequence[str]] | None) – Required for coexpression. If None, the wrapper tries to fetch the MSigDB Hallmark gene sets via omnipath; pass it explicitly to avoid network access.

  • progeny_model (DataFrame | None) – Required for pathway. If both progeny_model and pathway_dict are None, fetched via decoupler.

  • pathway_dict (Mapping[str, Sequence[str]] | None) – Required for pathway. If both progeny_model and pathway_dict are None, fetched via decoupler.

  • cytokine_dict (Mapping[str, Sequence[str]] | None) – Required for cytokine. No fetch fallback.

  • deg_refs (tuple[anndata.AnnData, anndata.AnnData] | None) – Optional (ref_true, ref_pred) AnnData pair for deg_*.

  • min_cells (int) – Minimum cells-per-gene cutoff forwarded to metric_cellcycle, metric_coexpression, metric_pathway and metric_deg. Lower this on small test sets (e.g. 100-cell tutorial slices) where the default of 20 would filter most cell-cycle / signature genes.

  • adata_true (anndata.AnnData)

  • adata_pred (anndata.AnnData)
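To see why lowering min_cells matters on small test sets, the sketch below applies a cells-per-gene cutoff to 100 synthetic cells. It assumes the cutoff means "detected (count > 0) in at least min_cells cells", which this page does not spell out; `genes_passing_cutoff` and the gene names are hypothetical and not part of the library.

```python
def genes_passing_cutoff(counts, gene_names, min_cells):
    """Return genes detected (count > 0) in at least `min_cells` cells.

    Assumed semantics, for illustration only. `counts` is a list of
    per-cell rows, each row aligned with `gene_names`.
    """
    kept = []
    for j, gene in enumerate(gene_names):
        n_cells = sum(1 for row in counts if row[j] > 0)
        if n_cells >= min_cells:
            kept.append(gene)
    return kept


# 100 synthetic cells: GENE_A detected in 50 cells, GENE_B in 10, GENE_C in 4.
genes = ["GENE_A", "GENE_B", "GENE_C"]
counts = [
    [1 if i < 50 else 0, 1 if i < 10 else 0, 1 if i < 4 else 0]
    for i in range(100)
]

kept_default = genes_passing_cutoff(counts, genes, min_cells=20)
kept_lowered = genes_passing_cutoff(counts, genes, min_cells=5)
```

Under this assumption, the default of 20 keeps only GENE_A, while a cutoff of 5 also keeps GENE_B.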

Return type:

dict[str, float]
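Because skipped metrics are reported as NaN, a caller will usually want to separate computed scores from skipped ones. The sketch below does this with a hand-written placeholder dict. The keys and values are illustrative only and are not actual output of this function.

```python
import math

# Placeholder standing in for the returned dict; keys and values are illustrative only.
scores = {"coexpression": 0.5, "pathway": float("nan"), "cytokine": float("nan")}

# NaN never compares equal to itself, so use math.isnan rather than `==`.
computed = {name: value for name, value in scores.items() if not math.isnan(value)}
skipped = sorted(name for name, value in scores.items() if math.isnan(value))
```

Reporting `skipped` alongside `computed` makes it explicit which metrics were not run, instead of silently dropping them from a summary table.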