scCS.multicomparison
====================

.. py:module:: scCS.multicomparison

.. autoapi-nested-parse::

   multicomparison.py — MultiScorer: multi-condition (3+) commitment score analysis for scCS.

   Extends the pairwise PairScorer to handle 3 or more experimental conditions
   (e.g., multiple drug treatments, time points, genotypes).

   Key design principle: shared embedding
   ---------------------------------------
   All conditions are embedded in a SINGLE shared star layout built on the pooled
   data.  This is critical — if each condition had its own embedding, the arm
   angles would differ and CS values would not be comparable across conditions.

   Architecture
   ------------
   MultiScorer
       Wraps SingleScorer.  Pools all conditions for embedding, then scores
       each condition separately using cell masks on the shared embedding.

   Tier 1 — Core multi-condition API
       score_all_conditions()          : dict[condition -> CommitmentScoreResult]

   Tier 2 — Omnibus + post-hoc statistical comparison
       compare_omnibus()               : Kruskal-Wallis / ANOVA per fate
       compare_posthoc()               : Dunn / Tukey / Conover pairwise post-hoc
       compute_pairwise_deltas()       : ΔCS for ALL condition pairs with bootstrap CI

   Tier 3 — Advanced
       fit_mixed_model()               : linear mixed-effects model on per-cell
                                         fate affinity scores via statsmodels MixedLM
       fit_mixed_model_contrasts()     : LMM with custom condition contrasts
       trajectory_shift()              : KS test + Wasserstein distance on
                                         pseudotime distributions per fate arm
       plot_trajectory_shift()         : visualization of pseudotime distributions

   Usage
   -----
   >>> mscorer = scCS.MultiScorer(
   ...     adata,
   ...     root='17',
   ...     branches=['homeostatic', 'activated'],
   ...     condition_obs_key='treatment',
   ...     obs_key='leiden',
   ... )
   >>> mscorer.build_embedding(ordering_metric='pseudotime')
   >>> mscorer.fit()
   >>> results = mscorer.score_all_conditions()
   >>> omnibus = mscorer.compare_omnibus(results)
   >>> posthoc = mscorer.compare_posthoc(results, omnibus_results=omnibus)
   >>> deltas = mscorer.compute_pairwise_deltas()
   >>> shift = mscorer.trajectory_shift(results)


Classes
-------

.. autoapisummary::

   scCS.multicomparison.MultiScorer


Module Contents
---------------

.. py:class:: MultiScorer(adata, root: str, branches: List[str], condition_obs_key: str, obs_key: str = 'leiden', n_angle_bins: int = 36, sector_method: Literal['centroid', 'equal'] = 'centroid', copy: bool = False)

   RNA velocity commitment scorer for experiments with 3+ conditions.

   Builds a SHARED star embedding on the pooled data from all conditions,
   then scores each condition separately.  This ensures arm geometry is
   identical across conditions, making CS values directly comparable.

   Provides tiered statistical testing:

   - Tier 2: Omnibus tests (Kruskal-Wallis / ANOVA) followed by
     post-hoc pairwise comparisons (Dunn / Tukey / Conover).
   - Tier 3: Mixed-effects models with custom contrasts,
     trajectory shift analysis.

   :param adata: Full single-cell dataset containing all conditions.
   :type adata: AnnData
   :param root: Label of the progenitor/root cluster in adata.obs[obs_key].
   :type root: str
   :param branches: Labels of the k terminal fate clusters.
   :type branches: list of str
   :param condition_obs_key: Column in adata.obs with condition labels (e.g., 'treatment').
                             Must contain at least 3 unique values.
   :type condition_obs_key: str
   :param obs_key: Column in adata.obs with cluster labels.  Default: 'leiden'.
   :type obs_key: str
   :param n_angle_bins: Number of angular bins.  Default: 36.
   :type n_angle_bins: int
   :param sector_method: Sector definition strategy.
   :type sector_method: {'centroid', 'equal'}
   :param copy: Work on a copy of adata.
   :type copy: bool

   :raises ValueError: If condition_obs_key has fewer than 3 unique values.
       For 2 conditions, use PairScorer instead.

   .. rubric:: Examples

   >>> mscorer = MultiScorer(
   ...     adata,
   ...     root='17',
   ...     branches=['homeostatic', 'activated'],
   ...     condition_obs_key='treatment',
   ...     obs_key='leiden',
   ... )
   >>> mscorer.build_embedding(ordering_metric='pseudotime')
   >>> mscorer.fit()
   >>> results = mscorer.score_all_conditions()
   >>> omnibus = mscorer.compare_omnibus(results)
   >>> posthoc = mscorer.compare_posthoc(results, omnibus_results=omnibus)


   .. py:attribute:: adata


   .. py:attribute:: root
      :value: ''


   .. py:attribute:: branches


   .. py:attribute:: condition_obs_key


   .. py:attribute:: obs_key
      :value: 'leiden'


   .. py:attribute:: n_angle_bins
      :value: 36


   .. py:attribute:: sector_method
      :value: 'centroid'


   .. py:attribute:: conditions


   .. py:method:: build_embedding(ordering_metric: Union[str, numpy.ndarray] = 'pseudotime', invert_ordering: bool = False, scale_ordering: bool = False, arm_scale: float = 10.0, jitter: float = 0.3, seed: int = 42, arm_norm: str = 'global', verbose: bool = True) -> MultiScorer

      Build the shared star embedding on pooled data from all conditions.

      The embedding is built on ALL cells (all conditions pooled), ensuring
      that arm geometry is identical across conditions.

      :param ordering_metric: See SingleScorer.build_embedding().
      :type ordering_metric: str or np.ndarray
      :param invert_ordering:
      :type invert_ordering: bool
      :param scale_ordering:
      :type scale_ordering: bool
      :param arm_scale:
      :type arm_scale: float
      :param jitter:
      :type jitter: float
      :param seed:
      :type seed: int
      :param verbose:
      :type verbose: bool

      :returns: self
      :rtype: MultiScorer


   .. py:method:: refit_pseudotime(scale_01: bool = True, arm_scale: float = 10.0, jitter: float = 0.3, seed: int = 42, arm_norm: str = 'global', verbose: bool = True) -> MultiScorer

      Rebuild the shared embedding using subset-local pseudotime.

      See SingleScorer.refit_pseudotime().


   .. py:method:: fit(verbose: bool = True) -> MultiScorer

      Fit the shared FateMap and project velocity.

      Must be called after build_embedding().

      :returns: self
      :rtype: MultiScorer


   .. py:method:: score_all_conditions(cell_level: bool = True, k_nn: Optional[int] = None, n_bootstrap: int = 0, bootstrap_ci: float = 0.95, verbose: bool = True) -> Dict[str, scCS.scores.CommitmentScoreResult]

      Compute commitment scores separately for each condition.

      Uses the shared embedding and FateMap.  Each condition's cells are
      masked from the shared adata_sub, so arm geometry is identical.
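
      The masking idea can be sketched with plain NumPy.  This is a toy
      illustration under stated assumptions, not the scCS scoring code: the
      arm angles, the nearest-arm rule, and the per-arm fractions are
      hypothetical stand-ins for the shared FateMap and the commitment score.

      .. code-block:: python

         import numpy as np

         rng = np.random.default_rng(0)

         # Hypothetical shared layout: three arms at fixed angles, defined once
         # on the pooled cells.
         arm_angles = np.array([0.0, 2 * np.pi / 3, 4 * np.pi / 3])

         n_cells = 300
         conditions = rng.choice(["control", "drug_A", "drug_B"], size=n_cells)
         # Toy per-cell velocity directions (radians) in the shared embedding.
         velocity_angle = rng.uniform(0, 2 * np.pi, size=n_cells)


         def nearest_arm(angles, arms):
             """Assign each angle to the arm with the smallest circular distance."""
             diff = np.abs(angles[:, None] - arms[None, :])
             circular = np.minimum(diff, 2 * np.pi - diff)
             return circular.argmin(axis=1)


         # Arm assignment is computed once on the pooled cells ...
         arm_of_cell = nearest_arm(velocity_angle, arm_angles)

         # ... and each condition is summarized through a boolean mask.
         per_condition = {}
         for cond in np.unique(conditions):
             mask = conditions == cond
             counts = np.bincount(arm_of_cell[mask], minlength=len(arm_angles))
             per_condition[str(cond)] = counts / counts.sum()

      Because every condition reads from the same ``arm_of_cell`` array, the
      per-condition fractions refer to the same arms and can be compared.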

      :param cell_level: Compute per-cell fate affinity scores.
      :type cell_level: bool
      :param k_nn: NN-smoothed entropy neighbors.
      :type k_nn: int, optional
      :param n_bootstrap: Bootstrap replicates for CI.  0 = disabled.
      :type n_bootstrap: int
      :param bootstrap_ci: CI level for bootstrap.
      :type bootstrap_ci: float
      :param verbose:
      :type verbose: bool

      :returns: Mapping from condition label to its CommitmentScoreResult.
      :rtype: dict


   .. py:method:: compare_omnibus(results: Dict[str, scCS.scores.CommitmentScoreResult], test: Literal['kruskal', 'anova'] = 'kruskal', pval_threshold: float = 0.05, verbose: bool = True) -> pandas.DataFrame

      Omnibus test across all conditions per fate.

      For each fate arm, tests whether per-cell affinity scores differ
      across ALL conditions simultaneously.

      - 'kruskal': Kruskal-Wallis H test (non-parametric, recommended default)
      - 'anova': One-way ANOVA (parametric, assumes normality)
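
      A minimal sketch of a per-fate omnibus test using SciPy follows.  The
      nested ``affinity`` dict and the Benjamini-Hochberg adjustment are
      assumptions made for illustration; the source does not state how
      ``pval_adj`` is computed here.

      .. code-block:: python

         import numpy as np
         from scipy import stats

         rng = np.random.default_rng(0)

         # Toy per-cell affinity scores: {fate: {condition: 1-D array}}
         # (hypothetical layout, not the CommitmentScoreResult structure).
         affinity = {
             "homeostatic": {
                 "control": rng.normal(0.50, 0.1, 200),
                 "drug_A": rng.normal(0.50, 0.1, 200),
                 "drug_B": rng.normal(0.50, 0.1, 200),
             },
             "activated": {
                 "control": rng.normal(0.30, 0.1, 200),
                 "drug_A": rng.normal(0.45, 0.1, 200),
                 "drug_B": rng.normal(0.60, 0.1, 200),
             },
         }


         def bh_adjust(pvals):
             """Benjamini-Hochberg adjusted p-values (assumed correction)."""
             p = np.asarray(pvals, dtype=float)
             order = np.argsort(p)
             scaled = p[order] * len(p) / np.arange(1, len(p) + 1)
             # Enforce monotonicity, starting from the largest p-value.
             adj_sorted = np.minimum.accumulate(scaled[::-1])[::-1]
             adj = np.empty_like(p)
             adj[order] = np.clip(adj_sorted, 0.0, 1.0)
             return adj


         def omnibus(affinity, test="kruskal"):
             """Return {fate: (statistic, pval, pval_adj)} across all conditions."""
             func = stats.kruskal if test == "kruskal" else stats.f_oneway
             fates = list(affinity)
             fits = [func(*affinity[f].values()) for f in fates]
             pvals = [fit.pvalue for fit in fits]
             adjusted = bh_adjust(pvals)
             return {
                 f: (fit.statistic, fit.pvalue, q)
                 for f, fit, q in zip(fates, fits, adjusted)
             }


         table = omnibus(affinity)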

      :param results: Output of score_all_conditions() with cell_level=True.
      :type results: dict
      :param test: Statistical test to use.  Default: 'kruskal'.
      :type test: {'kruskal', 'anova'}
      :param pval_threshold: Significance threshold for flagging.  Default 0.05.
      :type pval_threshold: float
      :param verbose:
      :type verbose: bool

      :returns: DataFrame with columns fate, test, statistic, pval, pval_adj,
                significant, n_conditions.
      :rtype: pd.DataFrame


   .. py:method:: compare_posthoc(results: Dict[str, scCS.scores.CommitmentScoreResult], omnibus_results: Optional[pandas.DataFrame] = None, method: Literal['dunn', 'tukey', 'conover'] = 'dunn', pval_correction: Literal['fdr', 'bonferroni', 'holm'] = 'fdr', pval_threshold: float = 0.05, verbose: bool = True) -> pandas.DataFrame

      Post-hoc pairwise comparisons across conditions per fate.

      Only meaningful after an omnibus test rejects H0.  If omnibus_results
      is provided, post-hoc tests are run only for fates where the omnibus
      pval_adj < pval_threshold.

      Methods:

      - 'dunn': Dunn's test with rank-based comparisons (non-parametric,
        recommended with Kruskal-Wallis).  Uses scikit-posthocs.
      - 'tukey': Tukey HSD (parametric, for balanced designs, with ANOVA).
      - 'conover': Conover-Iman test (more powerful than Dunn, non-parametric).
        Uses scikit-posthocs.

      Multiple testing correction is applied across all pairwise comparisons
      within each fate arm.
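
      The sketch below shows, for a single fate arm, gating on the omnibus
      result followed by a rank-based pairwise test with Holm correction.  It
      is a hand-written Dunn z-test (no tie correction) on toy data, used here
      instead of scikit-posthocs so the example depends only on NumPy and
      SciPy; the function names are hypothetical.

      .. code-block:: python

         from itertools import combinations

         import numpy as np
         from scipy import stats

         rng = np.random.default_rng(1)
         # Toy per-cell affinity scores for one fate arm, keyed by condition.
         groups = {
             "control": rng.normal(0.30, 0.1, 150),
             "drug_A": rng.normal(0.32, 0.1, 150),
             "drug_B": rng.normal(0.60, 0.1, 150),
         }


         def dunn_pairwise(groups):
             """Dunn's z-test on pooled ranks (no tie correction)."""
             names = list(groups)
             sizes = [len(groups[n]) for n in names]
             ranks = stats.rankdata(np.concatenate([groups[n] for n in names]))
             n_total = len(ranks)
             bounds = np.cumsum([0] + sizes)
             mean_rank = {
                 n: ranks[bounds[i]:bounds[i + 1]].mean()
                 for i, n in enumerate(names)
             }
             size = dict(zip(names, sizes))
             rows = []
             for a, b in combinations(names, 2):
                 se = np.sqrt(
                     n_total * (n_total + 1) / 12.0 * (1.0 / size[a] + 1.0 / size[b])
                 )
                 z = (mean_rank[a] - mean_rank[b]) / se
                 rows.append((a, b, z, 2.0 * stats.norm.sf(abs(z))))
             return rows


         def holm_adjust(pvals):
             """Holm step-down adjusted p-values."""
             p = np.asarray(pvals, dtype=float)
             order = np.argsort(p)
             scaled = p[order] * (len(p) - np.arange(len(p)))
             adj = np.empty_like(p)
             adj[order] = np.clip(np.maximum.accumulate(scaled), 0.0, 1.0)
             return adj


         posthoc = []
         # Skip pairwise tests when the Kruskal-Wallis omnibus is not significant.
         if stats.kruskal(*groups.values()).pvalue < 0.05:
             rows = dunn_pairwise(groups)
             adjusted = holm_adjust([row[3] for row in rows])
             posthoc = [(a, b, z, p, q) for (a, b, z, p), q in zip(rows, adjusted)]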

         
      :param results: Output of score_all_conditions() with cell_level=True.
      :type results: dict
      :param omnibus_results: Output of compare_omnibus().  If provided, post-hoc is only run
                              for fates where omnibus pval_adj < pval_threshold.
      :type omnibus_results: pd.DataFrame, optional
      :param method: Post-hoc test method.  Default: 'dunn'.
      :type method: {'dunn', 'tukey', 'conover'}
      :param pval_correction: Multiple testing correction method.  Default: 'fdr'.
      :type pval_correction: {'fdr', 'bonferroni', 'holm'}
      :param pval_threshold: Significance threshold.  Default 0.05.
      :type pval_threshold: float
      :param verbose:
      :type verbose: bool

      :returns: DataFrame with columns fate, comparison, method, statistic,
                pval, pval_adj, significant, mean_A, mean_B, delta_mean.
      :rtype: pd.DataFrame


   .. py:method:: compute_pairwise_deltas(n_bootstrap: int = 500, ci: float = 0.95, seed: int = 42, verbose: bool = True) -> Dict[Tuple[str, str], Dict]

      Compute ΔCS for ALL condition pairs with bootstrap CI.

      Unlike PairScorer.compute_delta_CS() which takes two specific conditions,
      this computes delta for every pair in the condition set.
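
      Enumerating every pair and attaching a percentile bootstrap CI can be
      sketched as follows.  The mean of toy per-cell scores is used as a
      stand-in statistic; it is an assumption for illustration and not the
      scCS definition of CS, and the returned keys are hypothetical.

      .. code-block:: python

         from itertools import combinations

         import numpy as np

         rng = np.random.default_rng(0)
         # Toy per-cell scores per condition (hypothetical data).
         scores = {
             "control": rng.normal(0.30, 0.1, 150),
             "drug_A": rng.normal(0.45, 0.1, 150),
             "drug_B": rng.normal(0.60, 0.1, 150),
         }


         def bootstrap_delta(a, b, n_bootstrap=500, ci=0.95, seed=42):
             """Difference of means (a - b) with a percentile bootstrap CI."""
             boot_rng = np.random.default_rng(seed)
             deltas = np.empty(n_bootstrap)
             for i in range(n_bootstrap):
                 # Resample each condition with replacement, independently.
                 resampled_a = boot_rng.choice(a, size=len(a))
                 resampled_b = boot_rng.choice(b, size=len(b))
                 deltas[i] = resampled_a.mean() - resampled_b.mean()
             alpha = (1.0 - ci) / 2.0
             low, high = np.quantile(deltas, [alpha, 1.0 - alpha])
             return {
                 "delta": float(a.mean() - b.mean()),
                 "ci_low": float(low),
                 "ci_high": float(high),
             }


         # One entry per unordered condition pair, keyed by (cond_a, cond_b).
         pairwise = {
             (a, b): bootstrap_delta(scores[a], scores[b])
             for a, b in combinations(scores, 2)
         }

      With three conditions this yields three pairs; in general k conditions
      give k(k-1)/2 entries.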

      :param n_bootstrap: Number of bootstrap replicates.  Default 500.
      :type n_bootstrap: int
      :param ci: Confidence interval level.  Default 0.95.
      :type ci: float
      :param seed:
      :type seed: int
      :param verbose:
      :type verbose: bool

      :returns: Dict mapping (cond_a, cond_b) to a delta_result dict with the
                same structure as the PairScorer.compute_delta_CS() output.
      :rtype: dict


   .. py:method:: fit_mixed_model(results: Dict[str, scCS.scores.CommitmentScoreResult], replicate_key: Optional[str] = None, ref_condition: Optional[str] = None, verbose: bool = True) -> pandas.DataFrame

      Linear mixed-effects model on per-cell fate affinity scores.

      Models per-cell fate affinity as a function of condition (fixed effect)
      with optional sample/replicate as a random effect.

      Model (per fate j)::

          affinity_ij ~ condition_i + (1 | sample_id_i)

      Uses statsmodels MixedLM.
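
      A minimal sketch of such a model with the statsmodels formula API, for
      one fate arm on simulated data.  The column names, effect sizes, and
      replicate structure are assumptions for illustration and do not reflect
      how scCS assembles its table.

      .. code-block:: python

         import numpy as np
         import pandas as pd
         import statsmodels.formula.api as smf

         rng = np.random.default_rng(0)
         effects = {"control": 0.0, "drug_A": 0.2, "drug_B": 0.5}

         rows = []
         for cond, shift in effects.items():
             for rep in range(3):
                 # Each replicate gets its own random offset (the random effect).
                 sample_offset = rng.normal(0.0, 0.05)
                 cells = 0.3 + shift + sample_offset + rng.normal(0.0, 0.1, 40)
                 for value in cells:
                     rows.append(
                         {
                             "affinity": value,
                             "condition": cond,
                             "sample_id": f"{cond}_rep{rep}",
                         }
                     )
         df = pd.DataFrame(rows)

         # Condition as fixed effect (reference: control), sample as random intercept.
         model = smf.mixedlm(
             "affinity ~ C(condition, Treatment(reference='control'))",
             data=df,
             groups=df["sample_id"],
         )
         fit = model.fit()

         # Keep only the condition coefficients, keyed by condition label.
         coef = {
             name.split("[T.")[1].rstrip("]"): float(value)
             for name, value in fit.fe_params.items()
             if "[T." in name
         }

      Treating the replicate as a random intercept keeps cells from the same
      sample from being counted as independent observations.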

      :param results: Output of score_all_conditions(cell_level=True).
      :type results: dict
      :param replicate_key: Column in adata_sub.obs with sample/replicate IDs.
      :type replicate_key: str, optional
      :param ref_condition: Reference condition for the fixed effect.
      :type ref_condition: str, optional
      :param verbose:
      :type verbose: bool

      :returns: DataFrame with columns fate, condition, coef, std_err, z_score,
                pval, pval_adj, ci_low, ci_high, significant.
      :rtype: pd.DataFrame


   .. py:method:: fit_mixed_model_contrasts(results: Dict[str, scCS.scores.CommitmentScoreResult], contrasts: Optional[List[Tuple[str, str]]] = None, replicate_key: Optional[str] = None, ref_condition: Optional[str] = None, pval_threshold: float = 0.05, verbose: bool = True) -> pandas.DataFrame

      Linear mixed-effects model with custom condition contrasts.

      Extends fit_mixed_model() to test specific condition comparisons
      within the LMM framework (more powerful than separate models).

      If contrasts is None, tests each condition vs ref_condition.
      If contrasts is provided, tests each specified pair, e.g.::

          [('drug_A', 'control'), ('drug_B', 'control'), ('drug_A', 'drug_B')]

      Uses statsmodels MixedLM with Wald tests on contrast coefficients.

      :param results: Output of score_all_conditions(cell_level=True).
      :type results: dict
      :param contrasts: Pairs of conditions to compare.  If None, all conditions vs
                        ref_condition are tested.
      :type contrasts: list of (str, str), optional
      :param replicate_key: Column in adata_sub.obs with sample/replicate IDs.
      :type replicate_key: str, optional
      :param ref_condition: Reference condition.  Required when contrasts is None.
      :type ref_condition: str, optional
      :param pval_threshold: Significance threshold.  Default 0.05.
      :type pval_threshold: float
      :param verbose:
      :type verbose: bool

      :returns: DataFrame with columns fate, contrast, coef, std_err, z_score,
                pval, pval_adj, significant.
      :rtype: pd.DataFrame


   .. py:method:: trajectory_shift(results: Dict[str, scCS.scores.CommitmentScoreResult], pseudotime_key: str = 'sccs_pseudotime', n_bootstrap: int = 500, seed: int = 42, verbose: bool = True) -> pandas.DataFrame

      Test whether pseudotime distributions differ across conditions per fate arm.

      For each fate arm and each pair of conditions, computes:

      - Kolmogorov-Smirnov (KS) statistic and p-value
      - Wasserstein distance (Earth Mover's Distance)
      - Bootstrap CI on the Wasserstein distance
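
      For one fate arm and one pair of conditions, the three quantities can be
      sketched with SciPy as below.  The pseudotime arrays are simulated, the
      helper name and returned keys are hypothetical, and the 95% percentile
      interval is an assumption, since the CI level is not stated here.

      .. code-block:: python

         import numpy as np
         from scipy import stats

         rng = np.random.default_rng(0)
         # Toy pseudotime values in [0, 1] for two conditions on the same arm.
         pt_control = rng.beta(2, 5, 300)
         pt_treated = rng.beta(5, 2, 300)


         def shift_stats(pt_a, pt_b, n_bootstrap=500, ci=0.95, seed=42):
             """KS test, Wasserstein distance, and a bootstrap CI on the distance."""
             ks = stats.ks_2samp(pt_a, pt_b)
             distance = stats.wasserstein_distance(pt_a, pt_b)

             boot_rng = np.random.default_rng(seed)
             boot = np.empty(n_bootstrap)
             for i in range(n_bootstrap):
                 # Resample both conditions with replacement and recompute.
                 boot[i] = stats.wasserstein_distance(
                     boot_rng.choice(pt_a, size=len(pt_a)),
                     boot_rng.choice(pt_b, size=len(pt_b)),
                 )
             alpha = (1.0 - ci) / 2.0
             low, high = np.quantile(boot, [alpha, 1.0 - alpha])
             return {
                 "ks_stat": float(ks.statistic),
                 "ks_pval": float(ks.pvalue),
                 "wasserstein": float(distance),
                 "wasserstein_ci_low": float(low),
                 "wasserstein_ci_high": float(high),
                 "delta_mean_pt": float(pt_a.mean() - pt_b.mean()),
             }


         shift = shift_stats(pt_control, pt_treated)

      The KS test reports whether the two distributions differ at all, while
      the Wasserstein distance gives the size of the shift in pseudotime units.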

      :param results: Output of score_all_conditions().
      :type results: dict
      :param pseudotime_key: Column in adata_sub.obs with pseudotime values.
      :type pseudotime_key: str
      :param n_bootstrap: Bootstrap replicates for Wasserstein CI.  Default 500.
      :type n_bootstrap: int
      :param seed:
      :type seed: int
      :param verbose:
      :type verbose: bool

      :returns: DataFrame with columns fate, comparison, ks_stat, ks_pval,
                wasserstein, wasserstein_ci_low, wasserstein_ci_high,
                mean_pt_A, mean_pt_B, delta_mean_pt, significant.
      :rtype: pd.DataFrame


   .. py:method:: plot_trajectory_shift(shift_df: pandas.DataFrame, pseudotime_key: str = 'sccs_pseudotime', color_map: Optional[Dict[str, str]] = None, figsize: Optional[Tuple[float, float]] = None, title: Optional[str] = None, save_path: Optional[str] = None) -> matplotlib.figure.Figure

      Visualize pseudotime distributions per condition per fate arm.

      Produces a grid of KDE plots: one row per fate arm, one column per
      pairwise comparison.  Overlaid KDEs show how pseudotime distributions
      shift between conditions.

      :param shift_df: Output of trajectory_shift().
      :type shift_df: pd.DataFrame
      :param pseudotime_key:
      :type pseudotime_key: str
      :param color_map:
      :type color_map: dict, optional
      :param figsize:
      :type figsize: tuple, optional
      :param title:
      :type title: str, optional
      :param save_path:
      :type save_path: str, optional

      :returns: **fig**
      :rtype: matplotlib Figure


   .. py:method:: plot_omnibus_summary(omnibus_df: pandas.DataFrame, results: Dict[str, scCS.scores.CommitmentScoreResult], posthoc_df: Optional[pandas.DataFrame] = None, figsize: Optional[Tuple[float, float]] = None, save_path: Optional[str] = None, vmin: Optional[float] = None, vmax: Optional[float] = None) -> matplotlib.figure.Figure

      Summary heatmap: fates × conditions showing omnibus significance.

      Left panel: heatmap of mean per-cell affinity per fate per condition,
      annotated with omnibus p-value stars.
      Right panel (if posthoc provided): significant pairwise comparisons.
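
      Deriving color limits from the finite entries of a fate × condition
      matrix can be sketched as follows.  The helper name is hypothetical,
      and filling in only the limit that is missing is an assumption about
      the case where just one of ``vmin`` / ``vmax`` is given.

      .. code-block:: python

         import numpy as np

         # Toy mean-affinity matrix (fates x conditions) with non-finite entries.
         mean_affinity = np.array(
             [
                 [0.31, 0.44, np.nan],
                 [0.52, np.inf, 0.27],
             ]
         )


         def resolve_limits(matrix, vmin=None, vmax=None):
             """Fill missing color limits from the finite values of the matrix."""
             finite = matrix[np.isfinite(matrix)]
             if vmin is None:
                 vmin = float(finite.min())
             if vmax is None:
                 vmax = float(finite.max())
             return vmin, vmax


         auto_limits = resolve_limits(mean_affinity)
         pinned_limits = resolve_limits(mean_affinity, vmin=0.0, vmax=1.0)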

      :param omnibus_df: Output of compare_omnibus().
      :type omnibus_df: pd.DataFrame
      :param results: Output of score_all_conditions().
      :type results: dict
      :param posthoc_df: Output of compare_posthoc().
      :type posthoc_df: pd.DataFrame, optional
      :param figsize:
      :type figsize: tuple, optional
      :param save_path:
      :type save_path: str, optional
      :param vmin: Lower color limit for the mean-affinity heatmap.  If both
                   ``vmin`` and ``vmax`` are ``None`` (default), the limits are
                   derived from the finite values of the affinity matrix so the
                   colormap spans the actual data range.  Set explicitly to pin
                   a fixed scale across figures.
      :type vmin: float, optional
      :param vmax: Upper color limit for the mean-affinity heatmap.  See ``vmin``.
      :type vmax: float, optional

      :returns: **fig**
      :rtype: matplotlib Figure


   .. py:method:: plot_posthoc_heatmap(posthoc_df: pandas.DataFrame, fate: Optional[str] = None, figsize: Optional[Tuple[float, float]] = None, save_path: Optional[str] = None) -> matplotlib.figure.Figure

      Condition × condition heatmap of post-hoc p-values for a given fate.

      Lower triangle: p-values.  Upper triangle: delta mean affinity.
      Annotated with significance stars.
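
      The triangular layout can be sketched by filling a square matrix from a
      long-form table.  The ``cond_a`` / ``cond_b`` columns and the values are
      hypothetical; the actual ``comparison`` column format of
      compare_posthoc() is not specified here.

      .. code-block:: python

         import numpy as np
         import pandas as pd

         # Toy post-hoc rows for one fate (hypothetical columns and values).
         posthoc = pd.DataFrame(
             {
                 "cond_a": ["control", "control", "drug_A"],
                 "cond_b": ["drug_A", "drug_B", "drug_B"],
                 "pval_adj": [0.40, 0.001, 0.02],
                 "delta_mean": [-0.02, -0.30, -0.28],
             }
         )

         conditions = ["control", "drug_A", "drug_B"]
         index = {cond: i for i, cond in enumerate(conditions)}

         # Diagonal stays NaN; each pair fills one cell per triangle.
         matrix = np.full((len(conditions), len(conditions)), np.nan)
         for row in posthoc.itertuples(index=False):
             # Assumes cond_a precedes cond_b in `conditions`, so i < j.
             i, j = index[row.cond_a], index[row.cond_b]
             matrix[j, i] = row.pval_adj    # lower triangle: adjusted p-value
             matrix[i, j] = row.delta_mean  # upper triangle: delta mean affinity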

      :param posthoc_df: Output of compare_posthoc().
      :type posthoc_df: pd.DataFrame
      :param fate: Which fate to plot.  If None, uses the first fate with
                   significant results.
      :type fate: str, optional
      :param figsize:
      :type figsize: tuple, optional
      :param save_path:
      :type save_path: str, optional

      :returns: **fig**
      :rtype: matplotlib Figure


   .. py:method:: plot_pairwise_delta_grid(delta_results: Dict[Tuple[str, str], Dict], figsize_per_panel: Tuple[float, float] = (4, 4), cmap: str = 'RdBu_r', save_path: Optional[str] = None) -> matplotlib.figure.Figure

      Grid of ΔCS heatmaps for all condition pairs.

      Each panel shows ΔnCS = nCS_A − nCS_B for one condition pair, with
      bootstrap CI half-width annotated below each entry. Inherits the same
      layout as :func:`scCS.plot.plot_delta_cs_heatmap` but renders all pairs
      on a single shared figure.

      :param delta_results: Output of compute_pairwise_deltas().
      :type delta_results: dict
      :param figsize_per_panel:
      :type figsize_per_panel: tuple
      :param cmap: Diverging colormap. Default 'RdBu_r'.
      :type cmap: str
      :param save_path:
      :type save_path: str, optional

      :returns: **fig**
      :rtype: matplotlib Figure


   .. py:method:: transfer_labels(results: Dict[str, scCS.scores.CommitmentScoreResult], prefix: str = 'cs_') -> None

      Transfer per-cell commitment scores to the full adata for all conditions.

      Calls SingleScorer.transfer_labels() for each condition's result,
      writing condition-specific columns to adata.obs.

      :param results: Output of score_all_conditions(cell_level=True).
      :type results: dict
      :param prefix: Column prefix.  Default: 'cs_'.
      :type prefix: str


   .. py:method:: plot_star(result: scCS.scores.CommitmentScoreResult, **kwargs)

      Radial star embedding plot.


   .. py:method:: plot_star_grid(results: Dict[str, scCS.scores.CommitmentScoreResult], color_map: Optional[Dict[str, str]] = None, figsize_per_panel: Tuple[float, float] = (6, 6), save_path: Optional[str] = None) -> matplotlib.figure.Figure

      Side-by-side star embedding plots, one per condition.


   .. py:method:: plot_rose_grid(results: Dict[str, scCS.scores.CommitmentScoreResult], color_map: Optional[Dict[str, str]] = None, figsize_per_panel: Tuple[float, float] = (5, 5), title: Optional[str] = None, save_path: Optional[str] = None) -> matplotlib.figure.Figure

      Grid of polar rose plots — one per condition.


   .. py:method:: plot_affinity_distributions(results: Dict[str, scCS.scores.CommitmentScoreResult], plot_type: Literal['violin', 'box', 'strip'] = 'violin', color_map: Optional[Dict[str, str]] = None, figsize: Optional[Tuple[float, float]] = None, title: Optional[str] = None, save_path: Optional[str] = None) -> matplotlib.figure.Figure

      Violin, box, or strip plots of per-cell fate affinity scores by condition.


   .. py:method:: plot_delta_cs_heatmap(delta_result: dict, **kwargs) -> matplotlib.figure.Figure

      Heatmap of ΔCS = nCS_A − nCS_B with CI annotation.


   .. py:method:: plot_compare_conditions_bar(results: Dict[str, scCS.scores.CommitmentScoreResult], **kwargs) -> matplotlib.figure.Figure

      Grouped bar chart of nCS per condition.


   .. py:method:: plot_commitment_vector_radar(results: Dict[str, scCS.scores.CommitmentScoreResult], **kwargs) -> matplotlib.figure.Figure

      Radar / spider chart of commitment vectors per condition.


   .. py:property:: scorer
      :type: Optional[scCS.single.SingleScorer]


      The internal SingleScorer used for embedding and scoring.


   .. py:property:: adata_sub

      The embedding subset (from the internal SingleScorer).


   .. py:property:: is_fitted
      :type: bool