LogisticModel#

Module: leaspy.models.logistic

Inherits from: RiemanianManifoldModel, LogisticInitializationMixin

The LogisticModel is the concrete implementation that defines the shape of disease progression as a logistic sigmoid curve. Building on the geometric framework of RiemanianManifoldModel, it specifies the actual equation that transforms the reparametrized time into biomarker values.

This is the most commonly used shape in Leaspy, suitable for biomarkers that follow an S-shaped trajectory from normality (0) to pathology (1).

The Mathematical Shape#

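The core contribution of this class is defining the shape function \(S(t)\), written \(y_k(t)\) per feature below, and the metric \(g(p)\), denoted \(G_k\) per feature below to avoid confusion with the parameter \(g_k\).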

  1. The Logistic Function:

    The model assumes that after time reparametrization and spatial mixing, the value of the \(k\)-th feature follows a logistic sigmoid curve. In the code (model_with_sources), this is implemented using torch.sigmoid:

    \[ y_k(t) = \text{sigmoid}(\text{logit}_k(t)) \]

    The “logit” is calculated using the Riemannian metric \(G_k\) and the parameter \(g_k\):

    \[ \text{logit}_k(t) = G_k \cdot (v_{0,k} \cdot \tilde{t}_k + \delta_k) - \ln(g_k) \]

  2. The Metric (\(G\)) vs Parameter (\(g\)):

    Be careful to distinguish between the parameter \(g\) (which controls the position) and the Riemannian metric \(G\) (which scales the speed). The model explicitly defines their relationship:

    \[ G_k = \frac{(1+g_k)^2}{g_k} \]

    In the code:

    • metric corresponds to the Riemannian metric \(G_k\).

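    • g corresponds to the shape parameter \(g_k\), which determines the value of the logistic curve at \(t=0\) (before time-shifts are applied): with a zero logit argument apart from \(-\ln(g_k)\), the curve equals \(\text{sigmoid}(-\ln g_k) = 1/(1+g_k)\).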
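
The relationship between \(g\), \(G\), and the curve can be checked numerically. The following is a minimal scalar sketch, not the Leaspy implementation (which operates on torch tensors): the function names `metric` and `logistic_value` and the numeric values are illustrative assumptions. It evaluates the formula above for one feature and confirms two consequences: the curve equals \(1/(1+g)\) at reparametrized time 0, and its slope there equals \(v_0\), which is the sense in which \(G\) scales the speed.

```python
import math

def metric(g: float) -> float:
    """Riemannian metric G = (1 + g)^2 / g for the logistic shape."""
    return (1.0 + g) ** 2 / g

def logistic_value(t_tilde: float, g: float, v0: float, delta: float = 0.0) -> float:
    """y = sigmoid(G * (v0 * t_tilde + delta) - ln(g)) for a single feature."""
    logit = metric(g) * (v0 * t_tilde + delta) - math.log(g)
    return 1.0 / (1.0 + math.exp(-logit))

# Illustrative values only, not Leaspy defaults.
g, v0 = 3.0, 0.05

# At reparametrized time 0 with no space shift, the curve sits at 1 / (1 + g).
y0 = logistic_value(0.0, g, v0)

# Central finite difference: the slope at that point is v0,
# because sigmoid'(-ln g) = g / (1 + g)^2 = 1 / G cancels the factor G.
h = 1e-5
slope0 = (logistic_value(h, g, v0) - logistic_value(-h, g, v0)) / (2 * h)

print(y0, slope0)
```

Larger \(g\) therefore places the curve lower at time 0, while \(v_0\) keeps its meaning as the velocity at that reference point regardless of \(g\).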

Responsibilities#

  • Equation Definition: Implements model_with_sources using the sigmoid formula.

  • Variable Specification: Adds the logistic-specific parameter g (which positions the curve by fixing its value at \(t=0\)) to the model’s variable dictionary.

Key Attributes & Parameters#

  • g / log_g: A population parameter specific to the logistic shape.

  • v0 / log_v0: Inherited from RiemanianManifoldModel, representing the initial velocity.
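
The paired names suggest that each parameter also exists in log form. As a hedged illustration, assuming `log_g` and `log_v0` are the natural logarithms of `g` and `v0` (an assumption, not confirmed by this page), working with the log form keeps both quantities strictly positive, which the formulas \(\ln(g_k)\) and \(G_k = (1+g_k)^2/g_k\) require. The numeric values below are hypothetical.

```python
import math

# Hypothetical unconstrained values, e.g. as an estimation algorithm might propose them.
log_g, log_v0 = -0.7, -3.2

# Assumption: the positive parameters are recovered by exponentiation.
g = math.exp(log_g)    # always > 0, so ln(g) and G are well defined
v0 = math.exp(log_v0)  # always > 0, so the logit increases with time

# G = g + 2 + 1/g, which is at least 4 for every g > 0 (minimum at g = 1).
G = (1.0 + g) ** 2 / g

print(g, v0, G)
```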

Key Methods#

  • model_with_sources(...): The engine room. It computes, for each patient \(i\), timepoint \(t\), and feature \(k\):

    1. Logit (w_model_logit): The logit is the input to the sigmoid: an unconstrained real number that is squashed to \((0,1)\) in the last step. The model works in logit space because its contributions (population time, individual space shifts, inflection offset) add linearly there, which no longer holds once the nonlinear sigmoid has been applied. The full expression is:

      \[\text{logit}_{i,t,k} = \underbrace{G_k \cdot v_{0,k} \cdot \tilde{t}_{i,t}}_{\text{population time}} + \underbrace{G_k \cdot \delta_{i,k}}_{\text{patient space shift}} - \underbrace{\ln(g_k)}_{\text{inflection offset}}\]

      In code: metric[pop_s] * (v0[pop_s] * rt + space_shifts[:, None, ...]) - log(g[pop_s])

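      Why \(G_k\) appears here: \(G_k = (1+g_k)^2/g_k\) is a population-level geometric normalizer, identical for all patients. It guarantees that any individual time reparametrisation (\(\tau_i\), \(\xi_i\)) or space shift (\(\delta_{i,k}\)) in logit space produces another valid logistic curve of the exact same shape. Concretely, the sigmoid's slope at the reference value \(1/(1+g_k)\) is \(g_k/(1+g_k)^2 = 1/G_k\), so multiplying by \(G_k\) makes \(v_{0,k}\) the actual velocity at that point and expresses \(\delta_{i,k}\) on the same scale. Without it, shifting the logit would distort the S-curve rather than purely translate it.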

      The w_ prefix in w_model_logit signals that it is a WeightedTensor: it carries an observation mask alongside the values to handle missing data. The values and mask are separated via WeightedTensor.get_filled_value_and_weight before the values are passed to torch.sigmoid (which cannot process NaNs directly).

    2. Sigmoid Activation: torch.sigmoid(model_logit) maps the logit to \((0, 1)\), producing the final biomarker estimate. The observation mask is then re-applied via WeightedTensor(...).weighted_value so that missing observations remain masked throughout the negative log-likelihood (NLL) computation.
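
The two steps above can be sketched in plain Python. This is a simplified stand-in, not Leaspy's code: the function `model_values`, its nested-list inputs, and the boolean `observed` mask are hypothetical, and the re-masking is approximated by multiplying each value by a 0/1 weight, which is an assumption about what the weighted value amounts to. The real implementation uses torch tensors and WeightedTensor.

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def model_values(rt, space_shifts, g, v0, observed):
    """Hypothetical plain-Python version of the logit and sigmoid steps.

    rt[i][t]           : reparametrized time of patient i at visit t
    space_shifts[i][k] : space shift of patient i for feature k
    g[k], v0[k]        : population parameters per feature
    observed[i][t][k]  : True where a measurement exists (the mask)
    Returns (values, weights); masked entries have value 0.0 and weight 0.
    """
    metric = [(1.0 + gk) ** 2 / gk for gk in g]
    values, weights = [], []
    for i, visits in enumerate(rt):
        patient_v, patient_w = [], []
        for t, rt_it in enumerate(visits):
            row_v, row_w = [], []
            for k in range(len(g)):
                # Step 1: logit, linear in time and space shift.
                logit = metric[k] * (v0[k] * rt_it + space_shifts[i][k]) - math.log(g[k])
                # Step 2: sigmoid, then re-apply the observation mask.
                w = 1 if observed[i][t][k] else 0
                row_v.append(w * sigmoid(logit))
                row_w.append(w)
            patient_v.append(row_v)
            patient_w.append(row_w)
        values.append(patient_v)
        weights.append(patient_w)
    return values, weights

# One patient, three visits, two features; the second feature is missing at visit 1.
rt = [[-2.0, 0.0, 3.0]]
space_shifts = [[0.0, 0.1]]
g, v0 = [1.0, 3.0], [0.1, 0.05]
observed = [[[True, True], [True, False], [True, True]]]

values, weights = model_values(rt, space_shifts, g, v0, observed)
print(values[0])
```

Because \(G_k\), \(v_{0,k}\), and \(g_k\) depend only on the feature index, they are computed once and reused for every patient and visit; in tensor code the same effect is obtained by broadcasting, as in the `space_shifts[:, None, ...]` indexing quoted above.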

Initialization Logic (Mixin)#

Because non-linear models are sensitive to starting values, this class inherits its initialization logic from LogisticInitializationMixin. This separation keeps the model definition free of the heuristic estimation code.

See LogisticInitializationMixin for details on how we estimate initial g, v0, and tau from raw data before the main algorithm runs.

Next Steps#

This concludes the logistic model definition hierarchy.

  • To understand how this model connects to noisy data, you might look at Observation Models.

  • To see how parameters are estimated, look at the Algorithms (e.g., McmcSaem).