Quickstart with Leaspy

This example demonstrates how to quickly use Leaspy with properly formatted data.

Leaspy uses its own data container. To use it correctly, you need to provide either a CSV file or a pandas.DataFrame in long format.

The synthetic longitudinal dataset below illustrates the expected format:

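from leaspy.datasets import load_dataset

# Load the synthetic Alzheimer's dataset bundled with Leaspy.
alzheimer_df = load_dataset("alzheimer")
print(alzheimer_df.columns)
# Keep four of the seven available features.
alzheimer_df = alzheimer_df[["MMSE", "RAVLT", "FAQ", "FDG PET"]]
print(alzheimer_df.head())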
Index(['E-Cog Subject', 'E-Cog Study-partner', 'MMSE', 'RAVLT', 'FAQ',
       'FDG PET', 'Hippocampus volume ratio'],
      dtype='object')
                      MMSE     RAVLT       FAQ   FDG PET
ID     TIME
GS-001 73.973183  0.111998  0.510524  0.178827  0.454605
       74.573181  0.029991  0.749223  0.181327  0.450064
       75.173180  0.121922  0.779680  0.026179  0.662006
       75.773186  0.092102  0.649391  0.156153  0.585949
       75.973183  0.203874  0.612311  0.320484  0.634809

The data correspond to repeated visits (TIME index) of different participants (ID index). Four outcomes are measured at each visit: MMSE, RAVLT, FAQ and FDG PET.

Warning: You **MUST** include both `ID` and `TIME`, either as indices or as columns. The remaining columns should correspond to the observed variables (also called features or endpoints). Each feature should have its own column, and each visit should occupy one row.
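As a minimal sketch of this layout, the snippet below builds a small long-format table with pandas. The participant IDs, visit times, and feature names (`feature_a`, `feature_b`) are hypothetical and chosen only for illustration; the checks are plain pandas and are not part of Leaspy.

```python
import pandas as pd

# Hypothetical toy data: two participants, two features, one row per visit.
raw = pd.DataFrame(
    {
        "ID": ["P-01", "P-01", "P-01", "P-02", "P-02"],
        "TIME": [70.2, 71.0, 72.1, 65.5, 66.4],  # hypothetical visit times
        "feature_a": [0.10, 0.15, 0.22, 0.30, 0.41],
        "feature_b": [0.05, 0.05, 0.12, 0.20, 0.26],
    }
)

# Sanity checks before handing the table to Leaspy:
# both required columns are present, and no visit appears twice.
missing = {"ID", "TIME"} - set(raw.columns)
assert not missing, f"missing required columns: {missing}"
assert not raw.duplicated(subset=["ID", "TIME"]).any(), "duplicate visits"

# Optional: move ID and TIME into a sorted MultiIndex,
# mirroring the layout of the example dataset printed above.
long_df = raw.set_index(["ID", "TIME"]).sort_index()
print(long_df)
```

Either form (`raw` with `ID` and `TIME` as columns, or `long_df` with them as indices) satisfies the requirement stated above.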

Warning:

- Leaspy supports *linear* and *logistic* models.
- The features **MUST** be increasing over time.
- For logistic models, data must be rescaled between 0 and 1.
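One way to meet the last two constraints is to rescale each score by its theoretical range and to flip scores that decrease as the condition worsens. The sketch below uses plain pandas; the helper `rescale`, the column names, and the 0 to 30 score range are hypothetical, not part of Leaspy.

```python
import pandas as pd

# Hypothetical raw scores on a 0-30 scale:
# "score_up" increases over time, "score_down" decreases over time.
raw = pd.DataFrame(
    {
        "ID": ["P-01", "P-01", "P-02", "P-02"],
        "TIME": [70.0, 71.0, 65.0, 66.0],
        "score_up": [3.0, 9.0, 12.0, 15.0],
        "score_down": [30.0, 27.0, 24.0, 18.0],
    }
).set_index(["ID", "TIME"])


def rescale(series, lowest, highest, increasing=True):
    """Map a score from [lowest, highest] to [0, 1]; flip it if it decreases over time."""
    scaled = (series - lowest) / (highest - lowest)
    return scaled if increasing else 1.0 - scaled


scaled = pd.DataFrame(
    {
        "score_up": rescale(raw["score_up"], 0.0, 30.0),
        "score_down": rescale(raw["score_down"], 0.0, 30.0, increasing=False),
    }
)

# Every value now lies in [0, 1] ...
assert ((scaled >= 0) & (scaled <= 1)).all().all()
# ... and every feature is non-decreasing within each participant.
assert (scaled.groupby(level="ID").diff().dropna() >= 0).all().all()
print(scaled)
```

Dividing by the theoretical range of the score, rather than the minimum and maximum observed in the sample, keeps the rescaling independent of which participants happen to be in the dataset.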

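from leaspy.io.data import Data, Dataset

# Wrap the long-format DataFrame in Leaspy's own data container.
data = Data.from_dataframe(alzheimer_df)
# Build the Dataset object that is passed to the algorithms below.
dataset = Dataset(data)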

The core functionality of Leaspy is to estimate the group-average trajectory of the variables measured in a population. To do this, you need to choose a model. For example, a logistic model can be initialized and fitted as follows:

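from leaspy.models import LogisticModel

# source_dimension=2 matches the two `sources_*` columns in the personalization output below.
model = LogisticModel(name="test-model", source_dimension=2)
model.fit(
    dataset,
    "mcmc_saem",  # estimation algorithm
    seed=42,  # fixed seed for reproducibility
    n_iter=100,  # number of iterations
    progress_bar=False,
)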
 ==> Setting seed to 42

Fit with `AlgorithmName.FIT_MCMC_SAEM` took: 3s

Leaspy can also estimate the individual trajectories of each participant. This is done using a personalization algorithm, here scipy_minimize:

individual_parameters = model.personalize(
    dataset, "scipy_minimize", seed=0, progress_bar=False
)
print(individual_parameters.to_dataframe())
 ==> Setting seed to 0
/home/docs/checkouts/readthedocs.org/user_builds/leaspy/checkouts/476/src/leaspy/algo/personalize/scipy_minimize.py:632: UserWarning: In `scipy_minimize` you requested `use_jacobian=True` but it is not implemented in your model test-model. Falling back to `use_jacobian=False`...
  warnings.warn(

Personalize with `AlgorithmName.PERSONALIZE_SCIPY_MINIMIZE` took: 36s
        sources_0  sources_1        tau        xi
ID
GS-001   0.519938   0.350398  78.325272 -0.347083
GS-002  -0.727816  -0.153288  77.347145 -0.584833
GS-003  -0.232581  -0.895145  77.246941  0.065604
GS-004   0.139597  -0.115736  78.953514  0.428237
GS-005   0.236418  -1.880063  85.567032 -0.010424
...           ...        ...        ...       ...
GS-196   0.479222  -1.056817  73.665787  0.314039
GS-197   0.532045   1.018136  81.426926 -0.557547
GS-198  -0.120547  -0.098079  84.575027  0.161426
GS-199  -0.014911  -2.901282  94.287285 -0.155679
GS-200   0.926549  -0.820352  77.081177  0.781429

[200 rows x 4 columns]
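The table above can be explored with ordinary pandas operations. The sketch below copies the `tau` and `xi` values of the first five rows and derives a per-participant speed factor. It assumes the usual reading of these parameters, which this page does not spell out: `tau` as an individual time shift and `xi` as a log-acceleration, so that `exp(xi)` is a progression speed relative to the group average. The column name `speed_factor` is hypothetical.

```python
import math

import pandas as pd

# First five rows of the personalization output above (tau and xi only).
ip = pd.DataFrame(
    {
        "tau": [78.325272, 77.347145, 77.246941, 78.953514, 85.567032],
        "xi": [-0.347083, -0.584833, 0.065604, 0.428237, -0.010424],
    },
    index=pd.Index(["GS-001", "GS-002", "GS-003", "GS-004", "GS-005"], name="ID"),
)

# Assumption: xi is a log-acceleration, so exp(xi) > 1 means faster than
# the average progression and exp(xi) < 1 means slower.
ip["speed_factor"] = ip["xi"].apply(math.exp)

fastest = ip["speed_factor"].idxmax()  # participant with the largest xi
earliest = ip["tau"].idxmin()  # participant with the smallest time shift

print(ip)
print("fastest progression:", fastest)
print("earliest time shift:", earliest)
```

With the real output, the same operations can be applied directly to `individual_parameters.to_dataframe()`.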

To go further:

  1. See the [User Guide](../user_guide.md) and full API documentation.

  2. Explore additional [examples](./index.rst).

Total running time of the script: (0 minutes 48.930 seconds)
