A Scikit-Learn-like Python library for Longitudinal Machine Learning — Paper · Documentation
Scikit-longitudinal (Sklong) is a machine learning library tailored for supervised longitudinal machine learning (currently focused on classification tasks). It offers tools and models for processing, analysing, and predicting longitudinal data, with a user-friendly interface that integrates with the Scikit-learn ecosystem.
Wait, what is Longitudinal Data — In layman's terms?
Longitudinal data is a "time-lapse" record of the same subject, entity, or group tracked over successive time periods, similar to checking in on patients to see how they change. For instance, doctors may monitor a patient's blood pressure, weight, and cholesterol every year for a decade to identify health trends or risk factors. Such data is often more useful for predicting future outcomes than a one-time (cross-sectional) survey because it captures evolution, patterns, and potential cause-and-effect relationships over time.
See more in the documentation.
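To make the idea concrete, here is a minimal sketch in plain pandas (not the Sklong API) of what such data can look like in a "wide" table, with one row per patient and one column per feature per visit. The column names, the `_w<N>` wave suffix, the `waves_of` helper, and all values are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical toy data: three patients, two features, three yearly
# visits ("waves"). All values are made up for illustration.
records = pd.DataFrame(
    {
        "patient_id": [1, 2, 3],
        "blood_pressure_w1": [120, 135, 118],
        "blood_pressure_w2": [124, 139, 117],
        "blood_pressure_w3": [131, 150, 119],
        "cholesterol_w1": [190, 210, 175],
        "cholesterol_w2": [195, 220, 174],
        "cholesterol_w3": [205, 236, 176],
    }
).set_index("patient_id")


def waves_of(feature: str) -> list[str]:
    """Return the columns of one feature, ordered from oldest to newest wave."""
    columns = [c for c in records.columns if c.rsplit("_w", 1)[0] == feature]
    return sorted(columns, key=lambda c: int(c.rsplit("_w", 1)[1]))


bp_waves = waves_of("blood_pressure")

# A cross-sectional view keeps only the most recent visit.
latest_bp = records[bp_waves[-1]]

# The longitudinal view also exposes how each patient changed over time.
bp_change = records[bp_waves[-1]] - records[bp_waves[0]]
print(bp_change)
```

In this toy table, the latest reading alone does not show which patients are changing; the first-to-last difference makes the rising blood pressure of one patient, and the near-flat readings of another, directly visible.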
To install Scikit-longitudinal:

    pip install Scikit-longitudinal

To install a specific version:

    pip install Scikit-longitudinal==0.1.0

Tip: Want to use Jupyter Notebook/Lab or Google Colab, or to activate parallelism? Head to the Getting Started section of the documentation, where we explain it all! 🎉
Let's run a simple Longitudinal machine learning classification task:

    from sklearn.metrics import classification_report

    from scikit_longitudinal.data_preparation import LongitudinalDataset
    from scikit_longitudinal.estimators.ensemble.lexicographical.lexico_gradient_boosting import LexicoGradientBoostingClassifier

    dataset = LongitudinalDataset('./stroke.csv')  # Note, this is a fictional dataset. Use yours!
    dataset.load_data_target_train_test_split(
        target_column="class_stroke_wave_4",
    )

    # Pre-set or manually set your temporal dependencies
    dataset.setup_features_group(input_data="elsa")

    model = LexicoGradientBoostingClassifier(
        features_group=dataset.feature_groups(),
        threshold_gain=0.00015  # Refer to the API for more hyper-parameters and their meaning
    )

    model.fit(dataset.X_train, dataset.y_train)
    y_pred = model.predict(dataset.X_test)

    # Classification report, computed against the held-out test labels
    print(classification_report(dataset.y_test, y_pred))

If you use Sklong in your research, please cite our paper:
We would like to personally thank Prof. Lengerich (UW Madison—@blengerich & @AdaptInfer), & Prof. Tahiri (Université de Sherbrooke—@TahiriNadia & @tahiri-lab) for their amazing peer reviews!

    @article{Provost2025,
      doi = {10.21105/joss.08481},
      url = {https://doi.org/10.21105/joss.08481},
      year = {2025},
      publisher = {The Open Journal},
      volume = {10},
      number = {112},
      pages = {8481},
      author = {Provost, Simon and Freitas, Alex A.},
      title = {Scikit-Longitudinal: A Machine Learning Library for Longitudinal Classification in Python},
      journal = {Journal of Open Source Software}
    }

Scikit-longitudinal is licensed under the MIT License.