Pollybs/dataiku_ML_heart_attack_prediction

Dataiku Machine Learning: Heart Failure Prediction

Project Objective: Build a machine learning model in Dataiku DSS (Dataiku Data Science Studio) that predicts heart failure.
The model was trained and tested in Python on the Heart Failure Prediction Dataset.

Dataiku DSS is a big data and predictive analytics platform developed by the French software company Dataiku. It offers built-in capabilities to evaluate, deploy, and monitor machine learning models.

Project Steps:

Using Python notebooks and Dataiku's machine learning experiment tracking capabilities, I went through the following steps:

1 - Configuration of the Dataiku DSS environment and project

2 - Data preparation and EDA

  • Checking the distribution of the target variable
  • Transformation of categorical variables into dummies
  • Scaling of continuous variables
  • Splitting the dataset (train/test)
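
The preparation steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook used in the project: the DataFrame is synthetic, and the column names (`Age`, `Sex`, `ChestPainType`, `MaxHR`, `Oldpeak`, `HeartDisease`) are assumptions modeled on the Kaggle dataset. The sketch also splits before scaling so that the scaler is fitted on the training data only, which is a design choice of this example rather than something stated in the project.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the dataset; column names are assumptions.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "Age": rng.integers(30, 80, n),
    "Sex": rng.choice(["M", "F"], n),
    "ChestPainType": rng.choice(["ATA", "NAP", "ASY", "TA"], n),
    "MaxHR": rng.integers(70, 200, n),
    "Oldpeak": rng.normal(1.0, 1.0, n).round(1),
    "HeartDisease": rng.integers(0, 2, n),
})

target = "HeartDisease"
categorical = ["Sex", "ChestPainType"]
continuous = ["Age", "MaxHR", "Oldpeak"]

# Check the distribution of the target variable (share of each class).
target_share = df[target].value_counts(normalize=True)
print(target_share)

# Turn categorical variables into dummy columns.
X = pd.get_dummies(df.drop(columns=[target]), columns=categorical, drop_first=True)
X = X.astype({col: "float64" for col in continuous})
y = df[target]

# Split into train and test sets, keeping the class balance in both.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)
X_train, X_test = X_train.copy(), X_test.copy()

# Scale continuous variables; the scaler only sees the training data.
scaler = StandardScaler()
X_train[continuous] = scaler.fit_transform(X_train[continuous])
X_test[continuous] = scaler.transform(X_test[continuous])
```

Fitting the scaler on the training split alone keeps test-set statistics out of the model inputs; scaling the full dataset first is simpler but leaks a little information into the evaluation.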

3 - Configuration of the Dataiku Flow

4 - Machine learning experimentation: testing different scikit-learn models to predict heart failure

Scikit-learn models tested:

  • Logistic regression
  • SVM
  • Decision Tree
  • Random Forest
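
As an illustration, the four model families listed above could be declared together with a search grid for each. The grids and the `candidates` name below are hypothetical; the project does not state which hyperparameter values were explored.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Each entry maps a label to (estimator, hyperparameter grid).
# The grid values are illustrative assumptions, not the project's settings.
candidates = {
    "logistic_regression": (
        LogisticRegression(max_iter=1000),
        {"C": [0.01, 0.1, 1, 10]},
    ),
    "svm": (
        SVC(),
        {"C": [0.1, 1, 10], "kernel": ["linear", "rbf"]},
    ),
    "decision_tree": (
        DecisionTreeClassifier(random_state=42),
        {"max_depth": [3, 5, None], "min_samples_leaf": [1, 5, 10]},
    ),
    "random_forest": (
        RandomForestClassifier(random_state=42),
        {"n_estimators": [100, 300], "max_depth": [5, None]},
    ),
}

for name, (estimator, grid) in candidates.items():
    print(name, type(estimator).__name__, grid)
```

Keeping estimators and grids in one mapping makes it easy to loop over all candidates with the same search and logging code.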

a) For each model, a grid search was performed to find the best hyperparameters

b) The model was then trained on the training set using these best parameters, with cross-validation

c) Everything (parameters, performance metrics, and models) was logged in Dataiku Experiment Tracking (built on the MLflow framework) to keep track of the different experiments and compare them afterward.
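
Steps a) to c) can be sketched for a single model as shown below. This is a hedged example under several assumptions: the data is generated with `make_classification`, the grid and the ROC AUC scoring are illustrative, and `log_experiment` is a hypothetical helper. The logging uses plain MLflow calls; the Dataiku-specific setup that connects MLflow to a DSS project is not shown here.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, cross_val_score, train_test_split

# Synthetic binary classification data standing in for the prepared dataset.
X, y = make_classification(n_samples=300, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# a) Grid search over an illustrative hyperparameter grid.
param_grid = {"n_estimators": [50, 100], "max_depth": [3, None]}
search = GridSearchCV(
    RandomForestClassifier(random_state=42), param_grid, cv=5, scoring="roc_auc"
)
search.fit(X_train, y_train)
best_params = search.best_params_

# b) Cross-validate and train a model configured with the best parameters.
model = RandomForestClassifier(random_state=42, **best_params)
cv_scores = cross_val_score(model, X_train, y_train, cv=5, scoring="roc_auc")
model.fit(X_train, y_train)
metrics = {
    "cv_roc_auc_mean": float(cv_scores.mean()),
    "test_accuracy": float(model.score(X_test, y_test)),
}
print(best_params, metrics)


# c) Hypothetical helper that records one run with MLflow.
def log_experiment(run_name, params, metrics, model):
    # Imported lazily so the rest of the sketch runs without MLflow installed.
    import mlflow
    import mlflow.sklearn

    with mlflow.start_run(run_name=run_name):
        mlflow.log_params(params)
        mlflow.log_metrics(metrics)
        mlflow.sklearn.log_model(model, "model")
```

Calling `log_experiment("random_forest", best_params, metrics, model)` would then record the parameters, metrics, and fitted model as one run, so that runs for the different models can be compared side by side.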

Model evaluation


Dataset

EDA

Dataset source: Heart Failure Prediction Dataset, retrieved from Kaggle.
