
Suppose I have a dataset and build an ML model on it. The dataset is updated weekly, and whenever it is updated I want my model to predict the class for the new rows and append those rows to the original dataset. How can I do this?

This is what I tried:

    import pandas as pd
    import numpy as np
    import sklearn
    from sklearn.model_selection import train_test_split
    from sklearn.model_selection import cross_val_score
    from sklearn.svm import SVC

    url = "https://raw.githubusercontent.com/jbrownlee/Datasets/master/iris.csv"
    names = ['sepal-length', 'sepal-width', 'petal-length', 'petal-width', 'class']
    df = pd.read_csv(url, names=names)
    df

    array = df.values
    X = array[:,0:4]
    y = array[:,4]
    X_train, X_validation, Y_train, Y_validation = train_test_split(X, y, test_size=0.20, random_state=1)

I skipped some steps where I compared the scores of different models.

    model = SVC(gamma='auto')
    model.fit(X_train, Y_train)
    predictions = model.predict(X_validation)

Here I add some new data to test it:

    new_data = [[5.9, 3.0, 5.7, 1.5], [4.8, 2.9, 3.0, 1.2]]
    df2 = pd.DataFrame(new_data, columns = ["sepal-length", "sepal-width", "petal-length", "petal-width"])
    # DataFrame.append was removed in pandas 2.0; pd.concat does the same job
    df3 = pd.concat([df, df2], ignore_index=True)
    df3

    array2 = df3.values
    X2 = array2[:,0:4]
    predict = model.predict(X2)
    predict

    df3['pred'] = predict

    def final_class(row):
        # Keep the known label; fall back to the prediction for new rows
        if pd.isnull(row['class']):
            return row['pred']
        else:
            return row['class']

    df3['final_class'] = df3.apply(lambda x: final_class(x), axis=1)
    df3

It works, but I don't think it is the best way to do it. Can someone help me?


1 Answer


That is a reasonable way to do it.

Alternatively, you can predict on the new rows only and append the predicted results to the existing dataset, instead of running the model over every row again.
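A minimal sketch of that second approach follows. To keep it runnable offline it makes some assumptions that are not in the question: the data comes from scikit-learn's bundled iris copy rather than the CSV URL (so labels read `setosa` instead of `Iris-setosa`), the model is fit on all labelled rows, and the helper `append_predictions` and the `is_predicted` column are hypothetical names introduced here for illustration.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.svm import SVC

FEATURES = ["sepal-length", "sepal-width", "petal-length", "petal-width"]


def append_predictions(df, new_rows, model, feature_cols=FEATURES, label_col="class"):
    """Predict labels for new_rows only and return df with those rows appended."""
    new_rows = new_rows.copy()  # leave the caller's frame untouched
    new_rows[label_col] = model.predict(new_rows[feature_cols].to_numpy())
    # Flag predicted labels so they are not mistaken for ground truth later.
    new_rows["is_predicted"] = True
    labelled = df.assign(is_predicted=False)
    return pd.concat([labelled, new_rows], ignore_index=True)


# Stand-in for the original CSV (assumption): scikit-learn's bundled iris data.
iris = load_iris()
df = pd.DataFrame(iris.data, columns=FEATURES)
df["class"] = iris.target_names[iris.target]

model = SVC(gamma="auto").fit(df[FEATURES].to_numpy(), df["class"])

# The two new rows from the question, without a class column.
new_data = pd.DataFrame([[5.9, 3.0, 5.7, 1.5], [4.8, 2.9, 3.0, 1.2]], columns=FEATURES)
updated = append_predictions(df, new_data, model)
print(updated.tail(3))
```

Compared with re-predicting the whole frame, this touches only the rows that lack a label, and the flag column makes it possible to exclude model-generated labels if you later retrain on the accumulated data.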
