1,313 questions
0 votes
0 answers
50 views
Create a new line for comma separated values in pandas column - I don't want to add new rows, I want to have same rows in output [duplicate]
I have a dataframe like this: df col1 col2 1 'abc,pqr' 2 'ghv' 3 'mrr, jig' Now I want to create a new line for each comma-separated value in col2, so the output would look ...
0 votes
1 answer
118 views
Timestamp issue while creating the model using pipeline in Vertex AI
I am currently utilizing the XGBoost classifier within a pipeline that includes normalization and the XGBoost model itself. The model has been successfully developed in the Notebook environment. The ...
0 votes
1 answer
49 views
Cross-Validation Function returns "Unknown label type: (array([0.0, 1.0], dtype=object),)"
Here is the full error: `--------------------------------------------------------------------------- ValueError Traceback (most recent call last) Cell In[33], line 2 ...
11 votes
2 answers
122k views
How to use DataFrameMapper to delete rows with a null value in a specific column?
I am using sklearn-pandas.DataFrameMapper to preprocess my data. I don't want to impute for a specific column. I just want to drop the row if this column is Null. Is there a way to do that?
1 vote
2 answers
90 views
ElasticNetCV in Python: Get full grid of hyperparameters with corresponding MSE?
I have fitted an ElasticNetCV in Python with three splits: import numpy as np from sklearn.linear_model import LinearRegression #Sample data: num_samples = 100 # Number of samples num_features = 1000 ...
2 votes
3 answers
111 views
Pandas takes all columns of a dataframe even when some columns are specified
I am trying to train a KMeans model using scikit-learn. I have been stuck on this issue for 2 days. Pandas is selecting all columns of a dataframe even though I specified 2 columns. Here is the dataframe in ...
0 votes
0 answers
27 views
_fit_method for KNN gives KD-tree even though I'm working in a high dimensional space
Since the KNeighborsClassifier class in sklearn picks the best algorithm depending on the values passed to the fit method when using auto (which is the default), when accessing the algorithm using ._fit_method I ...
1 vote
2 answers
68 views
Using SKLearn KMeans With Externally Generated Correlation Matrix
I receive a correlation file from an external source. It is a fairly straightforward file and looks like the following. A sample csv can be found here https://www.dropbox.com/scl/fi/...
0 votes
2 answers
106 views
Using a Mask to Insert Values from sklearn Iterative Imputer
I created a set of random missing values to practice with a tree imputer. However, I'm stuck on how to overwrite the missing values in my dataframe. My missing values look like this: from ...
0 votes
1 answer
236 views
model.fit() class weights do not work when training the model
When calculating classes_weight with from sklearn.utils import class_weight class_weights = class_weight.compute_class_weight(class_weight="balanced", classes=np.unique(...
0 votes
1 answer
45 views
Data cardinality is ambiguous sklearn.train
model.fit(x_train, y_train, epochs=1000) I'm trying to make an AI, but my code gives an error and I don't know how to fix it. This is the error: ValueError: Data cardinality is ambiguous: x sizes: 455 y ...
0 votes
1 answer
232 views
Mlflow log_figure deletes artifact
I am running mlflow with autologging to track an xgboost model. By default, under artifacts it saves the model, requirements, and feature importances. Cool stuff I want to keep. But, if I try to add ...
1 vote
1 answer
76 views
multiple linear regression house price r2 score problem
I have sample house price data and simple code: import pandas as pd from sklearn.preprocessing import LabelEncoder, StandardScaler from sklearn.model_selection import train_test_split from sklearn....
0 votes
1 answer
118 views
How to transform Dataframe Mapper to PMML?
I want to use multiple PMMLs to keep the transformation of the data and the application of the model separate. Here is the code I am using. I am doing this because I want to include some kind of ...
1 vote
1 answer
222 views
How to get immediate neighbors using a kd-tree irrespective of the spacing?
I want to find the immediate neighbours around a given point in a multidimensional space (up to 7 dimensions). Important facts about the space: non-linear spacing among points within a single ...