2 votes
1 answer
55 views

I have a data set like the following and want to scale the data using any of the scalers in sklearn.preprocessing. Is there an easy way to fit this scaler not over the whole data set, but per group? ...
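The excerpt above is truncated, so the asker's actual data layout is unknown. As a minimal sketch only, assuming a pandas DataFrame with a hypothetical `group` column and a single numeric `value` column (both names invented for illustration), one way to fit a scaler per group is to create a fresh scaler inside a `groupby(...).transform` call:

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data: the column names "group" and "value" are assumptions.
df = pd.DataFrame(
    {
        "group": ["a", "a", "a", "b", "b", "b"],
        "value": [1.0, 2.0, 3.0, 10.0, 20.0, 30.0],
    }
)


def scale_group(values: pd.Series) -> pd.Series:
    """Fit a new StandardScaler on one group's values and return the scaled values."""
    scaler = StandardScaler()
    # Scalers expect a 2-D array, so reshape the 1-D group into a single column.
    scaled = scaler.fit_transform(values.to_numpy().reshape(-1, 1))
    return pd.Series(scaled.ravel(), index=values.index)


# transform() calls scale_group once per group and realigns the results on the original index.
df["value_scaled"] = df.groupby("group")["value"].transform(scale_group)
print(df)
```

Any other scaler from `sklearn.preprocessing` could be swapped in for `StandardScaler` here. Note that this sketch discards the fitted scalers, so it would need to store one scaler per group before it could be applied to new data.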
ascripter
  • 6,315
1 vote
1 answer
624 views

I am working on a fairly simple machine learning problem in the form of a practicum. I am using the following code to preprocess the data: from preprocess.date_converter import DateConverter from ...
Santiago
1 vote
1 answer
199 views

I am trying to pass a parameter DummyTransformer__feature_index_sec to my sklearn custom transformer via a pipeline. It seems like I need to implement metadata routing in order to do this. However, I ...
Jake Drew
  • 2,350
-2 votes
1 answer
64 views

In the pipeline code below, even though I have encoded the sex column, I am getting a string-to-float error. from sklearn.compose import ColumnTransformer from sklearn.pipeline import Pipeline from ...
Abubakker Hashmi
0 votes
1 answer
53 views

I’m using a Pipeline in scikit-learn to combine feature scaling with a classifier. This works well for logistic regression, but I’m curious if this approach would generalize effectively to more ...
Celine Habashy
1 vote
1 answer
115 views

If I run the following Python code it works well: target = 'churn' tranOH = ColumnTransformer([ ('one', OneHotEncoder(drop='first', dtype='int'), make_column_selector(dtype_include='category', ...
skan
  • 7,790
0 votes
0 answers
82 views

I am trying to deploy the model as a .pkl file. When building the pipeline, I am running into some problems. Here is the code that causes no trouble: from sklearn.pipeline import FunctionTransformer, ...
snoisia
0 votes
0 answers
42 views

I would like to apply one-hot encoding and label encoding to my dataset using a Pipeline that feeds into my random forest model. I have created a function that utilizes Pipeline from scikit-learn together with ...
Stackie
  • 13
0 votes
0 answers
68 views

I am building a scikit-learn pipeline. I downloaded a dataset from an online ML repository and generated descriptive stats for it. I am using the processed.cleveland.data dataset found here: https://...
EngineerP
0 votes
0 answers
207 views

I am trying to implement a pipeline with sklearn combining a column transformer for numeric and categorical data and sequential feature selection. The issue is when doing the complete pipeline it gets ...
Tlaltecutli
0 votes
1 answer
384 views

I have example data where one column contains string values (e.g. "34 12"). I created two new columns during the preprocessing step, storing the right and left integers of the string ...
1 vote
1 answer
122 views

I'm trying to understand scikit-learn Pipelines. According to a Note in the scikit user guide a Pipeline "has all the methods that the last estimator in the pipeline has". So I wrote my own ...
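The excerpt is cut off before the asker's own estimator appears, so the following is only a sketch of the quoted note using standard scikit-learn estimators, not the asker's code. It checks which of the usual estimator methods a `Pipeline` exposes depending on its last step; the variable names are illustrative.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# A pipeline that ends in a classifier.
clf_pipe = Pipeline([("scale", StandardScaler()), ("clf", LogisticRegression())])

# A pipeline that ends in a transformer.
scale_only_pipe = Pipeline([("scale", StandardScaler())])

# hasattr() reflects whether the final step provides the method in question.
clf_has_predict_proba = hasattr(clf_pipe, "predict_proba")
clf_has_transform = hasattr(clf_pipe, "transform")
scaler_has_predict = hasattr(scale_only_pipe, "predict")
scaler_has_transform = hasattr(scale_only_pipe, "transform")

print("classifier pipeline: predict_proba =", clf_has_predict_proba)
print("classifier pipeline: transform     =", clf_has_transform)
print("scaler-only pipeline: predict      =", scaler_has_predict)
print("scaler-only pipeline: transform    =", scaler_has_transform)
```

This only covers the standard methods that `Pipeline` itself defines and delegates (such as `predict`, `predict_proba`, and `transform`). Whether a custom method on a hand-written final estimator is exposed is a separate question that this sketch does not answer.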
Evan Aad
  • 6,063
-1 votes
1 answer
222 views

I'm trying to build a pipeline that contains a pre-processing transformer (it simply removes columns from the data) and an LDA classifier. I wanted to tweak hyperparameters for each, and from looking ...
Ted
  • 1
2 votes
3 answers
806 views

I want to select columns based on their datetime data types. My DataFrame has for example columns with types np.dtype('datetime64[ns]'), np.datetime64 and 'datetime64[ns, UTC]'. Is there a generic way ...
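The rest of the question is truncated, so this is a hedged sketch rather than an answer to the full post. Assuming the data lives in a pandas DataFrame, `pandas.api.types.is_datetime64_any_dtype` returns true for both timezone-naive and timezone-aware datetime columns, which makes it one possible generic check. The column names below are invented for illustration.

```python
import pandas as pd
from pandas.api.types import is_datetime64_any_dtype

# Hypothetical frame with one tz-naive and one tz-aware datetime column.
df = pd.DataFrame(
    {
        "naive": pd.to_datetime(["2024-01-01", "2024-01-02"]),
        "aware": pd.to_datetime(["2024-01-01", "2024-01-02"], utc=True),
        "number": [1, 2],
        "text": ["a", "b"],
    }
)

# Keep every column whose dtype is any flavor of datetime64, with or without a timezone.
datetime_columns = [col for col in df.columns if is_datetime64_any_dtype(df[col])]
print(datetime_columns)
```

The resulting list of names can then be passed wherever a column list is accepted, for example as the column specification of a `ColumnTransformer` step.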
JAdel
  • 1,636
0 votes
1 answer
347 views

I am having trouble with a piece of code I am writing, specifically a pipeline. The data is a simple numerical dataframe (firewall logs), which is split into X_train and X_test in the usual way. After ...
GEBRU
  • 529
