Questions tagged [preprocessing]

Data preprocessing is a data mining technique that transforms raw data into a more understandable or more useful format.

0 votes
1 answer
44 views

I'm working on a binary classification problem to identify struggling students at a university. I have some correlated features, such as high_school_grade_1, which represents 75% of ...
Youness Belhaj
4 votes
0 answers
29 views

I have a large dataset (~10M points) in Python and I want to filter it using a large number of different custom masks, as part of calculations to create a new but related dataset. Because the dataset ...
quail
  • 41
7 votes
1 answer
97 views

I'm trying to train a CNN model to identify phytoplankton species from a training set. During preprocessing, the images are resized to 224x224, which seems to be stretching or compressing the object ...
Charlottefaf
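
One common way to avoid the stretching described in this excerpt is to scale the image so its longer side fits the target and pad the rest ("letterboxing"). The sketch below only computes the geometry; `letterbox_geometry` is a hypothetical helper name, the 224 target comes from the excerpt, and the actual resize and padding calls depend on whichever image library is in use.

```python
def letterbox_geometry(width, height, target=224):
    """Compute an aspect-preserving resize plus the padding needed to reach target x target.

    Returns (new_w, new_h, pad_left, pad_top, pad_right, pad_bottom).
    """
    # Scale so the longer side exactly matches the target size.
    scale = target / max(width, height)
    new_w = max(1, round(width * scale))
    new_h = max(1, round(height * scale))

    # Split the leftover space as evenly as possible on both sides.
    pad_left = (target - new_w) // 2
    pad_top = (target - new_h) // 2
    pad_right = target - new_w - pad_left
    pad_bottom = target - new_h - pad_top
    return new_w, new_h, pad_left, pad_top, pad_right, pad_bottom


# A 448x224 image is halved to 224x112 and padded above and below.
geometry = letterbox_geometry(448, 224)
```

Whether padding, cropping, or plain stretching works best for a given classifier is an empirical question; this only shows how the undistorted variant can be set up.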
1 vote
1 answer
67 views

I'm working with high-dimensional biological data (∼41,000 features × 3,979 samples from RNA-seq for 2 conditions). Here’s a simplified version of my preprocessing and filtering pipeline before ...
Adi Gershon
7 votes
1 answer
96 views

I am currently working on a dataset that has two columns: customerID and date. I want to find the minimum date for each customerID. Initially, I used the following code: ...
Guna
  • 897
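
The code from this question is not shown in the excerpt, so the following is only a minimal sketch of the stated goal: the earliest date for each customerID. It uses plain Python with made-up sample rows; `earliest_date_per_customer` is a hypothetical name. With a pandas DataFrame, the usual equivalent would be `df.groupby("customerID")["date"].min()`.

```python
from datetime import date


def earliest_date_per_customer(rows):
    """Return {customerID: earliest date} from an iterable of (customerID, date) pairs."""
    earliest = {}
    for customer_id, d in rows:
        # Keep the date if this customer is new or the date is earlier than the stored one.
        if customer_id not in earliest or d < earliest[customer_id]:
            earliest[customer_id] = d
    return earliest


# Illustrative data only; not taken from the question.
rows = [
    ("c1", date(2023, 5, 1)),
    ("c2", date(2023, 1, 15)),
    ("c1", date(2022, 12, 31)),
]
result = earliest_date_per_customer(rows)
```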
1 vote
0 answers
29 views

I am training a model on multiple cache miss examples from various trace simulations. For every trace I have thousands of miss examples stored and I have many traces. I'm storing the examples in ...
Saffy
  • 11
0 votes
0 answers
30 views

I am currently preprocessing a large dataset for ML purposes. I am struggling with encoding strings as numbers. I have a dataset of multiple blockchain transactions and I have addresses of ...
Asic
  • 21
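
For the string-to-number encoding mentioned in this excerpt, one simple option is to assign each distinct string the next unused integer. This is a sketch under the assumption that the set of distinct strings fits in memory; `build_vocabulary` and `encode` are hypothetical helpers, and the addresses below are placeholders rather than real data.

```python
def build_vocabulary(strings):
    """Map each distinct string to a stable integer ID in order of first appearance."""
    vocab = {}
    for s in strings:
        if s not in vocab:
            vocab[s] = len(vocab)
    return vocab


def encode(strings, vocab):
    """Replace every string with its integer ID from the vocabulary."""
    return [vocab[s] for s in strings]


# Placeholder addresses for illustration only.
addresses = ["0xaaa", "0xbbb", "0xaaa", "0xccc"]
vocab = build_vocabulary(addresses)
encoded = encode(addresses, vocab)
```

The resulting IDs are arbitrary labels, so a model should treat them as categories (for example through an embedding) rather than as ordered quantities.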
1 vote
1 answer
40 views

I want to feed the amplitude of a stationary time series into a transformer. I'm planning to tokenize/bin the amplitude into discrete values, so the transformer learns from unique integer tokens instead of ...
Muhammad Ikhwan Perwira
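
The binning idea in this excerpt can be sketched with equal-width bins. The range, the number of bins, and the helper names `make_bin_edges` and `tokenize` are assumptions for illustration; values outside the range fall into the first or last bin.

```python
from bisect import bisect_right


def make_bin_edges(low, high, n_bins):
    """Return the n_bins - 1 interior edges of equal-width bins over [low, high]."""
    step = (high - low) / n_bins
    return [low + i * step for i in range(1, n_bins)]


def tokenize(values, edges):
    """Map each amplitude to an integer token in 0 .. len(edges)."""
    # bisect_right counts how many edges are <= v, which is the bin index.
    return [bisect_right(edges, v) for v in values]


edges = make_bin_edges(-1.0, 1.0, 4)  # interior edges at -0.5, 0.0, 0.5
tokens = tokenize([-0.9, -0.2, 0.1, 0.7, 2.0], edges)  # → [0, 1, 2, 3, 3]
```

Equal-width bins are only one choice; quantile-based edges would give each token roughly equal frequency, which may matter if the amplitude distribution is far from uniform.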