Questions tagged [parallel]

The parallel tag on Data Science Stack Exchange encompasses questions related to parallel computing and processing within data science workflows. This includes discussions on distributing tasks across multiple processors or machines to enhance computational efficiency.

39 questions

2 votes

1 answer

125 views

XGBoost GPU version not outperforming CPU on small dataset despite parameter tuning – suggestions needed

I'm currently working on a Parallel and Distributed Computing project where I'm comparing the performance of both XGBoost and CatBoost when trained on CPU vs GPU. The goal is to demonstrate how GPU ...

Mxneeb

asked May 3 at 21:32

2 votes

1 answer

316 views

Parallel Data preprocessing

I am looking for a suggestion. Is it possible to implement the data preprocessing steps like missing value imputation, outlier detection, normalization, label encoding in parallel? Can I implement ...

Encipher

asked Sep 8, 2022 at 14:07

1 vote

2 answers

342 views

How to load and run feature selection on a dataset with 5,000 samples and 500,000 features?

I have a dataset with 5000 samples and 500,000 features (all categorical with a cardinality of 3). Two problems I'm trying to solve: Loading the dataset - I can't load it in memory despite using a ...

applebanana_456789

asked May 20, 2021 at 2:24

12 votes

3 answers

18k views

What needs to be done to make n_jobs work properly in sklearn, in particular on ElasticNetCV?

The constructor of sklearn.linear_model.ElasticNetCV takes n_jobs as an argument. Quoting the documentation here: n_jobs: int, ...

OldSchool

asked May 15, 2020 at 17:17

1 vote

0 answers

64 views

Parallelization of a MIMO linear filter

I would like to implement a Multi Input Multi Output filtering operation, acting as fast as possible on batches of data. Here is my current implementation: ...

marco

asked Apr 5, 2020 at 12:18

0 votes

3 answers

3k views

Specifying number of threads using XGBoost.train

When using the xgboost.train() function, all the threads are used. I would like to use a specific number. Unfortunately, this function does not accept the ...

LauritsT

asked Sep 4, 2019 at 13:46

0 votes

1 answer

1k views

Wikipedia says CUDA 8.0 is compatible with my GeForce GTX 670M, but TensorFlow raises an error: GTX 670M's Compute Capability is < 3.0

According to Wikipedia, the GeForce GTX 670M has a Compute Capability of 2.1 (and a Fermi micro-architecture), which is confirmed by TensorFlow (I can read "2.1" in the error it raises). ...

JarsOfJam-Scheduler

asked Aug 1, 2019 at 20:16

1 vote

0 answers

113 views

Updating Weight Using Updates on Related Data

Suppose $$ x=Ay $$ where $x$ is $M\times 1$, $y$ is $N \times 1$, and $A$ is $M\times N$. We have the data $x$ and would like to know what $y$ is. However, the matrix $A$ is too large for pseudo-...

Varun Chhangani

asked Jun 28, 2019 at 4:23
