2,858 questions
0 votes
1 answer
38 views
Upsert! Operation Throws "A table can't contain duplicate column names" Error
I have a base table A and a result table B in DolphinDB. Table B was initially empty and is used to store calculated results based on table A. When trying to insert the calculated results into table B,...
3 votes
1 answer
135 views
In Apache Ignite, the Replication mode and Partition mode do not work together
I’m working with Apache Ignite 2.17.0. I load database tables into Ignite caches and run SQL queries using the SqlFieldsQuery API. Recently, I modified the cache configuration for some tables to use ...
0 votes
0 answers
62 views
Get two different nodes to access and distribute the same SQL table in Apache Spark?
I have the following code to test. I created a table on worker 1. Then I tried to read the table on worker 2 and got a TABLE_OR_VIEW_NOT_FOUND error. Worker 2 is on the same computer as the master. I ran the ...
0 votes
0 answers
50 views
How to best partition my data on a 32-core EMR instance and make sure I max out parallelism?
I’m optimizing a PySpark pipeline that processes records with a heavily skewed categorical column (category). The data has: A few high-frequency categories (e.g., 90% of records fall into 2-3 ...
1 vote
1 answer
114 views
Distributed REST API Calls Using Spark While Maintaining Consistency
I have a Spark DataFrame created from a Delta table, with one column of type STRUCT(JSON). For each row in this DataFrame, I need to make a REST API call using the JSON payload in the column. ...
0 votes
0 answers
328 views
PyTorch DDP Multi-Node Training: ncclInternalError: Internal check failed. Bootstrap : no socket interface found
I am trying to run a multi-node training job using PyTorch's DistributedDataParallel (DDP) following this guide. However, when I launch the job with torchrun, I encounter the following NCCL error on ...
1 vote
0 answers
91 views
Segmentation Fault During Validation with MirroredStrategy on Multiple GPUs
I am training a model using TensorFlow 2.18.0 with the tf.distribute.MirroredStrategy across two GPUs. The training works fine on a single GPU, but when I try to run it on two GPUs, it ends with a ...
0 votes
0 answers
81 views
Vertex AI Reduction Server returning 500 Internal Error
I am looking to fine-tune a pre-trained DeBERTa model on Vertex AI with PyTorch. I'm attempting to run a distributed job, making use of the Vertex AI reduction server. I'm following this notebook: ...