1 vote
0 answers
20 views

buckets is a column of type array<string>. The logic is similar to array_intersect, except only the prefix of each string in buckets (before the first -) is compared. How can I optimize the ...
Dong Ye
  • 11
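
The prefix-only comparison described in the question above can be modelled in plain Python before worrying about how to express it efficiently in Spark. This is only a sketch of one possible reading of the truncated excerpt: the second input (`others`) and the helper names `prefix` and `prefix_intersect` are assumptions for illustration, not taken from the original post, and this is not Spark code.

```python
def prefix(s: str) -> str:
    """Return the part of s before the first '-', or s itself if there is no '-'."""
    return s.split("-", 1)[0]


def prefix_intersect(buckets, others):
    """Keep the elements of buckets whose prefix also occurs as a prefix in others.

    Assumption: `others` is a second list of strings in the same format.
    Duplicates are dropped, since array_intersect also returns distinct elements.
    """
    other_prefixes = {prefix(o) for o in others}  # set gives O(1) membership checks
    seen = set()
    result = []
    for b in buckets:
        if prefix(b) in other_prefixes and b not in seen:
            seen.add(b)
            result.append(b)
    return result


if __name__ == "__main__":
    print(prefix_intersect(["a-1", "b-2", "a-1", "c"], ["a-9", "c-3"]))
```

Building the set of prefixes once keeps the comparison linear in the combined size of the two lists, which is the same idea one would want to preserve in a Spark expression.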
Advice
0 votes
6 replies
155 views

I am doing some SQL aggregation transformations on a dataset, and there is a certain condition I would like to apply, but I am not sure how. Here is a basic code block: le_test = spark.sql(""&...
BeaverFever
3 votes
1 answer
107 views

I have a PySpark job that reads data from table a, performs some transformations and filters, and then writes the result to table b. Here’s a simplified version of the code: import pyspark.sql....
עומר אמזלג
0 votes
0 answers
45 views

We need to read a VSAM file directly, along with a copybook to determine the column lengths, and we were using the COBRIX library as part of the Spark read. However, we could the same is not properly ...
Rocky1989
  • 409
0 votes
0 answers
62 views

I have a Scala (v. 2.12.15) Spark (v. 3.5.1) job that works correctly and looks something like this: import org.apache.spark.sql.DataFrame ... val myDataFrame = myReadDataFunction(...) ....
jd_sa
  • 1
1 vote
3 answers
98 views

Does anybody know what I am doing wrong? The following is a reduced code snippet that works in spark-3.x but doesn't work in spark-4.x. In my use case I need to pass a complex data structure to a udf (let's say ...
Jiri Humpolicek
1 vote
0 answers
135 views

When I try to convert a pyspark DataFrame with a VariantType column to a pandas DataFrame, the conversion fails with an error 'NoneType' object is not iterable. Am I doing it incorrectly? Sample code: ...
Ghislain Fourny
0 votes
0 answers
69 views

I am new to AWS Glue and I am facing performance issues with the following code: spark.conf.set("spark.sql.mapKeyDedupPolicy", "LAST_WIN") # Define S3 path with wildcard to match ...
Alberto
  • 15
