
I have 4 DataFrames which each have only one row and one column, and I would like to combine them into one DataFrame. In Python I would do this using the zip function, but I need a way to do it in PySpark. Any suggestions?

The DataFrames look like this:

+--------------------------+
|sum(sum(parcelUBLD_SQ_FT))|
+--------------------------+
|              1.13014806E8|
+--------------------------+

+---------------------+
|sum(parcelUBLD_SQ_FT)|
+---------------------+
|         1.13014806E8|
+---------------------+

+---------------+
|count(parcelID)|
+---------------+
|          45932|
+---------------+

+----------------+
|sum(parcelCount)|
+----------------+
|           45932|
+----------------+

and I would like it to look like this:

+--------------------------+---------------------+---------------+----------------+
|sum(sum(parcelUBLD_SQ_FT))|sum(parcelUBLD_SQ_FT)|count(parcelID)|sum(parcelCount)|
+--------------------------+---------------------+---------------+----------------+
|              1.13014806E8|         1.13014806E8|          45932|           45932|
+--------------------------+---------------------+---------------+----------------+

1 Answer


Since you clearly specified that all the DataFrames have exactly one row, you can use a cross join to get the desired output:

df1.crossJoin(df2).crossJoin(df3).crossJoin(df4) 
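As a minimal, self-contained sketch of why this works: a cross join produces the Cartesian product of its inputs, and the product of four one-row DataFrames is a single row carrying all four columns. The DataFrame names and values below are hypothetical stand-ins mirroring the question's output:

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Four single-row, single-column DataFrames (values taken from the question)
df1 = spark.createDataFrame([(1.13014806e8,)], ["sum(sum(parcelUBLD_SQ_FT))"])
df2 = spark.createDataFrame([(1.13014806e8,)], ["sum(parcelUBLD_SQ_FT)"])
df3 = spark.createDataFrame([(45932,)], ["count(parcelID)"])
df4 = spark.createDataFrame([(45932,)], ["sum(parcelCount)"])

# Each input has exactly one row, so the Cartesian product
# is a single row with all four columns side by side.
combined = df1.crossJoin(df2).crossJoin(df3).crossJoin(df4)
combined.show()

Note that crossJoin is the explicit API for this in PySpark 2.1+; because each DataFrame has only one row, there is no row-explosion risk that cross joins are usually warned about.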