
I have 4 DataFrames which each have only one row and one column, and I would like to combine them into one DataFrame. In Python I would do this using the zip function, but I need a way to do it in PySpark. Any suggestions?

The DataFrames look like this:

+--------------------------+
|sum(sum(parcelUBLD_SQ_FT))|
+--------------------------+
|              1.13014806E8|
+--------------------------+

+---------------------+
|sum(parcelUBLD_SQ_FT)|
+---------------------+
|         1.13014806E8|
+---------------------+

+---------------+
|count(parcelID)|
+---------------+
|          45932|
+---------------+

+----------------+
|sum(parcelCount)|
+----------------+
|           45932|
+----------------+

and I would like it to look like this:

+--------------------------+---------------------+---------------+----------------+
|sum(sum(parcelUBLD_SQ_FT))|sum(parcelUBLD_SQ_FT)|count(parcelID)|sum(parcelCount)|
+--------------------------+---------------------+---------------+----------------+
|              1.13014806E8|         1.13014806E8|          45932|           45932|
+--------------------------+---------------------+---------------+----------------+

1 Answer


Since you clearly specified that all the DataFrames have exactly one row, you can use a cross join to get the desired output:

df1.crossJoin(df2).crossJoin(df3).crossJoin(df4) 
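As a minimal, self-contained sketch of why this works: a cross join produces the Cartesian product of its inputs, and the product of four one-row DataFrames is a single row carrying all four columns. The DataFrame names and values below are hypothetical stand-ins mirroring the question's output:

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Four single-row, single-column DataFrames (values taken from the question)
df1 = spark.createDataFrame([(1.13014806e8,)], ["sum(sum(parcelUBLD_SQ_FT))"])
df2 = spark.createDataFrame([(1.13014806e8,)], ["sum(parcelUBLD_SQ_FT)"])
df3 = spark.createDataFrame([(45932,)], ["count(parcelID)"])
df4 = spark.createDataFrame([(45932,)], ["sum(parcelCount)"])

# Each input has exactly one row, so the Cartesian product
# is a single row with all four columns side by side.
combined = df1.crossJoin(df2).crossJoin(df3).crossJoin(df4)
combined.show()

Note that crossJoin is the explicit API for this in PySpark 2.1+; because each DataFrame has only one row, there is no row-explosion risk that cross joins are usually warned about.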