You can flatten the monthly/total columns via explode as shown below:
val df = Seq( ("Micheal", "Scott", "[email protected]", 4000, 5000, 3400, 50660), ("Dwight", "Schrute", "[email protected]", 1200, 6900, 1000, 35000), ("Kevin", "Malone", "[email protected]", 9000, 6000, 18000, 32000) ).toDF("FName","SName", "Email", "Jan 2021", "Feb 2021", "Mar 2021", "Total 2021") val moYrCols = Array("Jan 2021", "Feb 2021", "Mar 2021", "Total 2021") // (**) val otherCols = df.columns diff moYrCols val structCols = moYrCols.map{ c => val moYr = split(lit(c), "\\s+") struct(moYr(1).as("Year"), moYr(0).as("Month"), col(c).as("Value")) } df. withColumn("flattened", explode(array(structCols: _*))). select(otherCols.map(col) :+ $"flattened.*": _*). show /* +-------+-------+------------------+----+-----+-----+ | FName| SName| Email|Year|Month|Value| +-------+-------+------------------+----+-----+-----+ |Micheal| Scott| [email protected]|2021| Jan| 4000| |Micheal| Scott| [email protected]|2021| Feb| 5000| |Micheal| Scott| [email protected]|2021| Mar| 3400| |Micheal| Scott| [email protected]|2021|Total|50660| | Dwight|Schrute|[email protected]|2021| Jan| 1200| | Dwight|Schrute|[email protected]|2021| Feb| 6900| | Dwight|Schrute|[email protected]|2021| Mar| 1000| | Dwight|Schrute|[email protected]|2021|Total|35000| | Kevin| Malone| [email protected]|2021| Jan| 9000| | Kevin| Malone| [email protected]|2021| Feb| 6000| | Kevin| Malone| [email protected]|2021| Mar|18000| | Kevin| Malone| [email protected]|2021|Total|32000| +-------+-------+------------------+----+-----+-----+ */
(**) Use pattern matching in case there are many columns; for example:
val moYrCols = df.columns.filter(_.matches("[A-Za-z]+\\s+\\d{4}"))