I've got a DataFrame like this:

    from pyspark.sql import SparkSession
    from pyspark import Row

    spark = SparkSession.builder \
        .appName('DataFrame') \
        .master('local[*]') \
        .getOrCreate()

    df = spark.createDataFrame([Row(a=1, b='', c=['0', '1'], d='foo'),
                                Row(a=2, b='', c=['0', '1'], d='bar'),
                                Row(a=3, b='', c=['0', '1'], d='foo')])

    +---+---+------+---+
    |  a|  b|     c|  d|
    +---+---+------+---+
    |  1|   |[0, 1]|foo|
    |  2|   |[0, 1]|bar|
    |  3|   |[0, 1]|foo|
    +---+---+------+---+

I would like to create a column "e" holding the first element of the "c" column and a column "f" holding the second element of the "c" column, so that the result looks like this:

    +---+---+------+---+---+---+
    |a  |b  |c     |d  |e  |f  |
    +---+---+------+---+---+---+
    |1  |   |[0, 1]|foo|0  |1  |
    |2  |   |[0, 1]|bar|0  |1  |
    |3  |   |[0, 1]|foo|0  |1  |
    +---+---+------+---+---+---+