My Spark DataFrame, df, has the following column names:
scala> df.columns
res6: Array[String] = Array(Age, Job, Marital, Education, Default, Balance, Housing, Loan, Contact, Day, Month, Duration, Campaign, pdays, previous, poutcome, Approved)

And an SQL query on df by column name works fine:
scala> spark.sql(""" select Age from df limit 2 """).show() +---+ |Age| +---+ | 30| | 33| +---+ But when I try to use withColumn on df I run into problems:
scala> val dfTemp = df.withColumn("temp", df.Age.cast(DoubleType))
                      .drop("Age").withColumnRenamed("temp", "Age")
<console>:38: error: value Age is not a member of org.apache.spark.sql.DataFrame

The code above is taken from here.
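My guess is that `df.Age` is pandas/PySpark attribute syntax, which the Scala API does not support; in Scala a column is selected with `df("Age")`, `col("Age")`, or `$"Age"` (after `import spark.implicits._`). A minimal sketch of what I would expect to work, using a small hypothetical DataFrame in place of my real df:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types.DoubleType

val spark = SparkSession.builder().master("local[*]").appName("cast-demo").getOrCreate()
import spark.implicits._

// Hypothetical sample data standing in for the real df.
val df = Seq((30, "admin."), (33, "services")).toDF("Age", "Job")

// df("Age") replaces the invalid df.Age; the rest of the chain is unchanged.
val dfTemp = df
  .withColumn("temp", df("Age").cast(DoubleType))
  .drop("Age")
  .withColumnRenamed("temp", "Age")

dfTemp.printSchema() // Age should now be DoubleType
```

If that is right, the temp/drop/rename dance also seems unnecessary, since `withColumn` with an existing column name replaces that column in place: `df.withColumn("Age", df("Age").cast(DoubleType))`.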
Thanks