I am running into a problem trying to convert one of the columns of a Spark DataFrame from a hexadecimal string to a double. I have the following code:
    import java.math.BigInteger
    import spark.implicits._

    case class MsgRow(block_number: Long, to: String, from: String, value: Double)

    def hex2int(hex: String): Double =
      new BigInteger(hex.substring(2), 16).doubleValue

    txs = txs.map(row =>
      MsgRow(row.getLong(0), row.getString(1), row.getString(2), hex2int(row.getString(3)))
    )

I can't share the content of my txs DataFrame, but here is its metadata:
    > txs
    org.apache.spark.sql.DataFrame = [blockNumber: bigint, to: string ... 4 more fields]

But when I run this I get the error:
    error: type mismatch;
     found   : MsgRow
     required: org.apache.spark.sql.Row
           MsgRow(row.getLong(0), row.getString(1), row.getString(2), hex2int(row.getString(3)))
           ^
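The hex conversion itself seems fine when I test it outside of Spark; here is a quick standalone sanity check (the 0x-prefixed values are made up, not my real data):

    import java.math.BigInteger

    def hex2int(hex: String): Double =
      new BigInteger(hex.substring(2), 16).doubleValue

    // made-up 0x-prefixed strings, just to confirm the conversion works
    println(hex2int("0xff"))  // 255.0
    println(hex2int("0x1a"))  // 26.0

So the problem does not appear to be in the conversion function itself.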
I don't understand -- why is Spark/Scala expecting a Row object? None of the examples I have seen involve an explicit conversion to Row, and in fact most of them use an anonymous function returning a case class object, as I have above. For some reason, googling "required: org.apache.spark.sql.Row" returns only five results, none of which pertains to my situation, which is why I made the title so non-specific -- there is little chance of a false positive. Thanks in advance!