I have a PySpark Dataframe with a column of strings. How can I check which rows in it are Numeric. I could not find any function in PySpark's official documentation.
values = [('25q36',),('75647',),('13864',),('8758K',),('07645',)] df = sqlContext.createDataFrame(values,['ID',]) df.show() +-----+ | ID| +-----+ |25q36| |75647| |13864| |8758K| |07645| +-----+ In Python, there is a function .isDigit() which returns True or False if the string contains just numbers or not.
Expected DataFrame:
+-----+-------+ | ID| Value | +-----+-------+ |25q36| False | |75647| True | |13864| True | |8758K| False | |07645| True | +-----+-------+ I would like to avoid creating a UDF.