I need to delete certain entries from nested Json files. As far as I know, I cant just delete them from the json file directly, so my next choice would be to load them into a pyspark dataframe, delete the entries there, create a new json with the same schema (& preferably the same name) and replace the old json file. I have extracted the schema into a json file, is there a way to write the dataframe back into a json file, somehow parsing the extracted schema?
Thanks!