I am playing around with Apache Spark with the Azure CosmosDB connectors in Scala and was wondering if anyone had examples or insight on how I would write my DataFrame back to a collection in my CosmosDB. Currently I am able to connect to my one collection and return the data and manipulate it but I want to write the results back to a different collection inside the same database.

I created a writeConfig that contains my EndPoint, MasterKey, Database, and the Collection that I want to write to.

I then tried writing it to the collection using the following line.

manipulatedData.toJSON.write.mode(SaveMode.Overwrite).cosmosDB(writeConfig) 

This runs fine and does not display any errors but nothing is showing up in my collection.

I went through the documentation I could find at https://github.com/Azure/azure-cosmosdb-spark but did not have much luck with finding any examples of writing data back to the database.

Is there an easier way to write to DocumentDB/Cosmos DB than what I am doing? I am open to any options.

Thanks for any help.

1 Answer

You can save to Cosmos DB directly from a Spark DataFrame, as you noted. You may not need to call toJSON first, for example:

// Import SaveMode so you can use Overwrite, Append, ErrorIfExists, or Ignore
import org.apache.spark.sql.{Row, SaveMode, SparkSession}

// Create a new DataFrame `df` with slightly different flights information,
// i.e. change the delay value to -999
val df = spark.sql("select -999 as delay, distance, origin, date, destination from c limit 5")

// Save to Cosmos DB (using Append in this case).
// Ensure that baseConfig contains a read-write key;
// the key provided in our examples is a read-only key.
df.write.mode(SaveMode.Append).cosmosDB(baseConfig)

As for the documentation, you are correct that the save function should have been better called out. I've created the issue "Include in User Guide / sample scripts how to save to Cosmos DB" (#91) to address this.

As for the save completing without an error while nothing appears in the collection: is your config, by any chance, using the read-only key instead of the read-write key? I just created the issue "Saving to CosmosDB using read-only key has no error" (#92) calling out the same problem.
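Because a write with the wrong key reports no error, it can help to read the target collection back right after saving. The following is a minimal sketch rather than a definitive recipe. It assumes the azure-cosmosdb-spark connector is on the classpath and that the imports match the ones in that project's README. The endpoint, key, database, and collection values are placeholders, the "Upsert" setting is optional, and the small `manipulatedData` DataFrame is a hypothetical stand-in for the one in the question.

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}
import com.microsoft.azure.cosmosdb.spark.schema._
import com.microsoft.azure.cosmosdb.spark._
import com.microsoft.azure.cosmosdb.spark.config.Config

// In spark-shell or a notebook `spark` already exists; getOrCreate reuses it.
val spark = SparkSession.builder().appName("cosmos-write-check").getOrCreate()

// Placeholder values (assumptions): substitute your own account details.
// The key must be a read-write key, not a read-only one.
val writeConfig = Config(Map(
  "Endpoint"   -> "https://<your-account>.documents.azure.com:443/",
  "Masterkey"  -> "<your-read-write-key>",
  "Database"   -> "<your-database>",
  "Collection" -> "<your-target-collection>",
  "Upsert"     -> "true"
))

// Hypothetical stand-in for the DataFrame produced earlier in the job.
val manipulatedData = spark.range(5)
  .selectExpr("cast(id as string) as id", "-999 as delay")

// Write the DataFrame directly; no toJSON conversion is needed.
manipulatedData.write.mode(SaveMode.Append).cosmosDB(writeConfig)

// Read the same collection back and count the documents as a sanity check.
val written = spark.read.cosmosDB(writeConfig)
println(s"Documents in target collection: ${written.count()}")
```

If the count does not change after the write, the key and the collection name in the config are the first things to check.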
