I'm wondering what the most efficient way would be to delete large numbers of rows from PostgreSQL. This would be part of a recurring daily task that bulk-imports data (a delta of insertions and deletions) into a table. There could be thousands, potentially millions, of rows to delete.
I have a file of primary keys, one per line. The two options I was considering are along the lines of the below, but I don't know enough about PostgreSQL's internals to make an informed decision about which would be best.
- Execute a `DELETE` query for each row in the file, with a simple `WHERE` on the primary key (or group the deletes in batches of *n* using an `IN ()` clause)
- Import the primary keys into a temporary table using the `COPY` command, then delete from the main table using a join
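
For concreteness, here is a rough sketch of what I had in mind for the second option. The table name `items`, key column `id`, and the file path are all placeholders:

```sql
BEGIN;

-- Temp table to hold the keys; dropped automatically at commit.
CREATE TEMP TABLE keys_to_delete (id bigint PRIMARY KEY) ON COMMIT DROP;

-- Bulk-load the key file (one primary key per line).
COPY keys_to_delete (id) FROM '/tmp/keys.txt';

-- Temp tables aren't analyzed automatically, so give the planner
-- up-to-date statistics before the join.
ANALYZE keys_to_delete;

-- Delete via a join using PostgreSQL's DELETE ... USING syntax.
DELETE FROM items
USING keys_to_delete k
WHERE items.id = k.id;

COMMIT;
```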
Any suggestions will be much appreciated!