Push RedShift table to S3 by doing some aggregation as CSV

Question

I have been looking for the best way to programmatically export a Redshift table (the table needs to be aggregated first) to S3.

What would be the best solution? For Athena to S3 I found the article below, but I could not find any information on doing the same from Redshift to S3.

https://www.datastackpros.com/2020/07/export-athena-view-as-csv-to-aws-s3.html

It would be a daily ingestion, and the CSV file should be overwritten each time.

Thanks

Bill Weiner · Accepted Answer · 2022-08-24 15:53:46Z

There are 2 ways that come to mind right away - UNLOAD and CREATE EXTERNAL TABLE. Each has its pros and cons. Your use case isn't completely clear as to what you need the resulting file(s) to look like but let me take a guess.

I expect you need a single CSV file (with or without a header row?) for other tools to read / use. In this case I'd use UNLOAD with PARALLEL OFF to save the result of the query to S3. This will produce 1 file in S3 ONLY IF the resulting size is below the maximum file size for UNLOAD (6.2 GB by default); larger results are split across multiple files.
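As a rough sketch of what that UNLOAD could look like, here is a small Python helper that only builds the statement text. The table and column names, the bucket path, and the IAM role ARN are all made up for illustration, and the helper name `build_unload_sql` is hypothetical; swap in your own aggregation query and credentials. ALLOWOVERWRITE is included on the assumption that each daily run should replace the previous file.

```python
def build_unload_sql(query: str, s3_prefix: str, iam_role_arn: str,
                     header: bool = True) -> str:
    """Return an UNLOAD statement that writes the query result to S3 as CSV."""
    # UNLOAD takes the query as a quoted string, so single quotes inside
    # the query have to be doubled.
    escaped_query = query.replace("'", "''")

    options = ["FORMAT AS CSV"]
    if header:
        options.append("HEADER")       # write column names as the first row
    options.append("PARALLEL OFF")     # write serially instead of one file per slice
    options.append("ALLOWOVERWRITE")   # replace files left by the previous run

    return (
        f"UNLOAD ('{escaped_query}')\n"
        f"TO '{s3_prefix}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        + "\n".join(options)
        + ";"
    )


# Hypothetical aggregation query, bucket, and role, for illustration only.
sql = build_unload_sql(
    query=(
        "SELECT region, SUM(amount) AS total "
        "FROM sales WHERE status = 'paid' GROUP BY region"
    ),
    s3_prefix="s3://example-bucket/exports/sales_summary_",
    iam_role_arn="arn:aws:iam::123456789012:role/ExampleRedshiftUnloadRole",
)
print(sql)
```

The generated statement can then be run through whatever client you already use to connect to Redshift, and rerun once a day from your scheduler of choice. Note that the TO value is a prefix rather than a full file name: Redshift appends a part number to it, so expect an object named something like `sales_summary_000` instead of exactly the name you typed.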

Thanks for your answer. I understand that UNLOAD would be the best approach. Could you advise whether the file is updated every time the table is updated, or whether the export needs to be run manually? What would be the best approach to automate it? Thanks
It would be manual. The S3 file would be an extract from the database and would not change until another UNLOAD overwrites it.
