
I'm using the following UNLOAD command:

unload ('select * from '')to 's3://**summary.csv**' CREDENTIALS 'aws_access_key_id='';aws_secret_access_key=''' parallel off allowoverwrite CSV HEADER; 

The file created in S3 is summary.csv000

If I remove the file extension from the command, like below:

unload ('select * from '')to 's3://**summary**' CREDENTIALS 'aws_access_key_id='';aws_secret_access_key=''' parallel off allowoverwrite CSV HEADER; 

The file created in S3 is summary000

Is there a way to get summary.csv, so I don't have to change the file extension before importing it into Excel?

Thanks.

5 Answers


It's a bit late, but they recently added an EXTENSION parameter that solves this problem:

1.0.45698 – Released on January 20, 2023. Release notes for this version: Adds a file extension parameter to the UNLOAD command, so file extensions are automatically added to filenames.



Actually, a lot of folks have asked a similar question; right now it's not possible to add an extension to the files (although Parquet files can have one).

The reason behind this is that Redshift exports in parallel by default, which is a good thing: each slice exports its own data. From the docs:

PARALLEL

By default, UNLOAD writes data in parallel to multiple files, according to the number of slices in the cluster. The default option is ON or TRUE. If PARALLEL is OFF or FALSE, UNLOAD writes to one or more data files serially, sorted absolutely according to the ORDER BY clause, if one is used. The maximum size for a data file is 6.2 GB. So, for example, if you unload 13.4 GB of data, UNLOAD creates three files.

So it has to start a new file after 6.2 GB, which is why a number is added as a suffix.

How do we solve this?

There is no native option in Redshift, but we can work around it with a Lambda function.

  1. Create a new S3 bucket and a folder inside it specifically for this process (e.g. s3://unloadbucket/redshift-files/).
  2. Your unload files should go to this folder.
  3. A Lambda function should be triggered by the S3 put-object event on that folder.
  4. The Lambda function then does the following (see the sketch after this list):
    1. Download the file (if it is large, use EFS)
    2. Rename it with .csv
    3. Upload it to the same bucket (or a different bucket) under a different path (e.g. s3://unloadbucket/csvfiles/)
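
A minimal sketch of such a Lambda handler in Python with boto3 might look like the following. The bucket and prefix names are just the placeholders from the list above, and instead of downloading, renaming and re-uploading, it uses a server-side copy_object to write the renamed copy, which avoids pulling large files through the Lambda:

    import urllib.parse

    import boto3

    s3 = boto3.client("s3")

    def handler(event, context):
        # Fired by the S3 put-object event on the unload prefix
        # (e.g. s3://unloadbucket/redshift-files/ -- placeholder names).
        for record in event["Records"]:
            bucket = record["s3"]["bucket"]["name"]
            key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])

            # Drop the numeric slice suffix UNLOAD appends
            # (e.g. "summary000" -> "summary") and make sure the name ends in .csv.
            name = key.rsplit("/", 1)[-1].rstrip("0123456789")
            if not name.endswith(".csv"):
                name += ".csv"

            # Write the renamed copy under a different prefix; S3 has no real
            # rename, so a server-side copy is the closest equivalent.
            s3.copy_object(
                Bucket=bucket,
                Key="csvfiles/" + name,
                CopySource={"Bucket": bucket, "Key": key},
            )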

Or, even simpler, use a shell/PowerShell script to do the following (see the sketch after the list):

  1. Download the file
  2. Rename it with .csv
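
If you'd rather script those two steps in Python than in shell/PowerShell, a minimal boto3 sketch might be (bucket name and key are placeholders):

    import os

    import boto3

    s3 = boto3.client("s3")

    # Placeholders -- point these at wherever your UNLOAD wrote the file.
    bucket = "unloadbucket"
    unloaded_key = "redshift-files/summary000"

    # 1. Download the file.
    s3.download_file(bucket, unloaded_key, "summary000")

    # 2. Rename it with .csv so Excel opens it directly.
    os.rename("summary000", "summary.csv")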

3 Comments

Still not working in the simplest way?
Unfortunately, it's still not possible with the UNLOAD command @bigdataadd
This answer is no longer correct; see the answer -> stackoverflow.com/a/75617611/6227500

As per the AWS documentation for the UNLOAD command, it's possible to save data as CSV.

In your case, this is what your code would look like:

unload ('select * from '') to 's3://summary/' CREDENTIALS 'aws_access_key_id='';aws_secret_access_key=''' parallel off allowoverwrite CSV HEADER; 



Add EXTENSION 'csv' to the command.

unload ('select * from '')to 's3://**summary.csv**' CREDENTIALS 'aws_access_key_id='';aws_secret_access_key=''' DELIMITER ',' HEADER EXTENSION 'csv' ALLOWOVERWRITE PARALLEL OFF; 

