
I've been trying to run a consolidation query in AWS Redshift (node type: ra3.4xlarge) where the query is around 6k characters long (I know, pretty huge!).

Now this query fails with the below error.

    psycopg2.errors.InternalError_: Value too long for character type
    DETAIL:
    -----------------------------------------------
    error:    Value too long for character type
    code:     8001
    context:  Value too long for type character varying(1)
    query:    388111
    location: string.cpp:175
    process:  query0_251_388111 [pid=13360]
    -----------------------------------------------

On further digging, I found that the stl_query table logs every query run on the cluster, and it has a 4k character limit on its querytxt column, which I assumed was causing the entire query to fail.

 View "pg_catalog.stl_query" Column | Type | Modifiers ----------------------------+-----------------------------+----------- userid | integer | query | integer | label | character(320) | xid | bigint | pid | integer | database | character(32) | querytxt | character(4000) | starttime | timestamp without time zone | endtime | timestamp without time zone | aborted | integer | insert_pristine | integer | concurrency_scaling_status | integer | 

So, the question here is (apart from reducing the query length): is there any workaround for this situation? Or am I deducing this whole thing wrong?

1 Answer


You are off track in your analysis - I've worked on many Redshift queries over 4K in size (unfortunately). Unless they broke something in a recent release, this isn't your issue.

First off, stl_query only stores the first 4K characters of the query - if you want to see entire long queries you need to look at the stl_querytext table. Be aware that stl_querytext breaks the query into 200-character chunks, which you need to reassemble using the sequence values, as in the sketch below.
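For example, something like this should stitch the chunks back together (using the query id 388111 from your error output; adjust the filter as needed):

    -- Reassemble the full SQL text of one query from its 200-character chunks
    SELECT query,
           LISTAGG(text) WITHIN GROUP (ORDER BY sequence) AS full_sql
    FROM stl_querytext
    WHERE query = 388111
    GROUP BY query;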

Now, your error is pointing at a varchar(1), a one-byte varchar, as the source of the issue. I would look at your query and tables to find varchar(1) sized columns and see what is being put into them. You will likely need to understand what length string is being inserted, and why, to find your solution.
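If you don't know offhand which columns are declared varchar(1), a quick scan of the svv_columns catalog view (filtered down to your own schemas) should list the candidates:

    -- Find all varchar(1) columns outside the system schemas
    SELECT table_schema, table_name, column_name
    FROM svv_columns
    WHERE data_type = 'character varying'
      AND character_maximum_length = 1
      AND table_schema NOT IN ('pg_catalog', 'information_schema');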

Remember that in Redshift, non-ascii characters are stored in more than one byte, and varchar(1) is a one-byte column. So if you try to insert a single non-ascii character into this column it will not fit. Two functions could be of use if the issue is related to non-ascii characters (see the example after this list):

  • length() - find the length of a string in CHARACTERS
  • octet_length() - find the length of a string in BYTES
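For instance, a check along these lines will surface the offending rows - my_table and my_col here are hypothetical stand-ins for whatever source table and column feed the varchar(1) target in your query:

    -- Compare character count vs. byte count; any row returned cannot fit in varchar(1)
    SELECT my_col,
           LENGTH(my_col)       AS char_len,   -- length in characters
           OCTET_LENGTH(my_col) AS byte_len    -- length in bytes
    FROM my_table
    WHERE OCTET_LENGTH(my_col) > 1;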

If you still have trouble finding the issue you may want to post the DDL and SQL you are using so you can get more specific recommendations.


4 Comments

Thanks Bill. Let me look a little more into the base query. Will get back to you.
@Bill just curious: in Redshift, the data type specifies the size of the data (e.g. name varchar(10) means the name column can hold up to 10 bytes). Then how is it possible to insert more than 1 byte (as per the question), and why does the error only show up when selecting the data?
A VARCHAR(10) can hold at most 10 ascii characters, but fewer characters if some are multi-byte. A non-ascii character can require 2, 3, or 4 bytes to store. There's no way that a single character can fill a varchar(10), but 3 possibly could. (2-byte characters are fairly common but 4-byte ones are rare.) Does this info help?
Thanks @BillWeiner for the right track. You were right.
