Timeline for Will an index on a uuid column be smaller in Postgres than MySQL / InnoDB?
Current License: CC BY-SA 4.0
14 events
| when toggle format | what | by | license | comment | |
|---|---|---|---|---|---|
| Nov 2, 2022 at 16:25 | comment | added | Rick James | Keep in mind that you pay 20 bytes for every index that includes it. In InnoDB, if the PRIMARY KEY is a UUID, then every secondary index has the same 20-byte cost. | |
| Nov 2, 2022 at 16:22 | comment | added | Rick James | "more optimization involved than "just" the 20 bytes" - InnoDB has a lot of overhead in its BTrees. 2x-3x is a useful Rule of Thumb. | |
| Nov 2, 2022 at 16:19 | comment | added | Rick James | MariaDB 10.7 also has a UUID datatype with builtin conversion. | |
| Nov 2, 2022 at 9:45 | comment | added | Akina | @sbrattla You ask about the size of separate secondary index, is it? if so then the size of the index by textual UUID is ~twice while comparing with the index by the above expression. I.e. replacing the index by immediate value with the index by an expression will decrease total disk size occupied by the index. | |
| Nov 2, 2022 at 9:41 | comment | added | sbrattla | @Akina I assume an expression index on UUID_TO_BIN(column) will be a secondary index, and not reduce the overall size of indexes. | |
| Nov 2, 2022 at 9:23 | comment | added | Akina | Oops... I loose that the question asks about the index.. sorry. For an index - in MySQL the index by an expression (UUID_TO_BIN(column)) may be created. Or by generated column with this expression. In last option the column may be virtual, and it does not occupy the space in the table body. | |
| Nov 2, 2022 at 9:22 | comment | added | user1822 | @Akina: the "row size" for a single column index is always the same regardless of the size of the table - it's the size of that column. The question is about the size of the index, not the size of the table. The size of the table might be smaller as well, but that is more complicated to measure because of the other columns (padding) as you correctly point out. | |
| Nov 2, 2022 at 9:17 | comment | added | Akina | You does not take into account the granularity. For example, if average row size is 1 kb and block size is 16 kb then the block may contain up to 14 records. If the row size decreases by 20 bytes then the amount of records per block won't change.. | |
| Nov 2, 2022 at 8:28 | comment | added | sbrattla | @a_horse_with_no_name thank's for doing an actual test! | |
| Nov 2, 2022 at 8:27 | comment | added | user1822 | @sbrattla: I just did a simple test and created an index on 50 million UUIDs - the index size is 1.5GB | |
| Nov 2, 2022 at 8:27 | history | edited | user1822 | CC BY-SA 4.0 | added 458 characters in body |
| Nov 2, 2022 at 8:27 | comment | added | sbrattla | @LaurenzAlbe for my particular case, going from char to binary would require a lot of application logic to be changed to convert back and forth to binary. The reason I'm looking at Postgres is that it has the uuid data type, and the move from MySQL to Postgres would require few application changes. | |
| Nov 2, 2022 at 8:10 | comment | added | Laurenz Albe | I'd use a varbinary on MySQL, which should result in a roughly similar index. | |
| Nov 2, 2022 at 8:08 | history | answered | user1822 | CC BY-SA 4.0 |