Timeline for Are there existing term(s) for a 1-1 child-parent table anti-pattern?
Current License: CC BY-SA 4.0
26 events
| when toggle format | what | by | license | comment | |
|---|---|---|---|---|---|
| Jun 27, 2023 at 17:10 | comment | added | candied_orange | "My ugly car wont start" "You need to paint it" "How's that going to help?" "It wont be so ugly". | |
| Jun 6, 2023 at 15:21 | comment | added | JimmyJames | "There is arguably a redundancy in even storing weekday number for each date, since weekday can be algorithmically determined from the date." Don't get hung up on the example, it's not literally what they are doing but a very good approximation of the 'approach'. There's an instance of one of these tables that has a PK and an int column consisting of only positive single digit ints. There are hundreds of thousands of rows with the value 3. There's no reason that the child tables couldn't have used the value 3 directly instead holding the key (e.g. 955529). This is OLTP BTW. | |
| Jun 6, 2023 at 14:58 | comment | added | JimmyJames | This isn't helping performance in any way. It's objectively and empirically a major performance drag. Consider how this affects caching. On the client there is a huge cache (high memory use) which accomplishes almost nothing for these tables (because no rows are reused across lookups). On the DB, it causes the LRU cache to constantly churn. | |
| Jun 6, 2023 at 14:54 | comment | added | JimmyJames | "Unless other tables in the database have a foreign key to the second table, the partitioning does not seem useful." There are foreign keys to these tables from other tables. Some of these have a dozen or more tables referencing them but AFAICT, they never share any rows. That is, if there were twelve tables using the same data values (e.g. Saturday of week), there are separate rows each usage from each table i.e. child A and child B never have an FK to the same row. | |
| Jun 4, 2023 at 17:14 | history | edited | JacquesB | CC BY-SA 4.0 | added 76 characters in body |
| Jun 4, 2023 at 16:39 | history | edited | JacquesB | CC BY-SA 4.0 | added 313 characters in body |
| Jun 4, 2023 at 12:03 | history | edited | JacquesB | CC BY-SA 4.0 | added 1395 characters in body |
| Jun 2, 2023 at 15:43 | comment | added | JacquesB | Let us continue this discussion in chat. | |
| Jun 2, 2023 at 15:25 | comment | added | JimmyJames | I don't thing you are following. Thanks for trying. | |
| Jun 1, 2023 at 22:09 | comment | added | JacquesB | @JimmyJames if the two tables are 1:1 then it is not a lookup table, it is a vertical partition. Lookup tables and vertical portions are independent concerns - you can have both, a vertical partition and a lookup table for the unique values. Or you can have either, or none. Perhaps you think the vertical partition was supposed to be a lookup table but just badly designed? This is of course possible but probably more likely it is deliberate use of vertical partitioning. | |
| Jun 1, 2023 at 18:19 | comment | added | JimmyJames | I don't know what else to tell you to get you to understand that the 700K rows are in the lookup table. | |
| Jun 1, 2023 at 17:53 | comment | added | JimmyJames | So, you think it would be normal to have 700K rows in a days of the week table with only 7 distinct rows (aside from the primary key)? Is that how you normalize data? | |
| Jun 1, 2023 at 17:50 | comment | added | JacquesB | @JimmyJames So the be clear, it is not a question of either 700 K values or just 7 values. If you need to add information to 700 K entities, you need a column of 700 K rows either way, regardless of the size of the data type, The 7 rows in a lookup table would be in addition to that. | |
| Jun 1, 2023 at 17:42 | comment | added | JacquesB | @JimmyJames If you need to associate a day-of-the-week with 700K entities (and there are no non-key functional dependencies otherwise), then you would need a column with 700K values. If you add a lookup table for the seven possible values, then you would need 700K + 7 rows in all. | |
| Jun 1, 2023 at 17:36 | comment | added | JimmyJames | So 7 rows, not 700K? | |
| Jun 1, 2023 at 17:29 | comment | added | JacquesB | @JimmyJames Let's say you have a table with a million persons, a row for each, with a column indicating the day of the week they were born. OK so you have lots of duplicates in this row, since there are only seven days of the week. Now you extract the day of the week to a separate table with seven rows, and instead use the column as foreign key to this new table. You still have exactly the same number of duplicates in that column. | |
| Jun 1, 2023 at 17:17 | comment | added | JimmyJames | I'm sorry but have you actually read the question? "Duplicate values are not problematic as long as the data is otherwise normalized." OK, so if you normalized some columns into a reference table, how many duplicate values would you expect in the reference table? For example, let's say I have a 'days of the week' table, how many rows should it have? | |
| Jun 1, 2023 at 17:03 | comment | added | JacquesB | @JimmyJames OK, if I understand correctly, you are not really concerned about the vertical partitioning of the table into multiple 1:1 tables, you are concerned about the duplicate values in some columns? Duplicate values are not problematic as long as the data is otherwise normalized. If you changed the duplicate values to foreign keys to a smaller lookup table you would just get duplicate foreign keys in the main table. | |
| Jun 1, 2023 at 16:52 | comment | added | JimmyJames | I am nof conflating the issue, you are referring to something irrelevant to the situation at hand. "But if you are talking about extracting duplicate values in a column to a lookup table, that would just introduce a join." Right, which is what this is all about. | |
| Jun 1, 2023 at 16:49 | comment | added | JacquesB | @JimmyJames: I doubt it would be faster. But I think you are conflating different issues with this question. Vertical partitioning does not increase or decrease the number of rows, it increases the number of tables. But if you are talking about extracting duplicate values in a column to a lookup table, that would just introduce a join. E.g. imagine you have a boolean column and then extract the two values to lookup-table with two rows - you would still need a FK to the lookup-table for each row in the main table. This would introduce a join you may not need. | |
| Jun 1, 2023 at 16:29 | comment | added | JimmyJames | Can you address the question I asked? E.g.: How is joining a thousand rows to a thousand rows going to be better/faster than joining 1000 rows to 6 rows? The results will be identical. | |
| Jun 1, 2023 at 15:58 | history | edited | JacquesB | CC BY-SA 4.0 | added 99 characters in body |
| Jun 1, 2023 at 15:26 | comment | added | JacquesB | @JimmyJames: It is called vertical partitioning or row splitting or 6th normal form, if you want to look further into it. It can in some cases increase performance e.g if queries often scan and select from some columns significantly much more than others. But it very much depends on the particular database and workload if there is any benefit. | |
| Jun 1, 2023 at 13:41 | comment | added | JimmyJames | Explain to me how having hundreds of millions of rows with the exact same data values improves performance for joins. Consider indexes and inserts. Why would joining the child rows to a single parent row be less performant? | |
| Jun 1, 2023 at 6:21 | history | edited | JacquesB | CC BY-SA 4.0 | added 289 characters in body |
| Jun 1, 2023 at 6:14 | history | answered | JacquesB | CC BY-SA 4.0 |