Revisions to Performance tuning of SQL Server query, with LEFT OUTER JOIN vs INNER JOIN

Clarified the role of unpivot.

edited Sep 18, 2023 at 15:05

The condition WHERE [TVALOR] IS NOT NULL AND [TVALOR] != '' in the unpivoted table is much more selective than SQL Server estimates.

In the fast plan, it estimates that it will select 216056 rows out of 423640, but it actually produces only 11944 rows.
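If you want to check the estimated and actual row counts yourself, you can capture the actual execution plan. This is only a minimal sketch: the SELECT is a placeholder for the query under investigation, not the original query.

```sql
-- Return the actual execution plan (as XML) together with the results.
SET STATISTICS XML ON;

-- Placeholder: replace this statement with the query you are tuning.
SELECT 1 AS placeholder;

SET STATISTICS XML OFF;
```

In the returned plan, compare the estimated number of rows with the actual number of rows on the operator that applies the TVALOR filter. A large gap between the two is the kind of misestimate described above.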

With the LEFT JOINs, SQL Server starts from ExchangeArticleCode_CharacteristicCode, which is the left table in the LEFT JOIN, producing 11944 rows. It then matches them to the other big tables, but that doesn't increase the number of rows, because apparently those rows don't match more than one article or characteristic.

When you switch to INNER JOIN (or add a NOT NULL check, which tells the optimizer it can treat the LEFT JOIN as an INNER JOIN), SQL Server is free to reorder the joins into the order it estimates to be most efficient.
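To illustrate why a NOT NULL check lets the optimizer treat a LEFT JOIN as an INNER JOIN, here is a small sketch. The temp tables #Parent and #Child and their columns are hypothetical and have nothing to do with the tables in the question.

```sql
-- Hypothetical tables, for illustration only.
CREATE TABLE #Parent (Id int PRIMARY KEY);
CREATE TABLE #Child  (Id int PRIMARY KEY, ParentId int NOT NULL, Val varchar(10) NULL);

INSERT INTO #Parent (Id) VALUES (1), (2), (3);
INSERT INTO #Child (Id, ParentId, Val) VALUES (10, 1, 'a'), (11, 1, 'b'), (12, 3, NULL);

-- Plain LEFT JOIN: parent 2 has no child, so it is kept with NULLs (4 rows).
SELECT p.Id, c.Val
FROM #Parent AS p
LEFT JOIN #Child AS c ON c.ParentId = p.Id;

-- The WHERE clause rejects the NULL-extended rows, so the unmatched parent
-- disappears and the result is the same as the INNER JOIN below (3 rows).
SELECT p.Id, c.Val
FROM #Parent AS p
LEFT JOIN #Child AS c ON c.ParentId = p.Id
WHERE c.Id IS NOT NULL;

SELECT p.Id, c.Val
FROM #Parent AS p
INNER JOIN #Child AS c ON c.ParentId = p.Id;

DROP TABLE #Parent, #Child;
```

Because the last two queries are logically equivalent, the optimizer no longer has to preserve the outer side of the join and can pick a different join order.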

It assumes that unpivoting the ANART00F table will multiply the rows, so it postpones the unpivot to the very end.

It correctly assumes that the join between ANART00F and the Articles table will be efficient (it produces 21182 rows), but then it joins the result with the Characteristic table. Since the join condition requires the unpivoted columns, which are not available yet at that point, this is effectively a cross join (a join without conditions) and produces 115 million rows. The result is then unpivoted to become 2 billion rows. Only then does SQL Server apply the WHERE conditions, reducing those 2 billion rows down to 11934 (while SQL Server expected them to reduce the result set to 165 million).

To correct this, you can try to update statistics for all involved tables, but I suspect that the effect of the specific values you are selecting, combined with the unpivoting, cannot be estimated properly even with updated statistics.
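A minimal sketch of the statistics update, assuming the tables live in the dbo schema and that Articles and Characteristic are the actual object names (the answer only refers to them informally, so adjust the names to your database):

```sql
-- Assumption: dbo schema; the table names follow the wording of the answer
-- and may differ from the real object names.
UPDATE STATISTICS dbo.ANART00F WITH FULLSCAN;
UPDATE STATISTICS dbo.Articles WITH FULLSCAN;
UPDATE STATISTICS dbo.Characteristic WITH FULLSCAN;
```

FULLSCAN reads every row instead of sampling, which gives the most accurate statistics but takes longer on big tables.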

Another solution you can try is to use SET FORCEPLAN to force the query optimizer to join the tables in the order you specified:

    SET FORCEPLAN ON;
    -- your query here
    SET FORCEPLAN OFF;
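If you would rather not change a session setting, the FORCE ORDER query hint asks for the same behaviour for a single statement. This is an alternative I am adding as a sketch, not something tested against the query in the question; the temp tables #Filtered and #Big are hypothetical stand-ins.

```sql
-- Hypothetical tables standing in for the real ones.
CREATE TABLE #Filtered (Code int PRIMARY KEY);
CREATE TABLE #Big (Code int NOT NULL, Payload varchar(50) NULL);

-- FORCE ORDER asks the optimizer to keep the join order as written
-- in the FROM clause, for this statement only.
SELECT f.Code, b.Payload
FROM #Filtered AS f
INNER JOIN #Big AS b ON b.Code = f.Code
OPTION (FORCE ORDER);

DROP TABLE #Filtered, #Big;
```

With either approach, write the most selective source first in the FROM clause, since that is the order the joins will be processed in.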

Added SET FORCEPLAN suggestion

edited Sep 18, 2023 at 14:51

Andrea B.

The condition WHERE [TVALOR] IS NOT NULL AND [TVALOR] != '' is much more selective than SQL Server estimates.

In the fast plan, it estimates that it will select 216056 rows out of 423640, but it actually produces only 11944 rows.

With the LEFT JOINs, SQL Server starts from ExchangeArticleCode_CharacteristicCode, which is the left table in the LEFT JOIN, producing 11944 rows. It then matches them to the other big tables, but that doesn't increase the number of rows, because apparently those rows don't match more than one article or characteristic.

When you switch to INNER JOIN (or add a NOT NULL check, which tells the optimizer it can treat the LEFT JOIN as an INNER JOIN), SQL Server is free to reorder the joins into the order it estimates to be most efficient.

It correctly assumes that the join between ANART00F and the Articles table will be efficient (it produces 21182 rows), but then it joins the result with the Characteristic table and produces 115 million rows. Those are then joined with the unpivoted table to become 2 billion rows. Only then does SQL Server apply the WHERE conditions, reducing those 2 billion rows down to 11934 (where SQL Server expected them to reduce the result set to 165 million).

To correct this, you can try to update statistics for all involved tables, but I suspect that the specific values of the records you are trying to select and the unpivoting cannot be guessed properly even by updated statistics.

Another solution you can try is to use SET FORCEPLAN to force the query optimizer to join the tables in the order you specified:

    SET FORCEPLAN ON;
    -- your query here
    SET FORCEPLAN OFF;