How to fine tune a huge problematic Heap Table? [closed]

Question

Problem:

A huge table is causing to a critical performance issue on a very critically important production database (MS SQL Server 2016 SP1 Standard Edition).

Facts about the problematic Table :

Heap Table without index.
450 GB of Size.
More than 190 million records.
More than 900 MB/s Physical I/O activity every time a simple SELECT Query executes upon. Worst if the query includes WHERE clause.
Queries executed upon this table are causing 4-5 minutes delays on the application UI.

Suggested performance tuning steps with respect to suggested solution:

Truncate db table (agreed with customer to delete records older than 1 year).
- Don't Delete Records. Copy records created in last 1 year to a temp table. Truncate the original table, and then INSERT the records you have copied from the temp table to the emptied original table.
Make a de-fragmentation on the db table.
Migrate the db table from heap to clustered table.
Create a Clustered Index.
Create a maintenance job and schedule it to re-build the clustered index periodically.

My question to community:

How would you improve the performance of such a problematic table? What would you add to or remove from the suggested performance tuning steps?

Much appreciated and thanks a lot in advance.

Edited from here on:

OK, I didn't want to reply to all comments 1 by 1. So I'm editing the question and adding some more information.

1st : I am not an expert DBA. I would say an advanced beginner. That's why I am here to learn from the community as much as I can.

2nd : We just got this customer 2 weeks ago, and I have not got full access to the environment myself. This information (facts) about the problem, about the table is directly from the customer's sys admins.

3rd : There for, I have no idea why the table is Heap in the first hand, without an index, and they let it to grow this much without any partitioning and what so ever.

4th : Suggested Solution and Suggested Steps are not my own but directly from the manufacturer of the whole system (it's a well know Unified Communications System and the manufacturer is the best in this area. I guess you all know which one).

5th : I will hopefully have a full access to the environment next week and before that I just wanted to share what I got in hand and try to get as much opinion/suggestion as I could prior to going into this mess.

6th : Because unfortunately I am the one who is going to take this mess from here and try to make it better and smoother working mess.

Thanks to everyone who already spare some time and replied. I already got some insight. And I will definitely share more info as soon as I have more in hand.

Welcome to DBA.SE. It would help us help you if you could provide us with the DDL of the table involved. Just creating a clustered index, for the sake of creating a clustered-index could be counter-productive, depending on which columns you choose for the CI. As pointed out in the answer by Tibor there are still some open questions, as the steps you have listed are sometimes the same thing, Please edit your question and add more details. Thanks. — John K. N.
– John K. N., Commented Jan 25, 2021 at 14:07
I like Tibor and David's answers. I'd follow those. If you're really getting into query & performance tuning for the first time, I'd strongly suggest getting a copy of my book on query tuning. No link because I'm not trying to sell stuff here. I just want to help out. — Grant Fritchey
– Grant Fritchey, Commented Jan 26, 2021 at 12:13

David Browne - Microsoft · Accepted Answer · 2021-01-25 14:51:54Z

5

Based on those facts, 190M rows, 450GB, un-indexed heap, long-running queries, you should create a Clustered Columnstore index and store the data in compressed columnar format.

However, that might not be the optimal solution, as you still have a table with no key and no indexes, which is strange.

answered Jan 25, 2021 at 14:51

David Browne - Microsoft

49.2k3 gold badges53 silver badges102 bronze badges

Just curious what your thoughts are to why a Clustered Columnstore Index as opposed to Clustered Rowstore Index? I've found many great uses for columnstore indexes in my experience for improving performance of OLAP type queries on my rowstore tables, but without OP commenting on the structure of his data or the type of querying they do against it, is it still possible to determine when a columnstore index makes sense? (Not being sarcastic, seriously generally curious if I'm missing something about columnstore indexes that I could be using :).

J.D.
– J.D.

2021-01-25 20:07:40 +00:00
Commented Jan 25, 2021 at 20:07
It's unindexed, so it's being constantly scanned. Columnstores give you a tripple-whammy of compression, column elimination, and row group elimination.

David Browne - Microsoft
– David Browne - Microsoft

2021-01-25 20:44:38 +00:00
Commented Jan 25, 2021 at 20:44
1

Ah gotcha, going off the fact that he's getting table scans all day anyway (being indexless), one solution would be to minimize the amount of data being scanned based on the features of columnstore indexing (such as the compression), is what you're saying? If so, that makes sense to me, thanks!

J.D.
– J.D.

2021-01-25 20:52:39 +00:00
Commented Jan 25, 2021 at 20:52

Add a comment |

John K. N. · Accepted Answer · 2021-01-25 14:09:29Z

I'm a bit confused regarding your "suggested ... steps".

Some are the same things (migrate from heap to clustered table is the same thing as creating a clustered index).

It is unclear whether you have them as compliments or alternatives to each others.

Anyhow, you have basically two options:

ALTER TABLE ... REBUILD to get rid of all that empty space, which is probably the reason for your issues. I.e., keep it as a heap.
Transform the table to a clustered table, by creating a clustered index.

Most of us prefer to have clustered indexes on our tables unless you have a good reason not to. I.e., we can't say what is best for you. Is there a reason that your table is a heap? If so, what is the reason? How strong is it, compared to the disadvantages that you have discovered? Etc.

As for defragmentation of the clustered index: It isn't certain that this will benefit you. Many DBAs defragment "just because". If you can point to advantages of doing defrag (less "jumping around on disk" and potentially higher fullness of pages), then by all means.

In my opinion he just collected every suggestion from different sources and put them together, he see no difference between truncate and delete, clustered/heap. Even adding some indexes to this table will help him but how to explain it to someone who don't understand that EVERY query to this table (except for select top (N)) reads the whole table? — sepupic
– sepupic, Commented Jan 25, 2021 at 14:12

Francesco Mantovani · Accepted Answer · 2021-01-25 21:13:36Z

This is the approach I would take:

You are on SQL Server 2016, right? Enable Query Store, it will tell you how to approach the performance issue
This is a critical database in production, right? It must be in Always On availability group, right? Point all SELECT to the second replica while UPDATE, ALTER and INSERT will happen on the first replica. This will reduce at least the read problem
This table is too big, right? That's why table partitioning was invented! Divide your table by month, or maybe by year, or maybe by groups of 1 million rows based on your ID column, etc... you have to understand what your WHERE are looking for. You have it? So partition that table by your most wanted column.
What about THOSE two settings? I know I might sound banal but have you setup Cost Threshold for Parallelism and MAXDOP ?
Everyone want that table, right? Yeah but some queries are more important than others, so... Use Resource Governor and prioritise those queries that really need resources
190 Millions rows and 450Gb database? No one wants to admit that but this happens more often than it should. There are many companies out there with one table that takes 20-30% of the whole database. This is because when the database was created no one cared about what would have happened in 5-10 years. And now we have to clean all that mess. You have to sit down an project yourself in the future. What will be that database in 1, 3, 5, 10 years time? How to prevent problems? You need to reorganize your database architecture and make it future proof. Because it goes by itself that this database cannot go to the cloud.

...and one day you will have to go there, want it or not.

Stack Exchange Network

How to fine tune a huge problematic Heap Table? [closed]

Problem:

Facts about the problematic Table :

Suggested solution by the application/system manufacturer:

Suggested performance tuning steps with respect to suggested solution:

My question to community:

Edited from here on:

3 Answers 3

Hot Network Questions

How to fine tune a huge problematic Heap Table? [closed]

Problem:

Facts about the problematic Table :

Suggested solution by the application/system manufacturer:

Suggested performance tuning steps with respect to suggested solution:

My question to community:

Edited from here on:

3 Answers 3

Related

Hot Network Questions