A Shorter Answer:
You probably either have a long-running transaction running (index maintenance? a big batch delete or update?) or you are in the "default" (more below on what is meant by default) recovery model of full and have not taken a log backup (or aren't taking them frequently enough).
If it is a recovery model issue, the simple answer could be to switch to the simple recovery model if you do not need point-in-time recovery and regular log backups. Many people, though, make that their answer without understanding recovery models. Read on to understand why it matters and then decide what you do. You could also just start taking log backups and stay with the full recovery model.
There could be other reasons, but these are the most common. This answer dives into the two most common reasons, gives you some background on the why and how behind them, and explores some other reasons as well.
A Longer Answer:

What scenarios can cause the log to keep growing?

There are many reasons, but usually the reason is one of the following two patterns: there is a misunderstanding about recovery models, or there are long-running transactions. Read on for details.
Top reason 1/2: Not Understanding Recovery Models
(In the full recovery model and not taking log backups - This is the most common reason)
While this answer is not a deep dive into SQL Server recovery models, the topic of recovery models is critical to this problem.
In SQL Server, there are three recovery models: full, bulk-logged and simple.
We'll ignore bulk-logged for now. We'll sort of say it is a hybrid model, and most people who are in this model are there for a reason and understand recovery models.
The two we care about are simple and full; confusion between these two is the cause of the majority of the cases of people having this issue.
Intermission: Recovery in General
Before we talk about recovery models, let's talk about recovery in general. If you want to go even deeper with this topic, just read Paul Randal's blog and as many posts on it as you want. For this question, though:
Crash/Restart Recovery
One purpose of the transaction log file is crash/restart recovery: rolling forward work that was done (redo) before a crash or restart, and rolling back work that was started but not finished (undo) when the crash or restart happened.
It is the job of the transaction log to see that a transaction started but never finished (it was rolled back, or a crash/restart happened before the transaction committed). In that situation, it is the log's job to say "Hey... this never really finished, let's roll it back" during recovery. It is also the log's job to see that you did finish something and that your client application was told it was finished (even if it hadn't yet hardened to your data file) and say "Hey... this really happened, let's roll it forward, let's make it like the applications think it was" after a restart. There is more, but that is the main purpose.
Point in Time Recovery
The other purpose of a transaction log file is to give us the ability to recover to a point in time due to an "oops" in a database, or to guarantee a recovery point in the event of a hardware failure involving the data and/or log files of a database.
If this transaction log contains the records of transactions that have been started and finished for recovery, SQL Server can and does then use this information to get a database to where it was before an issue happened. But that isn't always an available option for us. For that to work we have to have our database in the right recovery model, and we have to take log backups.
Recovery Models
Onto the recovery models:
Simple recovery model
With the above introduction, it is easiest to talk about the simple recovery model first. In this model, you are telling SQL Server: "I am fine with you using your transaction log file for crash and restart recovery..." (You really have no choice there. Look up ACID properties and that should make sense quickly.) "...but once you no longer need it for that crash/restart recovery purpose, go ahead and reuse the log file."
SQL Server listens to this request in simple recovery and only keeps the information it needs to do crash/restart recovery. Once SQL Server is sure it can recover because data is hardened to the data file (more or less), the data that has been hardened is no longer necessary in the log and is marked for truncation - which means it gets re-used.
Full recovery model
With full recovery, you are telling SQL Server that you want to be able to recover to a specific point in time, as long as your log file is available or to a specific point in time that is covered by a log backup.
In this case, when SQL Server reaches the point where it would be safe to truncate the log file under the simple recovery model, it will not do that. Instead, it lets the log file continue to grow and will allow it to keep growing, until you take a log backup (or run out of space on your log file drive) under normal circumstances.
Switching from simple to full has a gotcha.
There are rules and exceptions here. We'll talk about long-running transactions in depth below.
One caveat to keep in mind for the full recovery model is this: If you just switch into full recovery but never take an initial full backup, SQL Server will not honor your request to use the full recovery model. Your transaction log will continue to operate as it has in simple recovery until you switch to full recovery AND take your first full backup.
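The sequence matters here. A minimal sketch of the switch, with a placeholder database name and backup path:

```sql
-- Switch the database to the full recovery model.
ALTER DATABASE [YourDatabase] SET RECOVERY FULL;

-- Until this first full backup completes, the log still behaves as it
-- did under simple recovery (sometimes called "pseudo-simple"), and
-- log backups taken before this point cannot form a recovery chain.
BACKUP DATABASE [YourDatabase]
TO DISK = N'X:\Backups\YourDatabase_full.bak';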
Full Recovery Model without log backups is bad
So, what is the most common reason for uncontrolled log growth? Answer: Using the full recovery model without having any log backups.
Why is this such a common mistake?
Why does it happen all the time? Because each new database gets its initial recovery model setting by looking at the model database.

Model's initial recovery model setting is always full - until and unless someone changes that. So, you could say the "default recovery model" is full. Many people are not aware of this and have their databases running under full recovery with no log backups, and therefore a transaction log file much larger than necessary. This is why it is important to change defaults when they don't work for your organization and its needs.
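As a quick check for this situation, here is a sketch of a query (assuming msdb's backup history has not been purged) that lists full-recovery databases and their most recent log backup:

```sql
SELECT d.name,
       d.recovery_model_desc,
       MAX(b.backup_finish_date) AS last_log_backup  -- NULL = never
FROM sys.databases AS d
LEFT JOIN msdb.dbo.backupset AS b
       ON b.database_name = d.name
      AND b.type = 'L'   -- 'L' marks transaction log backups
WHERE d.recovery_model_desc = 'FULL'
GROUP BY d.name, d.recovery_model_desc;
```

A NULL last_log_backup for a full-recovery database is exactly the problem described above.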
Full recovery model with too few log backups is bad
You can also get yourself in trouble here by not taking log backups frequently enough. Taking a log backup a day may sound fine, since it makes a restore require fewer restore commands, but keeping in mind the discussion above, that log file will continue to grow and grow until you take log backups.
How do I find out what log backup frequency I need?
You need to consider your log backup frequency with two things in mind:
Recovery needs
This should hopefully be first. In the event that the drive housing your transaction log fails, or you get serious corruption that affects your log backup, how much data can be lost? If that number is no more than 10-15 minutes, then you need to be taking the log backup every 10-15 minutes, end of discussion.
Log Growth
If your organization is fine with losing more data because of the ability to easily recreate that day, you may be fine having a log backup much less frequently than every 15 minutes.
Maybe your organization is fine with every 4 hours. But you have to look at how many transactions you generate in 4 hours. Will allowing the log to keep growing in those four hours make too large of a log file? Will that mean your log backups take too long?
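Whatever frequency you land on, the log backup itself is a single statement that you schedule (SQL Server Agent is the usual vehicle). Database name and path here are placeholders:

```sql
-- Each log backup marks the inactive portion of the log as reusable,
-- so the space is recycled instead of the file growing.
BACKUP LOG [YourDatabase]
TO DISK = N'X:\Backups\YourDatabase_log.trn'
WITH CHECKSUM;
-- In a real job you would typically generate a unique file name per
-- backup (e.g., with a timestamp) rather than appending everything
-- to one backup file, which is the default behavior.
```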
Top reason 2/2: Long-running transactions
("My recovery model is fine! The log is still growing!")
This can also be a cause of uncontrolled and unrestrained log growth, no matter the recovery model, but it often comes up as "But I'm in simple recovery - why is my log still growing?!"
The reason here is simple: if SQL Server is using the transaction log for recovery purposes as I described above, then it has to see back to the start of a transaction.
This means that a big delete, deleting millions of rows in one delete statement, is one transaction, and the log cannot do any truncating until that whole delete is done. In full recovery, this delete is logged and that could be a lot of log records. Same thing with index optimization work during maintenance windows. It also means that poor transaction management and not watching for and closing open transactions can really hurt you and your log file.
What can I do about these long running transactions?
You can save yourself here by:
Properly sizing your log file to account for the worst-case scenario - like your maintenance or known large operations. And when you grow your log file you should look to this guidance (and the two links she sends you to) by Kimberly Tripp. Right sizing is super critical here.
Watching your usage of transactions. Don't start a transaction in your application server and start having long conversations with SQL Server and risk leaving one open too long.
Watching the auto-commit transactions in your DML statements.
For example: UPDATE TableName Set Col1 = 'New Value' is a transaction. I didn't put a BEGIN TRAN there and I don't have to, it is still one transaction that just automatically commits when done. When doing operations on large numbers of rows, consider batching those operations up into more manageable chunks and giving the log time to recover. Or consider the right size to deal with that. Or perhaps look into changing recovery models during a bulk load window.
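Breaking the one big statement into batches can be sketched like this (table and column names are illustrative):

```sql
-- Delete in chunks so each iteration is its own transaction; between
-- iterations the log can truncate (simple recovery) or be backed up
-- (full recovery) instead of growing unchecked.
DECLARE @rows int = 1;
WHILE @rows > 0
BEGIN
    DELETE TOP (5000) FROM dbo.TableName
    WHERE Col1 = 'Old Value';

    SET @rows = @@ROWCOUNT;  -- 0 when nothing is left to delete
END;
```

The batch size of 5000 is just a starting point; tune it against your own log throughput.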
Do these two reasons also apply to log shipping?
Short answer: Yes. Longer answer below.
What is Log Shipping?
Log shipping is just what it sounds like - you are shipping your transaction log backups to another server for DR purposes. There is some initialization, but after that the process is fairly simple: a job backs up the log on the primary, a job copies the backup to the secondary, and a job restores it there.
In some cases, you may only want to do the log shipping restore once a day or every third day or once a week. That is fine. But if you make this change on all of the jobs (including the log backup and copy jobs) that means you are waiting all that time to take a log backup. That means you will have a lot of log growth -- because you are in the full recovery model without log backups -- and it probably also means a large log file to copy across. You should only modify the restore job's schedule and let the log backups and copies happen on a more frequent basis, otherwise you will suffer from the first issue described in this answer.
General troubleshooting via status codes
By querying the sys.databases catalog view, you can see information describing the reason your log file may be waiting on truncate/reuse.
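The columns to look at are log_reuse_wait (the numeric code) and log_reuse_wait_desc (the human-readable reason):

```sql
SELECT name,
       recovery_model_desc,
       log_reuse_wait,        -- numeric code, listed below
       log_reuse_wait_desc    -- human-readable reason
FROM sys.databases;
```

The codes map to the reasons below.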
0 = Nothing
What it sounds like. The log shouldn't be waiting on anything.
1 = Checkpoint
Waiting for a checkpoint to occur. This should happen and you should be fine - but there are some cases to look for here for later answers or edits.
2 = Log backup
You are waiting for a log backup to occur. Either you have them scheduled and it will happen soon, or you have the first problem described here and you now know how to fix it.
3 = Active backup or restore
A backup or restore operation is running on the database.
4 = Active transaction
There is an active transaction that needs to complete (either way - ROLLBACK or COMMIT) before the log can be backed up. This is the second reason described in this answer.
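When you see this, one quick way to find the culprit (a sketch; run it in the affected database) is:

```sql
-- Reports the oldest active transaction in the current database,
-- including its start time and session ID (SPID).
DBCC OPENTRAN;
```

From there you can decide whether to wait it out, get it committed, or kill the offending session.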
5 = Database mirroring
Either a mirror is getting behind or is under some latency in a high-performance mirroring situation, or mirroring is paused for some reason.
6 = Replication
There can be issues with replication that would cause this - like a log reader agent not running, a database thinking it is marked for replication that no longer is, and various other reasons.

You can also see this reason and it is perfectly normal, because you are looking at just the right time, just as transactions are being consumed by the log reader.
7 = Database snapshot creation
When creating a database snapshot, you'll see this if you look at just the right moment as the snapshot is being created.
8 = Log Scan
I have yet to encounter an issue with this running along forever. If you look long enough and frequently enough you can see this happen, but it shouldn't be a cause of excessive transaction log growth, from what I've seen.
9 = An AlwaysOn Availability Group secondary replica is applying transaction log records of this database to a corresponding secondary database. About the clearest description yet.