
Is a command like the one below enough to periodically check the health of a disk?

smartctl -H /dev/sda 

The output contains a line like the one below.

SMART overall-health self-assessment test result: PASSED 

But what is a "self-assessment test"? Is that result up-to-date at the time I execute the command?

In other words, should I instead request a self-test with "-t short" or "-t long" and then retrieve the result with "smartctl -l selftest /dev/sda"?

All I need is to know when a hard disk has to be replaced, without having to check an LED on the cabinet of the physical machine.

I tried to have a look at the documentation of "smartmontools", but I'm still lost there and I feel under pressure.

Thanks in advance!

Andrea

1 Answer


You may want to take a quick look at the EXAMPLES section of the smartctl man page.

SMART runs its tests inside the drive itself, out of sight of the operating system. It runs full self-tests only when ordered to do so; otherwise it performs just some basic checks. The overall default configuration depends on what the manufacturer of your hard drive / solid-state drive / tape drive has implemented.

Keep in mind that when you order SMART to run a self-test, the test uses the drive's internal I/O operations. If you order a long self-test, the drive will be slower than usual, because it is performing a physical self-test, checking block after block.

So long self-tests should not be run all the time, as they reduce the drive's performance while the test is in progress.
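For an occasional on-demand check, a short self-test is the lighter option. The following is only a minimal sketch: the function names start_short_test and show_selftest_log are hypothetical wrappers, and the device path is whatever you pass in. The options -t short and -l selftest are the ones mentioned in the question.

```shell
#!/bin/sh
# Hypothetical wrappers around two smartctl invocations.

# Ask the drive to start a short self-test; the test runs inside the drive
# and smartctl returns immediately.
start_short_test() {
    smartctl -t short "$1"
}

# Print the drive's self-test log, which includes the result of the
# most recent tests.
show_selftest_log() {
    smartctl -l selftest "$1"
}
```

For example, run start_short_test /dev/sda, wait until the completion time that smartctl prints has passed, then run show_selftest_log /dev/sda to read the result.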

Settings you apply to your SMART-capable drives via smartctl remain in effect until you change them or until the system is fully powered off (meaning they might survive a soft reset of the machine).

I recommend using the default automatic settings of your drives.

At system boot, tell your drives once to start periodic scans in an automated way. You can do this via:

smartctl --smart=on --offlineauto=on --saveauto=on /dev/sda 

This starts several regular, periodic tests for drive /dev/sda, run in the background in a performance-friendly fashion (not directly visible to the operating system).

Then you can check the health of your drive using the command smartctl -H /dev/sda, or smartctl --health /dev/sda. If this command reports a failing health status, this usually means that the drive has either already failed or is predicting its own failure within the next 24 hours. So checking at least every 12 hours is recommended. When a failure has been found, you can use smartctl -a /dev/sda or smartctl --all /dev/sda for more information. Usually you don't need to check more often than every 4 hours, because the automated scans will probably run only every 4 hours (but this depends on the manufacturer's default settings).
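For a scripted check, the exit status of smartctl can be used instead of parsing the text output. The following is a minimal sketch, not a definitive implementation: the function name smart_verdict and the verdict words are hypothetical, and the bit meanings follow the EXIT STATUS section of the smartctl man page (bit 3, value 8, means the SMART status check returned "DISK FAILING").

```shell
#!/bin/sh
# Hypothetical helper: map a smartctl exit status (a bit mask) to a verdict.
smart_verdict() {
    status=$1
    if [ $(( status & 8 )) -ne 0 ]; then
        # Bit 3: SMART status check returned "DISK FAILING".
        echo "FAILING"
    elif [ $(( status & 7 )) -ne 0 ]; then
        # Bits 0-2: bad command line, device open failed,
        # or a SMART command to the disk failed.
        echo "ERROR"
    elif [ $(( status & 240 )) -ne 0 ]; then
        # Bits 4-7: attributes at or below threshold (now or in the past),
        # or errors recorded in the error log or self-test log.
        echo "WARNING"
    else
        echo "OK"
    fi
}
```

A typical use would be to run smartctl -H /dev/sda >/dev/null and then call smart_verdict "$?" from a cron job, alerting on anything other than OK. Which of the higher bits can be set depends on which checks were requested on the command line.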

PS: Linux machines usually already have an automated SMART check enabled, using a daemon (service), often named smartd.service. You can check its status via systemctl status smartd.service and/or watch its logs via journalctl --unit smartd.service

  • Thanks, @paladin! Though I still cannot tell what the bare minimum is that I should do to get information equivalent to a lit-up failure LED, in case there's a failure. You wrote "SMART does test on the fly, not directly visible to the operating system, when ordered to do so - otherwise, SMART will just do some basic tests". Would the basic tests be enough for my aim? If so, how can I ask for basic tests? Would a command like "smartctl -H /dev/sda" be enough, or does it only report the result of some test requested beforehand? Commented Jan 31, 2024 at 17:50
  • You wrote "As the automated scans probably will only run every 4 hours (but this depends on the manufacturers default settings)". I'm confused. I cannot even tell whether scans are performed only as a consequence of an explicit request with a command or whether the hardware refreshes the information anyway, and, if so, how often. Where does that "4 hours" come from? Commented Jan 31, 2024 at 17:50
  • It's the default setting for most SMART drives. How about you just try the command smartctl --smart=on --offlineauto=on --saveauto=on /dev/sda? You'll see that it tells you that it will run scans every 4 hours. Commented Jan 31, 2024 at 20:11
  • The basic test is just a simple power-on self-test plus runtime tests. The drive will also report problems that it finds during normal operation. You can just use smartctl -H /dev/sda, but this will only report problems found so far. Your drive could have bad sectors that you don't know about because you are not using those sectors; that's why you should run regular tests that also check the unused sectors. Most errors on a drive are detected by trying to read the bad sectors. If you are not reading, you cannot detect bad sectors. Commented Jan 31, 2024 at 20:15
  • @laur On Debian you don't need to do this. The OS is already doing it. You can check this with the smartctl --health command. Most big Linux distributions already do it. Those commands are usually only needed on bare-minimum systems. Drive failures will usually also be reported in the kernel log, which you can usually view using dmesg. Commented Oct 21 at 15:40
