My daily driver (Debian Bookworm RC3 + KDE Plasma) is configured to send me emails containing error notifications.
Today, I received the following email:
    This message was generated by the smartd daemon running on:

       host name:  desk
       DNS domain: local.lan

    The following warning/error was logged by the smartd daemon:

    Device: /dev/nvme0, number of Error Log entries increased from 1754 to 1758

    Device info:
    KBG30ZMV256G TOSHIBA, S/N:X8OPD1PGP12P, FW:ADHA0101

    For details see host's SYSLOG.

    You can also use the smartctl utility for further investigation.
    The original message about this issue was sent at Wed May 17 16:09:04 2023 EDT
    Another message will be sent in 24 hours if the problem persists.

This is what sudo journalctl -t smartd shows:
    May 20 15:19:47 desk smartd[550]: smartd 7.3 2022-02-28 r5338 [x86_64-linux-6.1.0-9-amd64] (local build)
    May 20 15:19:47 desk smartd[550]: Copyright (C) 2002-22, Bruce Allen, Christian Franke, www.smartmontools.org
    May 20 15:19:47 desk smartd[550]: Opened configuration file /etc/smartd.conf
    May 20 15:19:47 desk smartd[550]: Drive: DEVICESCAN, implied '-a' Directive on line 21 of file /etc/smartd.conf
    May 20 15:19:47 desk smartd[550]: Configuration file /etc/smartd.conf was parsed, found DEVICESCAN, scanning devices
    May 20 15:19:47 desk smartd[550]: Device: /dev/sda, type changed from 'scsi' to 'sat'
    May 20 15:19:47 desk smartd[550]: Device: /dev/sda [SAT], opened
    May 20 15:19:47 desk smartd[550]: Device: /dev/sda [SAT], CT4000MX500SSD1, S/N:2304E6A3D318, WWN:5-00a075-1e6a3d318, FW:M3CR045, 4.00 TB
    May 20 15:19:47 desk smartd[550]: Device: /dev/sda [SAT], not found in smartd database 7.3/5319.
    May 20 15:19:47 desk smartd[550]: Device: /dev/sda [SAT], is SMART capable. Adding to "monitor" list.
    May 20 15:19:47 desk smartd[550]: Device: /dev/sda [SAT], state read from /var/lib/smartmontools/smartd.CT4000MX500SSD1-2304E6A3D318.ata.state
    May 20 15:19:47 desk smartd[550]: Device: /dev/nvme0, opened
    May 20 15:19:47 desk smartd[550]: Device: /dev/nvme0, KBG30ZMV256G TOSHIBA, S/N:X8OPD1PGP12P, FW:ADHA0101
    May 20 15:19:47 desk smartd[550]: Device: /dev/nvme0, is SMART capable. Adding to "monitor" list.
    May 20 15:19:47 desk smartd[550]: Device: /dev/nvme0, state read from /var/lib/smartmontools/smartd.KBG30ZMV256G_TOSHIBA-X8OPD1PGP12P.nvme.state
    May 20 15:19:47 desk smartd[550]: Monitoring 1 ATA/SATA, 0 SCSI/SAS and 1 NVMe devices
    May 20 15:19:48 desk smartd[550]: Device: /dev/nvme0, number of Error Log entries increased from 1754 to 1758
    May 20 15:19:48 desk smartd[550]: Sending warning via /usr/share/smartmontools/smartd-runner to root ...
    May 20 15:19:48 desk smartd[550]: Warning via /usr/share/smartmontools/smartd-runner to root: successful
    May 20 15:19:48 desk smartd[550]: Device: /dev/sda [SAT], state written to /var/lib/smartmontools/smartd.CT4000MX500SSD1-2304E6A3D318.ata.state
    May 20 15:19:48 desk smartd[550]: Device: /dev/nvme0, state written to /var/lib/smartmontools/smartd.KBG30ZMV256G_TOSHIBA-X8OPD1PGP12P.nvme.state
    May 20 15:49:48 desk smartd[550]: Device: /dev/sda [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 73 to 74
    May 20 22:49:48 desk smartd[550]: Device: /dev/nvme0, number of Error Log entries increased from 1758 to 1760

When I run sudo smartctl -i -a /dev/nvme0, it shows me the error count, but I can't figure out how to see the log message associated with the increased count:
    smartctl 7.3 2022-02-28 r5338 [x86_64-linux-6.1.0-9-amd64] (local build)
    Copyright (C) 2002-22, Bruce Allen, Christian Franke, www.smartmontools.org

    === START OF INFORMATION SECTION ===
    Model Number:                       KBG30ZMV256G TOSHIBA
    Serial Number:                      X8OPD1PGP12P
    Firmware Version:                   ADHA0101
    PCI Vendor/Subsystem ID:            0x1179
    IEEE OUI Identifier:                0x00080d
    Controller ID:                      0
    NVMe Version:                       1.2.1
    Number of Namespaces:               1
    Namespace 1 Size/Capacity:          256,060,514,304 [256 GB]
    Namespace 1 Formatted LBA Size:     512
    Namespace 1 IEEE EUI-64:            00080d 04004ad9aa
    Local Time is:                      Sat May 20 23:09:32 2023 EDT
    Firmware Updates (0x12):            1 Slot, no Reset required
    Optional Admin Commands (0x0017):   Security Format Frmw_DL Self_Test
    Optional NVM Commands (0x0017):     Comp Wr_Unc DS_Mngmt Sav/Sel_Feat
    Log Page Attributes (0x02):         Cmd_Eff_Lg
    Maximum Data Transfer Size:         512 Pages
    Warning  Comp. Temp. Threshold:     82 Celsius
    Critical Comp. Temp. Threshold:     85 Celsius

    Supported Power States
    St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
     0 +     3.30W       -        -    0  0  0  0        0       0
     1 +     2.70W       -        -    1  1  1  1        0       0
     2 +     2.30W       -        -    2  2  2  2        0       0
     3 -   0.0500W       -        -    4  4  4  4     8000   32000
     4 -   0.0050W       -        -    4  4  4  4     8000   40000

    Supported LBA Sizes (NSID 0x1)
    Id Fmt  Data  Metadt  Rel_Perf
     0 -    4096       0         0
     1 +     512       0         3

    === START OF SMART DATA SECTION ===
    SMART overall-health self-assessment test result: PASSED

    SMART/Health Information (NVMe Log 0x02)
    Critical Warning:                   0x00
    Temperature:                        32 Celsius
    Available Spare:                    100%
    Available Spare Threshold:          10%
    Percentage Used:                    30%
    Data Units Read:                    23,188,612 [11.8 TB]
    Data Units Written:                 39,727,036 [20.3 TB]
    Host Read Commands:                 222,771,983
    Host Write Commands:                498,052,687
    Controller Busy Time:               7,440
    Power Cycles:                       291
    Power On Hours:                     20,378
    Unsafe Shutdowns:                   615
    Media and Data Integrity Errors:    0
    Error Information Log Entries:      1,760
    Warning  Comp. Temperature Time:    0
    Critical Comp. Temperature Time:    0
    Temperature Sensor 1:               32 Celsius

    Error Information (NVMe Log 0x01, 16 of 64 entries)
    Num   ErrCount  SQId   CmdId  Status  PELoc          LBA  NSID    VS
      0       1760     0  0x501a  0xc005  0x028            -     1     -
      1       1759     0  0xb012  0xc005  0x028            -     1     -
      2       1758     0  0x5010  0xc005  0x028            -     0     -

How can I figure out what the errors are?
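For reference, here is a rough sketch of how the Status and PELoc columns in the table above might be picked apart. It rests on an assumption that is not confirmed by anything in the output: that smartctl 7.3 prints the raw 16-bit status word of the NVMe error log entry, where bit 0 is the phase tag and bits 15:1 are the status field as laid out in the NVMe specification. The function names and the small lookup table are made up for illustration, and the table is deliberately incomplete.

```python
# Sketch only: decode the "Status" and "PELoc" columns of an NVMe error log entry.
# Assumption: "Status" is the raw 16-bit word (bit 0 = phase tag, bits 15:1 = status field).

# Hypothetical, incomplete table of generic (status code type 0) status codes.
GENERIC_STATUS = {
    0x00: "Successful Completion",
    0x01: "Invalid Command Opcode",
    0x02: "Invalid Field in Command",
}


def decode_status(raw: int) -> dict:
    """Split the raw status word into the fields defined by the NVMe spec."""
    status = raw >> 1                # drop the phase tag in bit 0
    sc = status & 0xFF               # Status Code, bits 7:0
    sct = (status >> 8) & 0x7        # Status Code Type, bits 10:8
    if sct == 0:
        meaning = GENERIC_STATUS.get(sc, "unknown generic status")
    else:
        meaning = "not a generic status (see the spec for this type)"
    return {
        "phase_tag": raw & 0x1,
        "status_code": sc,
        "status_code_type": sct,
        "more": bool(status & (1 << 13)),          # More bit
        "do_not_retry": bool(status & (1 << 14)),  # Do Not Retry bit
        "meaning": meaning,
    }


def decode_pe_loc(raw: int) -> dict:
    """Parameter Error Location: byte (bits 7:0) and bit (bits 10:8) in the command."""
    return {"byte": raw & 0xFF, "bit": (raw >> 8) & 0x7}


if __name__ == "__main__":
    print(decode_status(0xC005))
    print(decode_pe_loc(0x028))
```

Under that assumption, 0xc005 would decode to generic status code 0x02 with the More and Do Not Retry bits set, and PELoc 0x028 would point at byte 40 of the submitted command. The sketch says nothing about which command triggered the entries or whether they matter.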