Skip to main content
3 of 3
added 2091 characters in body

transparently splitting a single table so the program doesn't notice

I am using homeassistant with mariadb. This is hosted on a Pi 4 with 8gb of RAM I have been collecting data a long time, and a single table grows to 20gb. The storage is a M.2 NVMe via USB 3. CPU usage is normal less than 20 but climbs to 40%. A few times a day it briefly spike. Memory usage has never been above 2gb with more than 6gb free.

Right now 350 (approx) sensors periodically record data to a single table.

Issues to solve:

  1. performance gets slower over time
  2. Too much data in table creates single point of failure. All my data can be gone with a single error.
  3. Backups are slow and errors occur because the entire table is locked during a backup and new data can't be added to the table when it locked.

Solution I can up with

  1. Split table by date (1 table per month)
  2. Split table by sensor number. (Each sensor has its own table)
  3. Ask community for alternative solutions.

Regarding solution 1

All sensors data is combined so even if I only ask for 12 sensors it has to sift through 350 worth of data. When I add senors, the table will continue to grow. Will it grow to be a problem? hope not.

Regarding solution 2

Will having 350 tables (maybe 500 in the future) cause performance issues of its own even though most table are less frequently used.

Implementation

I envision creating a view and/or trigger,procedures. I would create a view with the existing table name and use that to split the data.

In either solution 1 or 2 how would I identify and query only the necessary tables transparently using a view or etc. When homeassistant executes a query it should return all relevant data as if all the data was in a single table.

Using solution 1, The date would have to be extracted so that if the data range exceed a single month the from would have to be altered to include all relevant months. Instead of select * from table where data is between ... I would need that but then the from would be from 062023,052023,042023 (or whatever is relevant)

Using solution 2, For example if homeassistant asks select * from table where sensors =1,2, 3,5,9,11 then query would need to be altered to select * from 1,2,3,5,9,11;

Would using a view allow me to make the necessary changes transparently?

Is there another solution I had overlooked.

CREATE TABLE `states` ( `state_id` bigint(20) NOT NULL AUTO_INCREMENT, `domain` varchar(64) DEFAULT NULL, `entity_id` char(0) DEFAULT NULL, `state` varchar(255) DEFAULT NULL, `attributes` char(0) DEFAULT NULL, `event_id` bigint(20) DEFAULT NULL, `last_changed` char(0) DEFAULT NULL, `last_changed_ts` double DEFAULT NULL, `last_updated` char(0) DEFAULT NULL, `created` datetime(6) DEFAULT NULL, `last_updated_ts` double DEFAULT NULL, `old_state_id` bigint(20) DEFAULT NULL, `attributes_id` bigint(20) DEFAULT NULL, `context_id` char(0) DEFAULT NULL, `context_user_id` char(0) DEFAULT NULL, `context_parent_id` char(0) DEFAULT NULL, `origin_idx` smallint(6) DEFAULT NULL, `context_id_bin` tinyblob DEFAULT NULL, `context_user_id_bin` tinyblob DEFAULT NULL, `context_parent_id_bin` tinyblob DEFAULT NULL, `metadata_id` bigint(20) DEFAULT NULL, `last_reported_ts` double DEFAULT NULL, PRIMARY KEY (`state_id`), KEY `ix_states_last_updated_ts` (`last_updated_ts`), KEY `ix_states_context_id_bin` (`context_id_bin`(16)), KEY `ix_states_attributes_id` (`attributes_id`), KEY `ix_states_old_state_id` (`old_state_id`), KEY `ix_states_metadata_id_last_updated_ts` (`metadata_id`,`last_updated_ts`), CONSTRAINT `states_ibfk_1` FOREIGN KEY (`old_state_id`) REFERENCES `states` (` state_id`), CONSTRAINT `states_ibfk_2` FOREIGN KEY (`attributes_id`) REFERENCES `state_att ributes` (`attributes_id`), CONSTRAINT `states_ibfk_3` FOREIGN KEY (`metadata_id`) REFERENCES `states_meta ` (`metadata_id`) ) ENGINE=InnoDB AUTO_INCREMENT=373236082 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4 _unicode_ci;