CREATE TABLE `states` (
  `state_id` bigint(20) NOT NULL AUTO_INCREMENT,
  `domain` varchar(64) DEFAULT NULL,
  `entity_id` char(0) DEFAULT NULL,
  `state` varchar(255) DEFAULT NULL,
  `attributes` char(0) DEFAULT NULL,
  `event_id` bigint(20) DEFAULT NULL,
  `last_changed` char(0) DEFAULT NULL,
  `last_changed_ts` double DEFAULT NULL,
  `last_updated` char(0) DEFAULT NULL,
  `created` datetime(6) DEFAULT NULL,
  `last_updated_ts` double DEFAULT NULL,
  `old_state_id` bigint(20) DEFAULT NULL,
  `attributes_id` bigint(20) DEFAULT NULL,
  `context_id` char(0) DEFAULT NULL,
  `context_user_id` char(0) DEFAULT NULL,
  `context_parent_id` char(0) DEFAULT NULL,
  `origin_idx` smallint(6) DEFAULT NULL,
  `context_id_bin` tinyblob DEFAULT NULL,
  `context_user_id_bin` tinyblob DEFAULT NULL,
  `context_parent_id_bin` tinyblob DEFAULT NULL,
  `metadata_id` bigint(20) DEFAULT NULL,
  `last_reported_ts` double DEFAULT NULL,
  PRIMARY KEY (`state_id`),
  KEY `ix_states_last_updated_ts` (`last_updated_ts`),
  KEY `ix_states_context_id_bin` (`context_id_bin`(16)),
  KEY `ix_states_attributes_id` (`attributes_id`),
  KEY `ix_states_old_state_id` (`old_state_id`),
  KEY `ix_states_metadata_id_last_updated_ts` (`metadata_id`,`last_updated_ts`),
  CONSTRAINT `states_ibfk_1` FOREIGN KEY (`old_state_id`) REFERENCES `states` (`state_id`),
  CONSTRAINT `states_ibfk_2` FOREIGN KEY (`attributes_id`) REFERENCES `state_attributes` (`attributes_id`),
  CONSTRAINT `states_ibfk_3` FOREIGN KEY (`metadata_id`) REFERENCES `states_meta` (`metadata_id`)
) ENGINE=InnoDB AUTO_INCREMENT=373236082 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
I am using Home Assistant with MariaDB, hosted on a Pi 4 with 8 GB of RAM. I have been collecting data for a long time, and a single table has grown to 20 GB. The storage is an M.2 NVMe drive connected via USB 3. CPU usage is normally below 20%, climbs to 40% at times, and briefly spikes a few times a day. Memory usage has never been above 2 GB, with more than 6 GB free.
Right now, roughly 350 sensors periodically record data to a single table.
Issues to solve:
- Performance gets slower over time.
- Too much data in one table creates a single point of failure: all my data could be lost through a single error.
- Backups are slow, and errors occur because the entire table is locked during a backup, so new data can't be inserted while the lock is held.
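On the backup point specifically: since the table is InnoDB (per the `ENGINE=InnoDB` above), a snapshot-style dump should avoid the table lock entirely. A minimal sketch, assuming the database is named `homeassistant`:

```sh
# --single-transaction takes a consistent InnoDB snapshot instead of
# locking tables, so the recorder can keep inserting during the backup.
# --quick streams rows instead of buffering the whole table in memory.
mariadb-dump --single-transaction --quick homeassistant | gzip > ha_backup.sql.gz
```

This only helps if all tables involved are InnoDB; any MyISAM table in the dump would still be locked.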
Solutions I came up with:
- Split the table by date (one table per month).
- Split the table by sensor number (each sensor gets its own table).
- Ask the community for alternative solutions.
Regarding solution 1
All sensor data is combined, so even if I only ask for 12 sensors, the database has to sift through data for all 350. When I add sensors, the table will keep growing. Will it grow into a problem? I hope not.
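A variant of solution 1 that might avoid query rewriting altogether is MariaDB's native RANGE partitioning: one logical table, physically stored in chunks, with old chunks droppable almost instantly and irrelevant chunks pruned by the optimizer. A rough sketch only; the boundary values here are made up, and the foreign keys above would have to go first because partitioned InnoDB tables can't have them:

```sql
-- Partitioned InnoDB tables cannot carry foreign keys, so drop them first.
ALTER TABLE states DROP FOREIGN KEY states_ibfk_1;
ALTER TABLE states DROP FOREIGN KEY states_ibfk_2;
ALTER TABLE states DROP FOREIGN KEY states_ibfk_3;

-- state_id is AUTO_INCREMENT and already the primary key, so id ranges
-- roughly track time and satisfy the rule that the partitioning column
-- must be part of every unique key.
ALTER TABLE states
  PARTITION BY RANGE (state_id) (
    PARTITION p0   VALUES LESS THAN (100000000),
    PARTITION p1   VALUES LESS THAN (200000000),
    PARTITION p2   VALUES LESS THAN (300000000),
    PARTITION pmax VALUES LESS THAN MAXVALUE
  );

-- Retiring an old chunk is then a near-instant metadata operation:
-- ALTER TABLE states DROP PARTITION p0;
```

Whether dropping the recorder's foreign keys is safe across Home Assistant upgrades is something I'd want confirmed.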
Regarding solution 2
Will having 350 tables (maybe 500 in the future) cause performance issues of its own, even though most tables are used infrequently?
Implementation
I envision creating a view and/or triggers and procedures. I would create a view with the existing table name and use it to split the data underneath.
In either solution 1 or 2, how would I identify and query only the necessary tables transparently, using a view or similar? When Home Assistant executes a query, it should get back all relevant data as if everything were still in a single table.
Using solution 1, the date would have to be extracted from the query, so that if the date range spans more than one month the FROM clause is rewritten to include every relevant monthly table. Instead of SELECT * FROM states WHERE date BETWEEN ..., the FROM clause would have to reference 062023, 052023, 042023 (or whichever months are relevant).
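The closest static approximation of this I can see is a UNION ALL view over the monthly tables. A sketch, with hypothetical table names (the view could only take over the name `states` once the original table has been renamed away):

```sql
-- Glue hypothetical monthly tables back together under one name.
-- The WHERE clause of a query against the view is merged into each
-- branch, so indexed range conditions stay cheap per table, but the
-- view text itself is fixed: it must be regenerated every month.
CREATE OR REPLACE VIEW states AS
          SELECT * FROM states_202304
UNION ALL SELECT * FROM states_202305
UNION ALL SELECT * FROM states_202306;
```

One catch: a UNION view is not insertable, and MariaDB has no INSTEAD OF triggers on views, so Home Assistant's writes would still have to land in a real table.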
Using solution 2: for example, if Home Assistant asks SELECT * FROM table WHERE sensors IN (1,2,3,5,9,11), the query would need to be rewritten so the FROM clause pulls from the per-sensor tables 1, 2, 3, 5, 9 and 11.
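For the per-sensor split, the same UNION ALL trick could tag each branch with its sensor id so a WHERE clause can single out tables. A sketch with hypothetical tables sensor_1 and sensor_2:

```sql
-- Each branch is tagged with a constant sensor_id column.
-- A condition like WHERE sensor_id = 2 pushed into the sensor_1 branch
-- reduces to a constant-false predicate, so that branch should be
-- discarded cheaply, though every branch is still parsed.
CREATE OR REPLACE VIEW all_sensors AS
          SELECT 1 AS sensor_id, t.* FROM sensor_1 t
UNION ALL SELECT 2 AS sensor_id, t.* FROM sensor_2 t;
```

With 350-500 branches, the view definition itself becomes large, and it must be regenerated every time a sensor is added.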
Would using a view allow me to make the necessary changes transparently?
Is there another solution I have overlooked?
In short: I am looking for a way to transparently split a single table so the program doesn't notice.