Timeline for How to rebalance data across nodes?
Current License: CC BY-SA 4.0
15 events
| when | what | by | license | comment | |
|---|---|---|---|---|---|
| Aug 7 at 0:09 | history | bumped | CommunityBot | This question has answers that may be good or bad; the system has marked it active so that they can be reviewed. | |
| Apr 9 at 0:05 | history | bumped | CommunityBot | This question has answers that may be good or bad; the system has marked it active so that they can be reviewed. | |
| Dec 9, 2024 at 23:02 | history | bumped | CommunityBot | This question has answers that may be good or bad; the system has marked it active so that they can be reviewed. | |
| Aug 11, 2024 at 22:04 | history | bumped | CommunityBot | This question has answers that may be good or bad; the system has marked it active so that they can be reviewed. | |
| Jul 13, 2024 at 21:22 | history | protected | gnat | ||
| Jul 13, 2024 at 19:50 | comment | added | amon | If this is a cloud-first design then storage will be virtualized anyway. You may be able to share storage, or reassign volumes to another compute node with little or no downtime. Actually moving data between nodes would make scaling down more difficult. However, your main problem isn't storing/moving data, but designing a distributed system. Your problems are shared with other databases that support sharding and/or replicas. Solutions will depend on your CAP priorities/desired consistency model. Aside from "B doesn't have the data yet", also consider "B has crashed/frozen". | |
| Jul 13, 2024 at 14:30 | comment | added | poundifdef | The goal is for the system to automatically be able to scale up and down depending on load. I assume that rebalancing data is necessary here to be able to scale in without waiting for a single node to have 0 messages. | |
| Jul 13, 2024 at 14:28 | comment | added | poundifdef | I am implementing my own queue from scratch, with the semantics of SQS. | |
| Jul 13, 2024 at 4:29 | comment | added | amon | In Kafka, changing the number of partitions only affects future messages. Existing messages remain in their partition and aren't moved around. However, brokers (nodes) in a cluster can fetch data from each other. So a client only has to connect to any node in the cluster, not to the node that happens to store the data. Perhaps you could implement a similar mechanism to internally route message reads to the correct node. But Kafka also makes some decisions (messages are immutable, clients pull messages) that make this feasible. | |
| Jul 13, 2024 at 1:18 | comment | added | Greg Burghardt | And by "implementing a message queue" do you mean you are choosing from an existing solution, or are you literally writing the code which will serve as a message queue? | |
| Jul 13, 2024 at 1:16 | comment | added | Greg Burghardt | Which kind of message queue are you using? For example, RabbitMQ allows for clustering. Is that similar to what you need? | |
| Jul 12, 2024 at 21:38 | answer | added | anon_user123456 | timeline score: 0 | |
| Jul 12, 2024 at 21:31 | history | edited | poundifdef | edited tags | |
| S Jul 12, 2024 at 21:21 | review | First questions | |||
| Jul 14, 2024 at 22:51 | |||||
| S Jul 12, 2024 at 21:21 | history | asked | poundifdef | CC BY-SA 4.0 |