Timeline for How to rebalance data across nodes?

Current License: CC BY-SA 4.0

15 events
when what by license comment
Aug 7 at 0:09 history bumped CommunityBot This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.
Apr 9 at 0:05 history bumped CommunityBot This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.
Dec 9, 2024 at 23:02 history bumped CommunityBot This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.
Aug 11, 2024 at 22:04 history bumped CommunityBot This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.
Jul 13, 2024 at 21:22 history protected gnat
Jul 13, 2024 at 19:50 comment added amon If this is a cloud-first design then storage will be virtualized anyway. You may be able to share storage, or reassign volumes to another compute node with little or no downtime. Actually moving data between nodes would make scaling down more difficult. However, your main problem isn't storing or moving data, but designing a distributed system. Your problems are shared with other databases that support sharding and/or replicas. Solutions will depend on your CAP priorities and desired consistency model. Aside from "B doesn't have the data yet", also consider "B has crashed/frozen".
Jul 13, 2024 at 14:30 comment added poundifdef The goal is for the system to automatically be able to scale up and down depending on load. I assume that rebalancing data is necessary here to be able to scale in without waiting for a single node to have 0 messages.
Jul 13, 2024 at 14:28 comment added poundifdef I am implementing my own queue from scratch, with the semantics of SQS.
Jul 13, 2024 at 4:29 comment added amon In Kafka, changing the number of partitions only affects future messages. Existing messages remain in their partition and aren't moved around. However, brokers (nodes) in a cluster can fetch data from each other. So a client only has to connect to any node in the cluster, not to the node that happens to store the data. Perhaps you could implement a similar mechanism to internally route message reads to the correct node. But Kafka also makes some decisions (messages are immutable, clients pull messages) that make this feasible.
Jul 13, 2024 at 1:18 comment added Greg Burghardt And by "implementing a message queue" do you mean you are choosing from an existing solution, or are you literally writing the code which will serve as a message queue?
Jul 13, 2024 at 1:16 comment added Greg Burghardt Which kind of message queue are you using? For example, RabbitMQ allows for clustering. Is that similar to what you need?
Jul 12, 2024 at 21:38 answer added anon_user123456 timeline score: 0
Jul 12, 2024 at 21:31 history edited poundifdef edited tags
S Jul 12, 2024 at 21:21 review First questions (completed Jul 14, 2024 at 22:51)
S Jul 12, 2024 at 21:21 history asked poundifdef CC BY-SA 4.0
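
amon's comment of Jul 13, 2024 at 4:29 suggests letting a client connect to any node and routing reads internally to the node that holds the data. The following is a minimal in-memory sketch of that idea, not Kafka's or SQS's actual implementation. The names `Node`, `make_cluster`, `owner_of`, `send`, and `receive` are hypothetical, and the hash-modulo placement is an assumption chosen only for brevity.

```python
import hashlib
from collections import deque


class Node:
    """One queue server. All names here are hypothetical, for illustration."""

    def __init__(self, name, cluster):
        self.name = name
        self.cluster = cluster   # shared dict: node name -> Node
        self.messages = {}       # queue name -> deque of message bodies

    def owner_of(self, queue):
        # Assumption: placement is a stable hash of the queue name, modulo
        # the number of nodes. Every node computes the same answer.
        names = sorted(self.cluster)
        digest = hashlib.sha256(queue.encode()).digest()
        index = int.from_bytes(digest[:8], "big") % len(names)
        return self.cluster[names[index]]

    def send(self, queue, body):
        # The contacted node forwards the write to the owning node.
        owner = self.owner_of(queue)
        owner.messages.setdefault(queue, deque()).append(body)

    def receive(self, queue):
        # The contacted node forwards the read to the owning node.
        pending = self.owner_of(queue).messages.get(queue)
        return pending.popleft() if pending else None


def make_cluster(names):
    """Build nodes that all share one membership table."""
    cluster = {}
    for name in names:
        cluster[name] = Node(name, cluster)
    return cluster


cluster = make_cluster(["a", "b", "c"])
cluster["a"].send("orders", "m1")          # client happens to talk to node a
first = cluster["b"].receive("orders")     # another client talks to node b
second = cluster["c"].receive("orders")    # queue is now empty
```

Here `first` is `"m1"` and `second` is `None`, regardless of which node each client contacted. With modulo placement, adding or removing a node changes the owner of many queues, which is the rebalancing problem the question asks about; a real design would also have to handle the "B has crashed/frozen" case raised in the comments.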