
I'm tinkering with the new streaming plugin provided with RabbitMQ and researching how to implement event-driven architectures in general.

I noticed in the default configuration (maybe I have it configured wrong) that if I have one app consisting of two consumers, and one producer produces 5,000 messages, then both consumers receive all 5,000 messages. This seems like a big waste of resources; ideally each should receive 2,500.

My production environment has many apps, each of which would be running multiple consumers of its own. Is it desirable in a streaming platform for all the consumers to receive all the messages?

Is it my responsibility to track each message received and determine whether it has already been processed? Am I forced to reprocess payloads multiple times? Or is there typically a way in other streaming platforms, like Kafka or Amazon Kinesis, to fan messages out based on a consumer client's name?

RabbitMQ released their streaming plugin recently, so I have no idea how feature-rich it is compared to its competitors. Since we already use RabbitMQ, I'd rather not introduce a new product if I can handle my message-processing needs within RabbitMQ.

2 Answers


Kafka has the concept of a Consumer Group, where any consumers that belong to the same group 'share' the messages from a single topic between them. It's a little more complicated behind the scenes and requires that you have topic partitions set up properly for your use case, but the effect is what you're describing - each consumer receives a relatively even proportion of the overall set of messages.
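
As a rough sketch (untested; the topic name, group id, and broker address below are placeholders), this is what that looks like with the kafka-python client - every process started with the same group_id shares the topic's partitions:

# Consumer-group sketch using the kafka-python client.
# Topic, group id, and broker address are placeholder values.
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "production-events",          # hypothetical topic
    group_id="event-processors",  # consumers sharing this id split the partitions
    bootstrap_servers=["localhost:9092"],
    auto_offset_reset="earliest",
)

# Each record goes to exactly one consumer in the group; start a second
# copy of this script and Kafka rebalances the partitions between them.
for record in consumer:
    print(record.topic, record.partition, record.offset, record.value)

One caveat: the number of partitions on the topic caps how many consumers in the group can usefully share the load, which is why the partition setup matters.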

Kinesis apparently has something similar, as per this Stack Overflow question: https://stackoverflow.com/questions/50190549/how-does-kinesis-achieve-kafka-style-consumer-groups. I haven't worked with it personally.

I haven't worked with RabbitMQ - I've used VerneMQ a bit, but never needed to split messages between consumers. From a quick look, it doesn't appear that RabbitMQ supports this explicitly, but 'consumer group' is the term to search for to find more information.

Alternatively, if your consumers are logically independent and only need to receive certain events from the set, you could look at dividing your topics up into logical levels when you publish the messages, allowing consumers to subscribe only to the subset that applies to them. So instead of consumers pulling everything from:

productionapp/event/# 

You could have two consumers each subscribed to one of:

productionapp/event/foo/#
productionapp/event/bar/#

Where foo and bar represent different sets of events that an individual consumer might be interested in. However, if your two-consumer setup is just to spread the processing load over a set of similar events, you don't want to do this - setting up two topics to have the same event sets but split 50/50 isn't the way to use topics and won't scale well.
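
From a skim of the RabbitMQ docs, the equivalent of this subset subscription appears to be a topic exchange with routing-key patterns (RabbitMQ uses '.' as the level separator rather than '/'). A rough, untested sketch with the Python pika client, with placeholder exchange, queue, and routing-key names:

# Subset subscription via a RabbitMQ topic exchange (pika client).
# Exchange name and routing keys are placeholder values.
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.exchange_declare(exchange="productionapp.events", exchange_type="topic")

# Bind a private queue that only receives 'foo' events; a second consumer
# could bind its own queue with "event.bar.#" to receive only 'bar' events.
result = channel.queue_declare(queue="", exclusive=True)
channel.queue_bind(
    exchange="productionapp.events",
    queue=result.method.queue,
    routing_key="event.foo.#",
)

def handle(ch, method, properties, body):
    print("foo event:", body)

channel.basic_consume(queue=result.method.queue, on_message_callback=handle, auto_ack=True)
channel.start_consuming()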


I noticed in the default configuration (maybe I have it configured wrong) that if I have one app consisting of two consumers, and one producer produces 5,000 messages, then both consumers receive all 5,000 messages. This seems like a big waste of resources; ideally each should receive 2,500.

It's been a while since I've worked with RabbitMQ, so things may have changed, but this wasn't the default behavior. The current documentation seems to indicate that nothing has changed: if messages are published to one queue, RabbitMQ will use round-robin dispatching to send items from the work queue to each worker without duplication.

If multiple consumers are receiving the messages, perhaps the publish/subscribe configuration is being used.
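
For reference, the competing-consumers configuration from the RabbitMQ work-queue tutorial looks roughly like this with the Python pika client (the queue name is a placeholder). Running two copies of this script should produce the 2,500/2,500 split described in the question, because both consume from the same queue:

# Competing consumers on a single queue (pika client).
# The queue name is a placeholder value.
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.queue_declare(queue="task_queue", durable=True)

# Hand this consumer at most one unacknowledged message at a time, so the
# broker spreads the backlog across however many consumers are running.
channel.basic_qos(prefetch_count=1)

def handle(ch, method, properties, body):
    print("processing:", body)
    ch.basic_ack(delivery_tag=method.delivery_tag)

channel.basic_consume(queue="task_queue", on_message_callback=handle)
channel.start_consuming()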

My production environment has many apps, each of which would be running multiple consumers of its own. Is it desirable in a streaming platform for all the consumers to receive all the messages?

I can think of some cases where it would be desirable for multiple (and perhaps all) consumers to receive some or all of the messages; consumers may have overlapping interests. This is a property of the system, not of the streaming platform. I would generally want the streaming platform to support both message queuing and publish/subscribe.

Is it my responsibility to track each message received and determine whether it has already been processed? Am I forced to reprocess payloads multiple times? Or is there typically a way in other streaming platforms, like Kafka or Amazon Kinesis, to fan messages out based on a consumer client's name?

This depends on the system, but I'd generally say that this should be a responsibility of the messaging or streaming platform. The publisher should communicate with a message broker to ensure reliable delivery. The broker should make messages available to consumers or deliver messages to consumers, depending on whether it supports a pull or a push model. There may be a need for the broker to handle receiving and deconflicting messages from multiple redundant publishers. The broker should also support the required delivery guarantees (at most once, at least once). If the broker doesn't support any of these directly, then it may be necessary to implement them in the publisher or the consumers. Between message queuing, fanout exchanges, and publish/subscribe functionality, many of these requirements can be met. Publisher confirms and support for clusters may handle the others.
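
As one concrete example of broker-side delivery guarantees, RabbitMQ's publisher confirms can be enabled by the publisher. A rough, untested sketch with the Python pika client (queue name and payload are placeholders); with confirms on, basic_publish raises if the broker can't take responsibility for the message:

# Publisher confirms with the pika client; names are placeholder values.
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.queue_declare(queue="task_queue", durable=True)

# Ask the broker to confirm each publish; on a BlockingChannel, basic_publish
# then raises (e.g. pika.exceptions.UnroutableError) on failure.
channel.confirm_delivery()

channel.basic_publish(
    exchange="",                     # default exchange routes by queue name
    routing_key="task_queue",
    body=b"payload",
    properties=pika.BasicProperties(delivery_mode=2),  # persist the message
    mandatory=True,
)
connection.close()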

  • Is there an alternative to publish/subscribe? At least within the Ruby docs for RabbitMQ, that's the only listed option: rubybunny.info/articles/exchanges.html Commented Nov 29, 2021 at 18:22
  • @alilland I've never used RabbitMQ in a Ruby context, but it appears there are several libraries and ways of interacting with RabbitMQ from Ruby. If the functionality isn't exposed in one library, you may need another; searching Stack Overflow may help. Commented Nov 29, 2021 at 18:58
