
Let's say we have a web crawling system that needs to scale to many workers. Each worker has a rate limit, for example a maximum of 5 HTTP requests per minute for each web host being crawled.

The web host can be carried in a message header, which will be populated by the producer.

Can I teach ActiveMQ to dispatch messages to nodes in this fashion?

1 Answer


ActiveMQ does not support consumer throttling, only producer throttling. Camel does support this though, see the accepted answer here: ActiveMQ throttling consumer


2 Comments

The Camel Throttler has very limited options. I need to limit requests per time period per web host being crawled. With the Camel Throttler I can only limit overall queue throughput over time; I also need the limit to apply per web host, or per some other message header value.
Sounds like you'll have to roll your own :( You could still queue up the processing requests via ActiveMQ, use queues to sort them by site, and dispatch them to workers. Part of the message could be a throttling limit, which you enforce with Java timing and synchronization primitives to achieve your desired scrape rate. Almost everything is designed to run "as fast as possible" :) It's kinda rare to have the opposite problem.
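As a rough illustration of the "roll your own" idea with plain Java timing and synchronization primitives, here is a minimal sketch of a per-host sliding-window limiter. The class name `PerHostRateLimiter` and its methods are hypothetical, not part of ActiveMQ or Camel, and the limit is enforced inside a single worker JVM only (which matches a per-worker limit, but not a limit shared across workers).

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: allows at most maxRequests per windowMillis for each host key. */
class PerHostRateLimiter {
    private final int maxRequests;
    private final long windowMillis;
    // Timestamps (in milliseconds) of the requests granted inside the current window, per host.
    private final Map<String, Deque<Long>> granted = new HashMap<>();

    PerHostRateLimiter(int maxRequests, long windowMillis) {
        this.maxRequests = maxRequests;
        this.windowMillis = windowMillis;
    }

    /**
     * Returns 0 and records the request if the host is under its limit at nowMillis.
     * Otherwise returns how many milliseconds the caller should wait before retrying.
     */
    synchronized long tryAcquire(String host, long nowMillis) {
        Deque<Long> times = granted.computeIfAbsent(host, h -> new ArrayDeque<>());
        // Drop timestamps that have fallen out of the window.
        while (!times.isEmpty() && nowMillis - times.peekFirst() >= windowMillis) {
            times.pollFirst();
        }
        if (times.size() < maxRequests) {
            times.addLast(nowMillis);
            return 0L;
        }
        // The oldest granted request leaves the window at this point in time.
        return times.peekFirst() + windowMillis - nowMillis;
    }

    /** Blocks the calling worker thread until a request to the host is allowed. */
    void acquire(String host) throws InterruptedException {
        long wait;
        // Sleep outside the lock so other hosts are not held up while this one waits.
        while ((wait = tryAcquire(host, System.nanoTime() / 1_000_000L)) > 0) {
            Thread.sleep(wait);
        }
    }
}
```

A worker could create one limiter with `new PerHostRateLimiter(5, 60_000L)`, read the host from the message header the producer set (for a JMS message, something like `message.getStringProperty("host")`, where the header name `host` is an assumption), and call `acquire(host)` before issuing the HTTP request. Blocking a consumer thread this way also stalls messages for other hosts queued behind it, which is one reason to sort messages into per-site queues as suggested above.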
