
My understanding is that threads exist as a way of doing several things in parallel that share the same address space, but each with its own stack. Asynchronous programming is basically a way of using fewer threads. What I don't understand is why it's undesirable to just have blocking calls and a separate thread for each blocked operation.

For example, suppose I want to scrape a large part of the web. A presumably uncontroversial implementation would be to have a large number of asynchronous loops. Each loop would ask for a webpage, wait to avoid overloading the remote server, then ask for another webpage from the same website until done. The loops would then be executed on a much smaller number of threads (which is fine because they mostly wait). So to restate the question: I don't see why it's any cheaper to, e.g., maintain a thread pool within the language runtime than it would be to just have one (mostly blocked) OS thread per loop and let the operating system deal with the complexity of scheduling the work. After all, if piling two different schedulers on top of each other is so good, it can still be implemented that way in the OS :-)
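For concreteness, here is a minimal sketch of that design using Python's asyncio. The site list, fetch_page and the delays are hypothetical stand-ins, and the actual HTTP request is simulated with a sleep, since the point is that each loop spends almost all of its time waiting:

```python
import asyncio

async def fetch_page(url: str) -> None:
    # Stand-in for a real HTTP request; sleeping models the network wait.
    await asyncio.sleep(0.5)

async def scrape_site(site: str, pages: int) -> None:
    # One "loop" per website: fetch a page, wait politely, fetch the next.
    for n in range(pages):
        await fetch_page(f"{site}/page/{n}")
        await asyncio.sleep(1.0)  # politeness delay towards this site

async def main() -> None:
    sites = [f"https://example{i}.com" for i in range(1_000)]
    # 1,000 concurrent per-site loops, all multiplexed onto one OS thread.
    await asyncio.gather(*(scrape_site(site, pages=5) for site in sites))

asyncio.run(main())
```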

It seems obvious the answer is something like "threads are expensive". However, a thread just needs to keep track of where it had got to when it was last interrupted. This is pretty much exactly what an asynchronous operation needs to record before it blocks (perhaps representing what happens next by storing a callback). I suppose one difference is that an asynchronous operation blocks at a well-defined point, whereas a thread can be interrupted anywhere. If there really is a substantial difference in the cost of keeping that state, where does it come from? I doubt it's the cost of the stack, since that wastes at most a 4KB page, so it's a non-issue even for thousands of blocked threads.
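To make the "storing a callback" idea concrete, here is a rough sketch (again in Python, with hypothetical names) of the same wait-then-continue step written as an explicit callback: the continuation is just a small closure and a couple of arguments on the heap, rather than a suspended thread with its own stack.

```python
import asyncio

def scrape_site_cb(loop: asyncio.AbstractEventLoop, site: str, pages_left: int) -> None:
    # "Where the loop has got to" is simply the arguments captured here.
    if pages_left == 0:
        return
    print(f"fetched a page from {site}")
    # Nothing blocks: the event loop is asked to call us again later.
    loop.call_later(1.0, scrape_site_cb, loop, site, pages_left - 1)

loop = asyncio.new_event_loop()
scrape_site_cb(loop, "https://example.com", 3)
loop.call_later(5.0, loop.stop)   # stop the demo once the loop has finished
loop.run_forever()
loop.close()
```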

Many thanks, and sorry if the question is very basic. It might be I just cannot figure out the right incantation to type into Google.

1 Answer


Threads consume memory, because their state has to be preserved even while they aren't doing anything. If you make an asynchronous call on a single thread, then aside from a small record that the call is pending, it consumes essentially no resources until it needs to be resolved; while it isn't actively being processed, there is nothing you need to keep around for it.
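A rough way to see the asymmetry (a sketch, assuming CPython and asyncio; the exact numbers will vary): ten thousand pending operations are just ten thousand small heap objects tracked by one event loop on one thread, whereas ten thousand blocked OS threads would each need a kernel thread plus a reserved stack.

```python
import asyncio
import sys

async def pending_call():
    # Stands in for a long network wait; the coroutine is suspended here.
    await asyncio.sleep(3600)

async def main():
    tasks = [asyncio.create_task(pending_call()) for _ in range(10_000)]
    await asyncio.sleep(0)  # let every task start and suspend at its await
    print("one suspended coroutine object:",
          sys.getsizeof(tasks[0].get_coro()), "bytes")
    for task in tasks:
        task.cancel()
    await asyncio.gather(*tasks, return_exceptions=True)

asyncio.run(main())
```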

If your application's architecture is such that the resources it needs scale linearly (or worse) with the number of users or the amount of traffic you receive, that is going to be a problem. This talk about node.js goes into the issue at length:

https://www.youtube.com/watch?v=ztspvPYybIY


3 Comments

Many thanks. The first part of your answer makes sense. However, you agree that a blocked async call takes some resources. So the resource consumption will be linear in their number, much like it would with one thread per call. Hence the second part probably isn't right. The link helps, but it basically says "everyone knows that thread-per-connection is very bad" rather than why - the stack gets a mention but it's unclear why it matters since it takes up trivial space (or so it seems to me). Do I really not need asynchronous programming if I have a spare 1GB for the stacks?
A thread will usually have to create new instances of any variables it needs to access (which is what I meant by preserving state), which increases the cost in terms of memory. It is also not free for a processor to switch from one thread to another, whereas if you're processing events in a loop you never have to restage anything. While it is true that the pending requests have to be cached, the memory space for these is probably reserved up front and only increased under heavy usage. You could even write them to disk for particularly long-lived requests.
Many thanks for the followup. I think I probably understand how the basics work. I just don't know what fundamental reason there is for the OS not being able to store thread state nearly as concisely as a closure can. The need to allocate stack memory in multiples of the page size is an obvious candidate, but it's unclear that cost is worth avoiding. Pre-emptive vs co-operative multi-tasking could also be part of it, but I doubt that makes a huge difference. Perhaps the answer is just: no fundamental reason, but current OSs and language runtimes are not optimised to run many threads?
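For anyone who wants to measure rather than guess, here is a thread-per-loop counterpart to the earlier asyncio sketch (same hypothetical sites, with the blocking fetch again simulated by a sleep). Running both and comparing memory use and startup time on your own OS is probably the quickest way to see what the threads actually cost there:

```python
import threading
import time

def scrape_site_blocking(site: str, pages: int) -> None:
    # Each per-site loop blocks in the OS instead of yielding to an event loop.
    for n in range(pages):
        time.sleep(0.5)   # stands in for a blocking HTTP request
        time.sleep(1.0)   # politeness delay towards this site

threads = [
    threading.Thread(target=scrape_site_blocking,
                     args=(f"https://example{i}.com", 5))
    for i in range(1_000)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```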
