
I recently read about PHP's true async RFC.

The initial real-life "picture" (unrelated to PHP or sync processing) that I imagined was a waiter in a restaurant. The waiter works async: taking orders from customers, delivering the orders to the kitchen, cleaning up tables, bringing orders to the tables, collecting payment, and so on, all asynchronously.

Then I remembered the lost time that sits in between all of these processes, and I pictured this simple visual representation of it (time flows from top to bottom, like raindrops falling thanks to gravity):

[image: timeline of a single worker switching between tasks, with idle gaps while waiting for responses]

I noticed this also while using JavaScript to simulate some (automotive acceleration) graphs in parallel in my portal.

The idea I understood was that the worker can switch "tasks" only at certain points, NOT at the moment the response comes back from the DB or from the HTTP request. I also understood that the scope "sticks" with the task and not with the worker.

All is fine so far, but I wonder: is there anything better than this? This seems like gambling with lost and gained processing time.

Laravel Octane (if I understood it right) works in pretty much the same way, not waiting for DB or HTTP responses, but that is a different context (outside the scope of a single HTTP request or CLI command) than my question.

PHP 8.1 Fibers work in a similar way, with the difference that the developer decides (hardcoded in the code logic) when the worker should switch tasks/fibers.
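A minimal sketch of what I mean (the task body is illustrative): the switch happens exactly where the code calls Fiber::suspend(), not at the moment some I/O actually completes.

```php
<?php
// PHP 8.1 Fiber: the developer decides where the worker switches,
// by calling Fiber::suspend() explicitly inside the task's code.
$log = [];

$fiber = new Fiber(function () use (&$log): void {
    $log[] = 'task started';
    // Hand control back to the caller; this is the hardcoded switch point.
    $value = Fiber::suspend('waiting');
    $log[] = "task resumed with $value";
});

$state = $fiber->start();    // runs the task until its first suspend
$log[] = "main saw $state";  // $state === 'waiting'
$fiber->resume('db result'); // continue the task where it left off
```

The worker (the main script) only regains control at the suspend call; if the task never suspends, it runs to completion uninterrupted.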

I understand that the number of workers must be limited, and that starting as many workers as needed to eliminate the wasted time is unlikely to happen.

Still... if the scope is not carried around by the worker then, in theory, starting as many workers as needed could help eliminate the dead/unused time. Ideally (if possible), each worker that has no task to pick up would be stopped, and when there is no available worker to pick up a task, a new worker would be started.

[image: timeline where workers are started and stopped on demand, so no waiting time is wasted]

Of course, there should be a hard limit on the maximum number of workers that can coexist. When that hard limit is reached, the wasted time makes its way back into the picture.

I have no idea how this would behave; that is the reason for asking this question. Maybe async already works like this and I did not understand it correctly.

Another reason I ask is that the PHP RFC implies a huge amount of work that (maybe) would generate issues, and if the end result is not "perfect", I wonder what would happen if that amount of resources were spent on a better approach, better than async, if such a solution CAN exist.

What I tried, for example, is running the eager-loading queries from Eloquent concurrently via mysqli_poll for bind-less SELECTs, and asynchronously via curl_multi_init, but I saw that the time gained is less than the time spent initiating those parallel HTTP requests if each query generated a call. So I split the queries into two batches when the number of relations exceeds 30, for example.
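For reference, the curl_multi_init part of that experiment can be sketched like this (the URLs below are placeholders, not my real endpoints):

```php
<?php
// Sketch: run several HTTP calls concurrently with curl_multi_*.
// One PHP worker drives all transfers; it is never blocked on a
// single response while the others are in flight.
$urls = ['https://example.com/a', 'https://example.com/b'];

$multi = curl_multi_init();
$handles = [];
foreach ($urls as $url) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_multi_add_handle($multi, $ch);
    $handles[] = $ch;
}

// Drive all transfers until every one has finished.
do {
    $status = curl_multi_exec($multi, $running);
    if ($running) {
        curl_multi_select($multi); // sleep until at least one handle has activity
    }
} while ($running && $status === CURLM_OK);

$responses = [];
foreach ($handles as $ch) {
    $responses[] = curl_multi_getcontent($ch);
    curl_multi_remove_handle($multi, $ch);
    curl_close($ch);
}
curl_multi_close($multi);
```

The setup cost per handle is exactly the overhead I mention above: for many small queries it can exceed the time saved by overlapping them.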

So when I think of async, I always relate it to real-life needs that can be covered by it or by something similar to async.

UPDATE 2025.11.26

It seems the True Async RFC from PHP follows the first picture's logic, with a single thread. So my question is about that scenario: a request or a command situation.

UPDATE 2025.11.28

For the single-threaded request/command situation, the only case where a time improvement is possible is when a call is made to the outside with some info (not with the whole scope) and a waiting time is needed. So maybe going down that path could be a BETTER alternative to async (curl_multi_init, mysqli_poll, starting new parallel processes like Laravel does with its concurrency feature, the amphp/mysql package that already uses PHP Fibers, etc.).

This would be the currently doable situation via the above: [image: timeline of a single worker initiating external calls and idling until the responses arrive]

Can the worker start processing the response from call 1 as soon as it arrives, on the condition that call 3 has already been initiated while the worker waits? I mean in a current or future implementation of PHP. Something like this: [image: timeline where the worker handles call 1's response while call 3 is still pending]
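For HTTP calls at least, current PHP seems to allow exactly this via curl_multi_info_read(): each finished transfer is reported while the others are still in flight, so the worker can process call 1's response before calls 2 and 3 complete. A sketch (URLs are placeholders):

```php
<?php
// Sketch: drain completed transfers as they finish, without waiting
// for the remaining ones.
$multi = curl_multi_init();
$urls = ['https://example.com/1', 'https://example.com/2', 'https://example.com/3'];
foreach ($urls as $url) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_multi_add_handle($multi, $ch);
}

$done = 0;
do {
    curl_multi_exec($multi, $running);
    // Report each completed transfer immediately, even while others run.
    while ($info = curl_multi_info_read($multi)) {
        $body = curl_multi_getcontent($info['handle']);
        $done++; // process this response right now
        curl_multi_remove_handle($multi, $info['handle']);
        curl_close($info['handle']);
    }
    if ($running) {
        curl_multi_select($multi);
    }
} while ($running);
curl_multi_close($multi);
```

Whether the same interleaving is possible for DB work depends on the driver; mysqli_poll offers a similar "tell me which queries are ready" primitive.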

  • $\begingroup$ "Maybe the async is already like this": yes, I think so. I don't know about PHP but in general with async, the web server creates more worker threads than cores, and your second picture describes how it works: there's no wasted time. When all workers are busy and a response came back from DB, that task has to wait for the next available worker, but it also means that that worker was working on another task that will be served that much faster: it's not a waste. $\endgroup$ Commented Nov 25 at 11:50
  • This is also how Kotlin coroutines work, by spawning a new worker (if needed) for every continuation. (Commented Nov 25 at 13:47)
  • @EldritchConundrum thank you. By waste I meant that the time could have been used to process something. When the max worker limit is reached, that time is not used to process anything. (Commented Nov 25 at 16:28)

2 Answers


The initial real life "picture" that I imagined was a waiter in a restaurant. He/She/It works async, taking orders from the customers, delivering the orders to the kitchen...

This is a good analogy that I use frequently. The waiter is a worker, there is a list of tasks, some of the tasks are being performed by other workers / hardware / etc. There is a partial ordering of tasks -- the order can't go in until the customers are seated, and so on.

The idea that I understood was that the worker can switch "tasks" only at certain points NOT when the response came from DB or from the HTTP request.

Correct. In an asynchronous workflow in languages with an await operator, those "certain points" are the awaits. An await is a point where progress cannot be made on the current task until another task is completed, so we must asynchronously wait -- await -- the other task's completion.

Is there anything better than this?

Better by what metric for betterness?

We added asynchronous workflows to C# for two scenarios:

(1) the "worker" is also updating the user interface and must not be blocked for more than 30ms. Asynchrony is not to increase efficiency of the workflow, rather, it is to keep the UI responsive by creating many opportunities for the worker to refresh the UI. The time taken to perform any given task is not a concern as long as it gets done eventually.

(2) the "worker" is one of a limited number of available workers on a server doing compute-heavy work; if any worker becomes blocked on I/O, immediately reassign that worker to an available compute task; when the I/O completes, eventually a worker will be assigned to finish it up. Asynchrony is for increasing the efficiency of the workflow by assigning scarce workers to a large number of compute tasks. The time taken to complete any one task will be decreased if there are more workers available to take on tasks.

Note that the "await" operator does not create asynchrony. Its purpose is to make asynchronous workflows read more nicely on the page by clearly identifying where the suspension points are in the workflow, and letting the compiler deal with the resulting messy code generation.

Those were our metrics for betterness; what's yours?

If the scope is not carried around by the worker, in theory, starting as many workers as needed could help eliminating the dead/unused time. Ideal (if possible) each worker that does not have a task to pick on, should be stopped and also when there is no available worker to pick up a task, a new worker should be started.

That's my scenario 2 above. You have reinvented thread pools! Note that ideally we want only as many threads as there are CPUs available; otherwise we once again have your "stopped worker" problem, only now it's the OS scheduler suspending threads at moments of its choosing, not at awaits.

What you're describing is exactly how Active Server Pages manages compute-heavy workflows. Tasks are whenever possible NOT affinitized to any particular thread so that when one awaits, that thread can be returned to the pool and assigned to any task ready to have its completion executed. If threads are killed, new worker threads can be created. The thread manager self-balances to try and maintain high CPU utilization and low context switch penalty.

  • $\begingroup$ "Better by what metric for betterness?" For shortening the execution time of the request or command. "Those were our metrics for betterness; what's yours?" To give just an example, if a http call must update a table and also its related tables, if the request can respond in the maximum time needed for any of that updates to happen, that would be the target. Analog with how curl_multi_init works, when you call multiple endpoints. I have this situation in a rest lib I built where you update a resource (plain table) and the lib undecorates the request and updates the table + 1:1 related tables. $\endgroup$ Commented Nov 25 at 18:56
  • But now I remember that I do the update in a DB transaction, so maybe that is not the best example, because I am not sure how that would fit into the picture. (Commented Nov 25 at 19:02)
  • @marius-ciclistu: keep thinking about your restaurant analogy. You can optimize a restaurant for shortest time between customer arrival and first bite -- that's fast food. You can optimize for fastest turnover of tables, you can optimize for highest profit per night, and all of these require different techniques. All of these have analogues in your scenario. For example, when I was working on Active Server Pages back in the 1990s we were very concerned about "time to first byte" -- how long does it take from HTTP request to first byte of HTML back. Because the browser can start rendering then. (Commented Nov 25 at 21:37)
  • We could also optimize for time to last byte served. We could optimize for bytes served, for concurrent sessions, for web site "snappiness" in the UI, or for cost metrics on the back end. Any of these are possible and they all require different approaches. Asynchronous workflows are an important tool to have in your toolbox, but make sure you're attacking the right problem. (Commented Nov 25 at 21:39)

Is there anything better than this? Yes, there are: green processes, and call tree frames.

Processes

Let's start with processes. On a POSIX shell, it's perfectly possible to run things like this:

cat file | gunzip | sort | uniq | wc -l 

That is: read a file, uncompress it, sort the lines, filter out repeated lines, count those lines. Or in simple words, "a multi threaded count of the unique lines in a compressed file".

You may be asking where the "multi threaded" bit comes from. Well, on a machine with only one central processing unit (CPU), only one of these programs can run at a time. When cat runs, none of the others can run, and so on.

But there is a catch: what modern commercial or consumer machine nowadays comes with only one CPU?

If there is more than one CPU in the machine, it is perfectly possible to read (cat), uncompress (gunzip), sort (sort), remove repeated lines (uniq) and count (wc) separate parts of the data in a fully parallel manner.

Instead of "every one except one is blocked" design, there is "everyone is runnable", only data flow may modify or block the processing. Incidentally, sort will not output any data before receiving and processing all the data, so in this example the "max parallel index" is only 3, even though in theory all 5 processes are independent. (only cat gunzip sort and sort uniq wc will ever run in parallel.)

But making everything a separate process is not only a huge waste, it also causes huge inefficiencies. Each input and output byte stream is a serialization and deserialization point, but also a choke point. In this model, only fully complete messages can be exchanged, and they cannot be interrupted, otherwise cascading catastrophic failures are basically guaranteed to follow.

Green processes

So instead of full system processes, think of a multi-threaded language that does not offer threads, but "green processes" instead. That is, "threads without shared memory".

A language like this can eliminate all the manual serialization and deserialization code (though not the "serde" work itself), and it also eliminates the choke-point problem, as the compiled program can have many input/output points per inner process.

Think of Go, but without shared memory: everything that passes through an inter-process channel is fully, automatically serializable.

Call tree frames

This is nice, but we have not touched the async side of things yet. The call stack is a very nice and effective structure for storing call frames. But there is also a catch: the existence of call stack frames makes making everything async very hard.

So a true async language will probably not implement any form of native call stack, and will instead allocate all call data on what we call the heap.

This will not be as fast as allocating everything on call stacks, nor is that the objective. The objective is for this new waste to be vastly smaller than the waste that is eliminated by removing all the costs of locks, thread switches and bugs caused by shared memory in multi-threaded programs.

Decoupling concurrency and parallelism

async is very convenient for writing concurrent code, but multithreading with shared memory makes things hard. Keeping the parallelism while eliminating the shared memory is what green processes, with invisible serialization of simple data between internal processes, provide. The rest is syntax.

In other words: async for concurrency, processes for parallelism.

  • Thank you. From what I was able to understand, you described it from a multi-threaded perspective, while my question's ecosystem was a request or a command, which can be considered a "single thread" (maybe I am wrong). Your description falls more into the Octane (Swoole or RoadRunner) situation, at least that is what I understand from it. Anyway, thank you. (Commented Nov 25 at 18:49)
  • By the looks of it, your alternatives might be the answer, as PHP's async is described by the first picture, with just 1 worker and multiple coroutines... (Commented 2 days ago)
