Essentially, while the IO happens, the thread that encountered the await is free to pick up other requests. This improves the throughput of the web application. The fundamental reason is that IO is not done by the CPU but by the various IO devices in the machine (disk, network card, etc.); the CPU merely coordinates them. A synchronous call simply blocks an application thread waiting for the IO device to finish (essentially meaning one CPU core, it doesn't matter which, is performing the sync wait; OS scheduling has little effect on this outcome), which is not ideal for maximum throughput.
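The effect of that non-blocking wait can be sketched in Python's asyncio (the same model applies in .NET; `asyncio.sleep` is a stand-in for a real IO call such as a database query, and the 100 ms latency is an assumption for illustration):

```python
import asyncio
import time

async def handle_request(i: int) -> str:
    # Simulates an IO-bound call with an assumed 100 ms round trip.
    # While this coroutine awaits, the thread running the event loop
    # is free to run other requests.
    await asyncio.sleep(0.1)
    return f"response {i}"

async def main() -> float:
    start = time.perf_counter()
    # Ten "requests" overlap their IO waits instead of each one
    # blocking a thread for the full duration.
    await asyncio.gather(*(handle_request(i) for i in range(10)))
    return time.perf_counter() - start

elapsed = asyncio.run(main())
# Ten 100 ms waits overlap, so the total is close to 0.1 s, not 1 s.
```

Had each request performed a synchronous 100 ms wait on its own thread, you would need ten threads to get the same wall-clock time; here one thread suffices.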
1. Server binds the listener port.
2. A connection comes in.
3. A socket is created between server and client.
4. The request is assigned to a thread pool thread, and processing begins -> this is where your async happens.
5. The listener is again free to accept a new connection. Repeat steps 2-4.
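The steps above can be sketched with asyncio's stream API (an illustration, not the .NET implementation; the address and port are arbitrary choices, and `start_server` plays the role of the accept loop):

```python
import asyncio

async def handle_client(reader: asyncio.StreamReader,
                        writer: asyncio.StreamWriter) -> None:
    # Steps 3-4: a socket exists for this client and processing begins.
    data = await reader.readline()   # async IO: no thread is blocked here
    writer.write(b"echo: " + data)
    await writer.drain()
    writer.close()
    await writer.wait_closed()

async def main() -> None:
    # Step 1: bind the listener port (127.0.0.1:8888 is an arbitrary choice).
    server = await asyncio.start_server(handle_client, "127.0.0.1", 8888)
    # Steps 2-5: the internal accept loop creates a socket per incoming
    # connection, schedules handle_client for it, and immediately goes
    # back to accepting, so the listener is never tied up by one request.
    async with server:
        await server.serve_forever()

# asyncio.run(main())  # not called here: serve_forever runs indefinitely
```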
There's a hard limit on how many requests can be processed at the same time. As you correctly identified, there is a limit on how many sockets you can have open, and you cannot simply close a socket on someone; that is true. However, the limit on sockets is in the tens of thousands, whereas the limit on threads is in the thousands. So in order to fully saturate your sockets, which is the ideal of 100% hardware usage, you need to manage your threads better, which is where async-await comes in.
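The scale difference is easy to demonstrate: tens of thousands of concurrent waits are cheap as tasks, while a thread-per-connection design at that scale would exhaust the thread limit. A sketch in Python (the 50 ms sleep stands in for a socket read; exact limits vary by OS and runtime):

```python
import asyncio
import time

async def fake_connection(i: int) -> None:
    # Stands in for waiting on one of tens of thousands of open sockets.
    await asyncio.sleep(0.05)

async def main() -> None:
    # 10,000 concurrent "connections" as tasks on a handful of threads.
    # A thread-per-connection design would need 10,000 threads, beyond
    # the thousands a process can realistically host.
    await asyncio.gather(*(fake_connection(i) for i in range(10_000)))

start = time.perf_counter()
asyncio.run(main())
elapsed = time.perf_counter() - start
# All 10,000 waits overlap, so this completes in well under a second
# of wall-clock time rather than 10,000 x 50 ms.
```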
- IO is async in nature. The synchronous IO wait happens in application-level APIs and libraries (even if they are provided by the OS).
- Async-await allows applications to fully adapt to the async nature of IO.
- We are not talking about Task.Run here; its use case is different, and async-await is used there for convenience.
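That last distinction can be sketched in Python, where `run_in_executor` plays roughly the role Task.Run does in .NET: it pushes CPU-bound work onto a pool thread, whereas awaiting true IO costs no thread at all (the function names and workload sizes here are made up for illustration):

```python
import asyncio

def cpu_work(n: int) -> int:
    # CPU-bound: actually occupies a core; there is no IO device to wait on.
    return sum(i * i for i in range(n))

async def io_work() -> str:
    # IO-bound: the wait is handled by the OS/device, so awaiting it
    # does not tie up any thread.
    await asyncio.sleep(0.05)
    return "io done"

async def main() -> tuple[int, str]:
    loop = asyncio.get_running_loop()
    # The Task.Run analogue: offload CPU work to a pool thread so the
    # event loop stays responsive. The async-await syntax here is a
    # convenience; the work still burns a core somewhere.
    cpu_result = await loop.run_in_executor(None, cpu_work, 100_000)
    io_result = await io_work()
    return cpu_result, io_result

result = asyncio.run(main())
```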