Reading your question, I believe the confusion stems from where the replay attack is taking place. The nonce prevents replay attacks against the client (your application), not the authorization server.

I understand that you're using the code flow, but for the sake of simplicity, let's assume you're using the implicit flow in a single-page application and you have no back-end server (e.g. there's no server-side session). Here's the attack vector that the nonce helps to mitigate:

1. The client application redirects the user agent to the auth server with a `response_type` of `id_token`.
2. The user establishes identity with the auth server, i.e. logs in.
3. The auth server redirects the user agent back to the client application with an `id_token`. The response looks something like this: `https://your-single-page.app/auth#token_type=bearer&state=some-state&id_token=some-token`
4. An attacker obtains that response, possibly via packet sniffing, the client's server logs (e.g. those of the web server that's hosting your SPA), the browser's developer tools, shoulder surfing, or some other means.
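
To make step 4 concrete, here's a minimal sketch of how little a client needs from that URL. It's in Python purely for illustration, and the function name `parse_auth_response` is made up. Everything of interest sits in the fragment, so whoever holds the URL holds the `id_token`:

```python
from urllib.parse import parse_qs, urlsplit


def parse_auth_response(url: str) -> dict:
    """Return the parameters carried in the fragment of an authentication response."""
    fragment = urlsplit(url).fragment
    # parse_qs returns a list per key; keep the first value of each parameter.
    return {key: values[0] for key, values in parse_qs(fragment).items()}


# The example response from step 3, as an attacker might have captured it.
captured = "https://your-single-page.app/auth#token_type=bearer&state=some-state&id_token=some-token"
params = parse_auth_response(captured)
print(params["id_token"])  # some-token
```

A client that trusts whatever it finds here, with no further check, can't tell the original user from someone pasting the same URL.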

The nonce helps to prevent the attacker from taking the authorization server's response, pasting it into their URL bar, and establishing identity with your client application. Here's how:

1. The client application generates a cryptographically secure random nonce and stores it as-is (in clear text) in a cookie, session storage, or some other persistent location.
2. The client application hashes that nonce and sends the hash as an authentication request parameter.
3. When the client application handles the authentication response, it pulls and removes the nonce from persistent storage, hashes it, and compares the result against the `nonce` claim in the `id_token`. If they don't match, the client application refuses to establish identity.
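
The three steps above can be sketched roughly as follows. This is an illustration under a few assumptions, not a reference implementation: it's Python rather than browser JavaScript, a plain `dict` stands in for the cookie or session storage, SHA-256 with hex encoding is just one possible choice of hash, and all function names are hypothetical. It also assumes the ID token's signature has already been validated elsewhere; `read_claims` only decodes the payload.

```python
import base64
import hashlib
import hmac
import json
import secrets


def hash_nonce(nonce: str) -> str:
    """Hash the clear-text nonce (SHA-256, hex-encoded; an assumed choice)."""
    return hashlib.sha256(nonce.encode("utf-8")).hexdigest()


def start_authentication(storage: dict) -> str:
    """Steps 1 and 2: store the clear-text nonce, return the hash to send."""
    nonce = secrets.token_urlsafe(32)
    storage["nonce"] = nonce  # stand-in for a cookie or session storage
    return hash_nonce(nonce)  # goes into the `nonce` request parameter


def read_claims(id_token: str) -> dict:
    """Decode the JWT payload only. Signature validation is assumed done elsewhere."""
    payload = id_token.split(".")[1]
    payload += "=" * (-len(payload) % 4)  # restore the stripped base64url padding
    return json.loads(base64.urlsafe_b64decode(payload))


def finish_authentication(storage: dict, id_token: str) -> bool:
    """Step 3: pull and remove the nonce, hash it, compare with the token's claim."""
    nonce = storage.pop("nonce", None)  # removed, so it can't be used twice
    if nonce is None:
        return False
    claimed = read_claims(id_token).get("nonce")
    if not isinstance(claimed, str):
        return False
    # Constant-time comparison of the two hashes.
    return hmac.compare_digest(hash_nonce(nonce).encode(), claimed.encode())
```

With this in place, a replayed response fails in a browser that never called `start_authentication` (nothing in storage) and in one that did (a different nonce in storage).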

The attacker may have intercepted the response, including the `id_token`, but here the nonce effectively acts like a password for the client application. The attacker would need the clear-text nonce to directly establish identity with the client application. (I say "directly" because, depending on the application, an attacker may be able to bypass the nonce check.)

The same attack vector is present when using the code flow, although it's harder to exploit. As an example, imagine an attacker intercepts the [authentication response][1]. The attacker could then paste the response (the 302 location) into their URL bar and get your client application to make a token request. When the authorization server responds, your client application can verify the nonce in the ID token against something that your server has tied to your user's user agent (e.g. a cryptographically random value that's stored in an HTTP-only cookie). Again, the nonce acts as a password *for your client application*. Note that this exact attack vector only works if the authorization server does not check whether an authorization code has already been used. That check is optional in the spec: "If possible, verify that the Authorization Code has not been previously used."

In my opinion, it would be difficult to leverage this type of replay attack when using the code flow if the authorization server prevents authorization codes from being reused (and it should). However, replaying ID tokens with the implicit flow is trivial given access to authentication responses. That's why the nonce is optional in the code flow but required in the implicit flow.
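
For a sense of what "prevents authorization codes from being reused" could look like on the authorization server's side, here's a minimal sketch. `AuthorizationCodeStore` is a hypothetical, in-memory class invented for this example, and the 60-second lifetime is an arbitrary assumption. A real server would also need shared storage and would bind each code to a client and redirect URI.

```python
import secrets
import time


class AuthorizationCodeStore:
    """Hypothetical in-memory store that lets each authorization code be redeemed once."""

    def __init__(self, lifetime_seconds: float = 60.0):
        self._lifetime = lifetime_seconds
        self._codes = {}  # code -> (subject, expiry on the monotonic clock)

    def issue(self, subject: str) -> str:
        """Create a fresh, unguessable code for an authenticated subject."""
        code = secrets.token_urlsafe(32)
        self._codes[code] = (subject, time.monotonic() + self._lifetime)
        return code

    def redeem(self, code: str):
        """Return the subject for a fresh code, or None if unknown, used, or expired."""
        entry = self._codes.pop(code, None)  # pop makes the code single-use
        if entry is None:
            return None
        subject, expires_at = entry
        return subject if time.monotonic() <= expires_at else None
```

With a check like this, the attacker's replayed code is rejected at the token endpoint as long as the legitimate client redeemed it first.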

 [1]: https://openid.net/specs/openid-connect-core-1_0-final.html#AuthResponse