Without the state parameter, the client has no clue whether the authorization code it receives at the redirection endpoint actually belongs to the current user, or whether an attacker has tricked the user into making the request with a code from the attacker. As a result, the client may wrongly associate an attacker-controlled resource with the current user. If the user performs operations on the resource (e.g., adding sensitive data), the client will make corresponding requests to the resource server. As the attacker is the actual resource owner, they can, for example, see added data.
The attack works as follows:
- Attacker A creates a protected resource at some resource server.
- Then A picks a vulnerable client C (e.g., a website) which doesn't use the state parameter and is allowed to send OAuth requests to the resource server. Additionally, A chooses a victim V that is a user of the client.
- Now A triggers the OAuth procedure, pretending to give C access to the protected resource. This causes C to construct a URL for the authorization request and send A to that URL.
- A authenticates at the authorization server and gives C permission to access the resource. However, instead of following the redirection URL in the authorization response, A copies that URL (which includes the authorization code) and creates a link intended for the victim V.
- If V follows the link, this triggers a CSRF attack: V unknowingly makes a request to the redirection endpoint with an authorization code associated with the attacker-controlled resource.
- After C has received the authorization code through the redirection endpoint, it will request access to the protected resource, incorrectly assuming that the resource belongs to the current user V.
- Whenever V accesses “their” resource through C, they're actually accessing a resource controlled by A. For example, if V adds sensitive data to the resource, C forwards that data to the resource server, and A (who is the actual owner) can see it.
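The attack can be illustrated with a toy model. This is only a sketch: the dictionaries, the function name and the string values are all hypothetical stand-ins, and the real token exchange is reduced to a dictionary lookup.

```python
# Toy model of the attack; every name here is hypothetical.

# Authorization server's view: which resource owner each code was issued to.
issued_codes = {"code-from-A": "attacker"}

# Client's view: which resource owner is linked to each logged-in user.
linked_accounts = {}


def handle_redirect_without_state(session_user, code):
    """Vulnerable redirection endpoint: accepts any code it receives."""
    owner = issued_codes[code]  # stands in for the code-for-token exchange
    linked_accounts[session_user] = owner
    return owner


# V follows A's prepared link, so V's browser delivers A's code to the client.
handle_redirect_without_state("victim", "code-from-A")
print(linked_accounts)  # the victim is now linked to the attacker's resource
```

Nothing in the request lets the client tell that the code was issued during A's session rather than V's, which is exactly the gap the state parameter closes.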
The state parameter is a simple but effective way to fix this: When constructing the URL for the authorization request, the client generates a parameter which binds the request to the currently authenticated user and cannot be forged by an attacker. One possibility is a cryptographic hash of the current session ID (which no other user should know, preventing forgeries). Note that using the session ID itself would be a poor choice, as the URL may be transmitted or stored in an insecure manner (this is explicitly pointed out in the OAuth specification). A cryptographic hash, on the other hand, doesn't reveal the original input.
When the authorization server sends a user to the redirection endpoint, it includes the state parameter it received in the original authorization request, together with the authorization code. This allows the client to check whether the authorization code belongs to the current user or to a CSRF attacker who has tricked the current user into following a prepared link. This completely prevents the attack above, because the attacker cannot simply copy their own redirection URL, which includes their own state parameter. They would have to forge or guess the victim's state parameter, and this should be impractical (given a correct implementation).
So the client goes through the following steps when using the state parameter.
- When the current user wants to go through the OAuth procedure, the client generates a state parameter which is tied to the current user and cannot be forged, e.g., sha256(current_session_id).
- This parameter is added to the URL for the authorization request, together with response_type=code, the client_id, the redirect_uri etc. The user is sent to this URL as usual.
- When the authorization server sends the user back to the redirection URL, it includes both the authorization code and the original state. Now the client can check whether the state is associated with the current user, e.g., by checking state == sha256(current_session_id). If this is the case, then the request to the redirection URL has been made by the user as part of the normal OAuth procedure. If the values don't match, then something went wrong, and the client is supposed to reject the request. In the special case that the state belongs to a different user, this strongly indicates a CSRF attack.
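The steps above could be sketched in Python roughly as follows. This is a minimal illustration, not a complete client: the host names, the client ID and the function names are assumptions made for the example, and the session IDs are placeholder strings. The comparison uses hmac.compare_digest, a constant-time check from the standard library, instead of a plain ==.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical values, chosen only for this example.
AUTHORIZATION_ENDPOINT = "https://auth.example.com/authorise"
CLIENT_ID = "example-client"
REDIRECT_URI = "https://client.example.com/handle-auth"


def make_state(session_id: str) -> str:
    """Derive the state from the session ID without revealing the ID."""
    return hashlib.sha256(session_id.encode("utf-8")).hexdigest()


def build_authorization_url(session_id: str) -> str:
    """Assemble the authorization request URL, including the state."""
    params = {
        "response_type": "code",
        "client_id": CLIENT_ID,
        "redirect_uri": REDIRECT_URI,
        "state": make_state(session_id),
    }
    return AUTHORIZATION_ENDPOINT + "?" + urlencode(params)


def state_is_valid(session_id: str, returned_state: str) -> bool:
    """Check at the redirection endpoint that the state fits this session."""
    expected = make_state(session_id)
    return hmac.compare_digest(
        expected.encode("utf-8"), returned_state.encode("utf-8")
    )


# Normal flow: the state comes back unchanged for the same session.
url = build_authorization_url("victim-session-id")
returned = parse_qs(urlparse(url).query)["state"][0]
print(state_is_valid("victim-session-id", returned))  # True

# CSRF attempt: the link carries the state derived from the attacker's session.
attacker_state = make_state("attacker-session-id")
print(state_is_valid("victim-session-id", attacker_state))  # False
```

In the second case the client would reject the request and never exchange the authorization code.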
As to the diagram:
What you've shown isn't really vanilla OAuth but an authentication protocol on top of OAuth, probably OpenID Connect. The original OAuth isn't intended to provide a log-in via third party sites like Google but to share specific resources (like Instagram photos). The steps are still correct, as far as I can see. However, it's important that the client assembles the URL for the authorization request (which in your case goes to /authorise), so that's where the client generates the state parameter. This parameter is then checked at the redirection endpoint, in your case /handle-auth.