An ECDSA signature encodes the $(r,s)$ integers each in $[1,n)$, where $n$ is the order of the (sub)group generator. For a standard 256-bit prime curve one standard byte form for such signature is 64-byte, formed of the two components encoded big-endian over 32 bytes and concatenated. There are at least two other common ASN.1 formats: one re-encoding the previous form as a BITSTRING, another encoding $r$ and $s$ as a SEQUENCE of two INTEGERs.
Given a (message, signature) pair $(M,(r,s))$, the public key recovery operation allows to find one (of at least two) public keys $Q$ such that $M$ and $(r,s)$ pass verification against $Q$:
- Compute the hash $H$ of $M$ and convert it to an integer $e$ as in steps 2 and 3 of the signing operation.
- For each point $R$ of the curve which X coordinate $x_R$ is such that $r=\overline{x_R}\bmod n$ (where $\overline{x_R}$ is an integer representation of $x_R$, or just $x_R$ for prime curves)
- compute candidate public key $Q=r^{-1}(s\;\!R-e;\!G)$ and somewhat validate it.
When $R$ is a candidate point, so is $-R$ sharing the same X coordinate $x_R$, leading to a different $Q$. Another reason for possibly multiple $Q$ is that there can be up to $h+1$ candidates $\overline{x_R}$ with $r=\overline{x_R}\bmod n$, and among these conceivably several which are the X coordinate of a curve point (where the multiplicity $h$ is the ratio of the curve size to the order of the generator, with $h=1$ for standard prime curves, $h\in\{2,4\}$ for standard binary curves). For standard prime curves over the field $\mathbb F_p$, the primary candidate is $x_R=r$, and there could be the other $x_R=r+n$ only for curves with $n<p$ (e.g. secp256k1 but not secp224k1) and then only if $r$ lie in $[1,p-n)$ which is extremely rare for random $r\in[1,n)$ due to Hasse's theorem, and then only if $x_R=r+n$ corresponds to some valid point (I see no reason why that later condition would not be with probability about 50%).
Thus absent a mean to weed out invalid candidates, standard public key recovery operation can't recover the intended public key with certainty. It's ambiguous. We want to fix that.
In principle, in order to define a format for ECDSA signature allowing unambiguous public key recovery, we need to add:
- One bit to disambiguate between $R$ and $-R$, e.g. to tell if the first byte of the compressed representation of point $R$ is
02or03(the parity of $y_R$ for prime curves). Another option is to require the signer to use $R$ making that byte02. That's sometime done and enforced in order to avoid the issue that from a valid ECDSA signature of a message, it's easy to build one (single) other signature for the same message. - Extra data to reduce to a single $x_R$ for a given $r$. That could be the integer $j$ in $[0,h]$ with $\overline{x_R}=r+j\;\!n$. For prime curves that reduces to one bit, with the value $0$ in practical cases (so much that the other case could be ignored, or that it could be a requirement that $x_R<n$ when making the signature).
Questions:
- What are standard or common uses for ECDSA signature modified for unambiguous key recovery? What augmented signature format or other technique do they use?
- For some standard prime curve with $n<p$ like secp256k1, how hard would it be to build a valid (private key, message, signature) triplet $(d, M, (r,s))$ where the "extra data" is needed, that is with the ephemeral point's X coordinate $r+n$ rather than $n$ as usual?
Update: I'm told that Ethereum uses unambiguous public key recovery, for the curve secp256k1, thanks to a recovery identifier aka recID noted $v$, and formatting $(r,s,v)$ as a 65-byte bytestring with the last byte for $v$. I see conflicting information on the value of that byte, and the treatment of $x_R\ge n$: (311) in this states that $v$ can be in range $[0,3]$, but dismisses signature with $x_R\ge n$ as invalid thus reduce to $v\in\{0,1\}$ equal to the parity of $y_R$. That other source mentions adding the constant 0x1B=27, with values assigned for $x_R\ge n$ (by adding 2) and recovered public key in compressed format (by adding 4), and then more.