6
$\begingroup$

I have a chunk of data that I need to round-trip through a service that I don't trust, and I want to make sure the data hasn't been tampered with in-transit. I have limited memory and limited storage, so I can't hold onto a copy of the original data to compare against. I don't care about the secrecy of the data, only that it hasn't been modified. My data is already serialized as a byte stream. I don't need to demonstrate to any other service that this data hasn't been modified, only to myself.

As I understand it, the "right" way to do this is to store an asymmetric key locally. I send the service the data, an IV, and a signature of the data signed with my local key. When I get the bundle back I can use the data and the IV with my local key to verify that the signature is correct.

But signing things is time-expensive. Hashing things is faster, and I don't need to prove accuracy outside of my own application. Instead I would like to store some secret bytes. I send the service the data and a hash generated from the data and my secret. When I get the bundle back I can re-hash it using the data and my secret and compare the hashes. I never send the service my secret, so I don't see a way for them to tamper with the data and not break the hash.

Is there a problem doing this that I don't see?

Is it safe if I reuse the secret for all messages that I send to the service?

$\endgroup$
2
  • $\begingroup$ What about Schnorr's signature or HMAC? Of course, you can go for $H(m\|k)$ if you insist. $\endgroup$ Commented Oct 9 at 17:15
  • 1
    $\begingroup$ If you're only doing this once (or a known finite number of times), is there a reason you couldn't just store the hash of the data locally? $\endgroup$ Commented Oct 12 at 1:31

2 Answers

10
$\begingroup$

In short: use HMAC, your solution may be vulnerable to length extension attacks.

HMAC doesn't have a huge overhead, much smaller than signing. Signing and verifying can also be rather fast though; the speed is mainly an issue for many small messages.

For both solutions, signing with asymmetric cryptography or HMAC, the data passes through the hash function only once. Only the final step, computing the signature or tag from the hash value, differs in performance and efficiency.
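
To illustrate the single pass, here is a minimal sketch using Python's standard `hmac` module. It feeds the data to HMAC-SHA256 chunk by chunk, so memory use stays constant regardless of message size. The function name `hmac_stream` and the 64 KiB chunk size are assumptions made for this example, not anything from the question.

```python
import hashlib
import hmac
import io


def hmac_stream(key: bytes, stream, chunk_size: int = 64 * 1024) -> bytes:
    """Compute HMAC-SHA256 over a binary stream in a single pass."""
    mac = hmac.new(key, digestmod=hashlib.sha256)
    while chunk := stream.read(chunk_size):
        mac.update(chunk)  # each byte of data is hashed exactly once
    return mac.digest()


# Usage: the streamed tag equals the one-shot tag over the same bytes.
key = b"hypothetical local secret"
data = b"x" * 200_000
tag = hmac_stream(key, io.BytesIO(data))
assert tag == hmac.new(key, data, hashlib.sha256).digest()
```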

Note that you may also want to protect against replay attacks, e.g. by including a message sequence number.

$\endgroup$
1
  • 1
    $\begingroup$ Thank you, HMAC looks like the way to go. In this particular case replay attacks aren't a concern; we're just worried about whether the data originated from a trusted source, not whether the data is stale / duplicated. But good call-out anyway. $\endgroup$ Commented Oct 9 at 17:57
10
$\begingroup$

> Hashing things is faster, and I don't need to prove accuracy outside of my own application.

Then you don't need public-key signatures. The point of public-key signatures is that they enable separation of signing and verifying powers: you can enable one party to verify signatures without granting them the power to create signatures.

> Instead I would like to store some secret bytes. I send the service the data and a hash generated from the data and my secret. When I get the bundle back I can re-hash it using the data and my secret and compare the hashes.

What you do need is a message authentication code or MAC, such as HMAC-SHA256. A MAC takes a secret key and a message, and returns a short string variously called a (MAC) tag, an authenticator, or just a MAC itself where unambiguous. The security goal of a MAC is to prevent anyone who doesn't know the secret key from forging any authenticators on messages that weren't already authenticated using the secret key.
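
As a rough sketch of what that looks like in practice, the following uses Python's standard `hmac` module with HMAC-SHA256. The helper names `make_tag` and `verify` and the 32-byte key length are assumptions chosen for illustration.

```python
import hashlib
import hmac
import secrets


def make_tag(key: bytes, data: bytes) -> bytes:
    """Return the 32-byte HMAC-SHA256 authenticator for data."""
    return hmac.new(key, data, hashlib.sha256).digest()


def verify(key: bytes, data: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(make_tag(key, data), tag)


# The key never leaves the local application; only data and tag are sent out.
key = secrets.token_bytes(32)
data = b"serialized payload"
tag = make_tag(key, data)

assert verify(key, data, tag)             # untouched bundle is accepted
assert not verify(key, data + b"!", tag)  # modified data is rejected
```

`hmac.compare_digest` is used instead of `==` so that the comparison time does not reveal how many leading bytes of a forged tag were correct.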

If you naively use t = SHA256(k || m) as an authenticator, then an adversary who learns t can easily forge t' = SHA256(k || m || padding(length(k || m)) || m') for any suffix m', where padding(length(k || m)) is the public padding function of SHA-256; knowledge of k, or even of m, is not required to pull this off, only the length of k || m. This is called a length extension attack. HMAC thwarts length extension attacks. Newer designs like SHA-3 are not vulnerable to them either, and SHA-3 also has a dedicated MAC construction called KMAC. This is why you should use HMAC or similar rather than using SHA-256 directly as you described.
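
To make the attack concrete, here is a small self-contained sketch in Python. Since `hashlib` does not expose the internal state, it includes a bare-bones SHA-256 compression function (round constants derived from cube roots of the first 64 primes, as in the specification). All names (`sha256_padding`, `length_extend`, the sample key and messages) are hypothetical and exist only for this illustration. The attacker side uses only the tag and the length of k || m.

```python
import hashlib
import struct

M = 0xFFFFFFFF


def _primes(n):
    ps, c = [], 2
    while len(ps) < n:
        if all(c % p for p in ps):
            ps.append(c)
        c += 1
    return ps


# Round constants: first 32 fractional bits of the cube roots of the primes.
K = [int(((p ** (1 / 3)) % 1) * 2**32) for p in _primes(64)]


def _rotr(x, n):
    return ((x >> n) | (x << (32 - n))) & M


def _compress(state, block):
    """One SHA-256 compression step over a 64-byte block."""
    w = list(struct.unpack(">16I", block))
    for i in range(16, 64):
        s0 = _rotr(w[i - 15], 7) ^ _rotr(w[i - 15], 18) ^ (w[i - 15] >> 3)
        s1 = _rotr(w[i - 2], 17) ^ _rotr(w[i - 2], 19) ^ (w[i - 2] >> 10)
        w.append((w[i - 16] + s0 + w[i - 7] + s1) & M)
    a, b, c, d, e, f, g, h = state
    for i in range(64):
        t1 = (h + (_rotr(e, 6) ^ _rotr(e, 11) ^ _rotr(e, 25))
              + ((e & f) ^ (~e & g)) + K[i] + w[i]) & M
        t2 = ((_rotr(a, 2) ^ _rotr(a, 13) ^ _rotr(a, 22))
              + ((a & b) ^ (a & c) ^ (b & c))) & M
        a, b, c, d, e, f, g, h = (t1 + t2) & M, a, b, c, (d + t1) & M, e, f, g
    return [(x + y) & M for x, y in zip(state, (a, b, c, d, e, f, g, h))]


def sha256_padding(msg_len: int) -> bytes:
    """Public SHA-256 padding for a message of msg_len bytes."""
    return b"\x80" + b"\x00" * ((55 - msg_len) % 64) + struct.pack(">Q", msg_len * 8)


def length_extend(tag: bytes, orig_len: int, suffix: bytes):
    """From tag = SHA256(x) and len(x), return (glue, SHA256(x || glue || suffix))."""
    glue = sha256_padding(orig_len)
    state = list(struct.unpack(">8I", tag))  # resume hashing from the tag
    tail = suffix + sha256_padding(orig_len + len(glue) + len(suffix))
    for i in range(0, len(tail), 64):
        state = _compress(state, tail[i:i + 64])
    return glue, struct.pack(">8I", *state)


# Victim side: naive authenticator t = SHA256(k || m).
k, m = b"hypothetical secret", b"amount=10"
t = hashlib.sha256(k + m).digest()

# Attacker side: knows t and len(k || m), but not k.
glue, forged_tag = length_extend(t, len(k) + len(m), b"&amount=9999")
forged_msg = m + glue + b"&amount=9999"

# The naive verifier accepts the forgery.
assert hashlib.sha256(k + forged_msg).digest() == forged_tag
```

The `glue` bytes are just `0x80`, some zero bytes, and the 64-bit length, which is why the forged message contains a short run of gibberish between m and m'.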

Anyone who can verify authenticators, of course, can also create them—but that's not relevant in your application.

You may also need to prevent reordering and/or replay attacks: If an adversary operating the transit service records several messages with valid authenticators, the adversary can pass them on in any order, omitting some or repeating some. The messages they do pass on are all valid, but you may need to detect the reordering or replay. You might use a message sequence number for this. That requires some storage at the endpoints, but only enough to record the number of messages sent or received so far.

(A unique input per message is also required for some MACs, like Poly1305-AES or AES-GMAC—the message sequence number can serve for that purpose too, in addition to detecting reordering and replays.)
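
The sequence-number idea could be sketched as follows, assuming HMAC-SHA256, an 8-byte big-endian counter prepended to the data, and messages that are expected back strictly in order. The names `seal` and `Receiver` and the bundle layout are assumptions made for this example; a real application may need a looser ordering policy.

```python
import hashlib
import hmac
import struct

TAG_LEN = 32  # HMAC-SHA256 output size in bytes
SEQ_LEN = 8   # assumed size of the sequence-number header


def seal(key: bytes, seq: int, data: bytes) -> bytes:
    """Bundle = seq || data || HMAC(key, seq || data)."""
    header = struct.pack(">Q", seq)
    tag = hmac.new(key, header + data, hashlib.sha256).digest()
    return header + data + tag


class Receiver:
    """Accepts bundles only if authentic and in the expected order."""

    def __init__(self, key: bytes):
        self.key = key
        self.next_seq = 0  # the only state that has to be stored

    def open(self, bundle: bytes) -> bytes:
        if len(bundle) < SEQ_LEN + TAG_LEN:
            raise ValueError("bundle too short")
        body, tag = bundle[:-TAG_LEN], bundle[-TAG_LEN:]
        expected = hmac.new(self.key, body, hashlib.sha256).digest()
        if not hmac.compare_digest(tag, expected):
            raise ValueError("bad authenticator")
        (seq,) = struct.unpack(">Q", body[:SEQ_LEN])
        if seq != self.next_seq:
            raise ValueError("replayed, dropped, or reordered message")
        self.next_seq += 1
        return body[SEQ_LEN:]
```

Because the sequence number is covered by the tag, the service cannot renumber a message without invalidating its authenticator.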

$\endgroup$
1
  • 1
    $\begingroup$ I suggest writing t' = SHA256(k || m || padding || m') just to emphasize the sha256 padding only adds some gibberish to the end of the message and isn't an arbitrary function $\endgroup$ Commented Oct 12 at 12:46
