Security of "hash of hashes"

Question

Hello, people of Crypto Stack Exchange.

I am working on a project that hashes data blocks, and I have a question about hash functions that maybe someone here can clarify for me:

Let’s say I have a data package, Data, and over it I apply a hash function (for instance sha256), resulting in X:

X = sha256(Data)

Now suppose I break this data package into N pieces, Data1, Data2, Data3... DataN, and apply the same hash function to each piece; I will have:

h1 = sha256(Data1)

h2 = sha256(Data2)

h3 = sha256(Data3)

...

hN = sha256(DataN)

Finally, let’s say I apply the same hash function to the concatenation of the hashes h1, h2, h3, ..., hN, obtaining Y:

Y = sha256(h1 || h2 || h3 || ... || hN)
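To make the two constructions concrete, here is a minimal Python sketch of X and Y. The helper names (`split_blocks`, `hash_flat`, `hash_of_hashes`) and the fixed block size are assumptions made for illustration; the question does not specify an implementation, and the hashes h1 ... hN are assumed to be concatenated as raw 32-byte digests.

```python
import hashlib


def split_blocks(data: bytes, block_size: int) -> list[bytes]:
    """Split data into consecutive blocks; the last block may be shorter."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    # Assumption: empty input is treated as one empty block, so Y always
    # goes through two layers of hashing.
    return blocks or [b""]


def hash_flat(data: bytes) -> bytes:
    """X = SHA-256(Data)."""
    return hashlib.sha256(data).digest()


def hash_of_hashes(data: bytes, block_size: int) -> bytes:
    """Y = SHA-256(h1 || h2 || ... || hN), where hi = SHA-256(Data_i)."""
    outer = hashlib.sha256()
    for block in split_blocks(data, block_size):
        # Feed each 32-byte inner digest into the outer hash in order.
        outer.update(hashlib.sha256(block).digest())
    return outer.digest()
```

Note that X and Y are different values for the same Data, so a tool that only knows plain SHA-256 cannot verify Y without knowing the block size.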

Considering that the entire data package was processed by the sha256 function in obtaining both X and Y:

From a cryptographic perspective, is Y as secure as X?

If it is not, why?

Thanks in advance.

How is the number N of pieces chosen, and their individual sizes? What if Data has size a large prime, e.g. $2^{2^{41}}+1$ bytes? Is the hash of a short piece of Data (e.g. 64-byte) SHA-256(Data), or SHA-256(SHA-256(Data)) ? — fgrieu
– fgrieu ♦, Commented Nov 13, 2024 at 12:31
@fgrieu, the number N of pieces depends on the block size, which is chosen by the user. Say you have a 1 TB file and choose a block size of 50 MB; then N will be 1 TB / 50 MB ... The last block holds the remainder, so it can be smaller... The idea is fully described at thash.org, with some pictures to clarify. Thanks for the comment! — Toni Jr
– Toni Jr, Commented Nov 13, 2024 at 20:54

poncho · Accepted Answer · 2024-11-13 13:35:55Z

$Y$ is as secure as $X$ in this sense: if you can demonstrate a collision in $Y$, you can demonstrate a collision in $X$ (SHA256).

Here's how: suppose we have a collision in $Y$, that is, two nonidentical vectors $(Data1, Data2, ..., DataN)$ and $(Data1', Data2', ..., DataM')$ that produce the same $Y$.

If $N = M$ and $SHA256(DataI) = SHA256(DataI')$ for all $I$, then (because we assumed that the vectors were nonidentical, that is, there is an $I$ for which $DataI \ne DataI'$) we have a SHA256 collision $DataI, DataI'$.

If $N \ne M$ or if $SHA256(DataI) \ne SHA256(DataI')$ for some $I$, then we have a SHA256 collision between the inputs $SHA256(Data1) || ... || SHA256(DataN)$ and $SHA256(Data1') || ... || SHA256(DataM')$.

Hence, if we cannot find a collision in $SHA256$, we cannot find a collision in $Y$.
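The case analysis above can be sketched as a small Python function that turns a collision in $Y$ into a collision in the underlying hash. This is only an illustration: the names `extract_collision`, `hash_of_hashes` and `toy_hash` are made up here, and since no SHA-256 collision is known, the function takes the hash as a parameter so it can be tried with a deliberately weak 8-bit truncation.

```python
import hashlib
from typing import Callable, Sequence, Tuple

HashFn = Callable[[bytes], bytes]


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def toy_hash(data: bytes) -> bytes:
    """Deliberately weak 8-bit hash (truncated SHA-256), for demonstration only."""
    return hashlib.sha256(data).digest()[:1]


def hash_of_hashes(blocks: Sequence[bytes], h: HashFn = sha256) -> bytes:
    """Y = h(h(Data1) || ... || h(DataN))."""
    return h(b"".join(h(block) for block in blocks))


def extract_collision(vec_a: Sequence[bytes], vec_b: Sequence[bytes],
                      h: HashFn = sha256) -> Tuple[bytes, bytes]:
    """Given two different block vectors with the same Y, return two
    different inputs that collide under h."""
    vec_a, vec_b = list(vec_a), list(vec_b)
    if vec_a == vec_b:
        raise ValueError("the two vectors must differ")
    if hash_of_hashes(vec_a, h) != hash_of_hashes(vec_b, h):
        raise ValueError("the two vectors do not collide in Y")
    inner_a = [h(x) for x in vec_a]
    inner_b = [h(x) for x in vec_b]
    if inner_a != inner_b:
        # N != M, or some inner hash differs: the two concatenations are
        # different strings (h has fixed output length) with the same outer hash.
        return b"".join(inner_a), b"".join(inner_b)
    # N == M and all inner hashes agree: some pair of blocks differs,
    # and that pair collides under h.
    return next((x, y) for x, y in zip(vec_a, vec_b) if x != y)
```

With `toy_hash`, a collision in $Y$ is found by brute force after at most 257 tries, and `extract_collision` then returns a colliding pair for `toy_hash` itself.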

That said, from a practical standpoint, if I can compute Y much faster than X with the same level of security, there's no reason not to choose the method that yields Y over X, right? Thanks for the comment, @poncho . — Toni Jr
– Toni Jr, Commented Nov 13, 2024 at 21:36
Well, standardization is a thing. For starters, you should describe and possibly parameterize your protocol. Other tools will probably not work, as they don't know your protocol. — Maarten Bodewes
– Maarten Bodewes ♦, Commented Nov 14, 2024 at 21:45
Note: this security analysis requires that even for short messages, the hash uses two layers. If by exception the hash of a short piece of Data (e.g. 64-byte) was SHA-256(Data), it would be easy to exhibit a collision. — fgrieu
– fgrieu ♦, Commented Nov 15, 2024 at 7:24
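A short Python sketch of the pitfall described in the note above. The function `flawed_hash` and the 64-byte block size are hypothetical, chosen only to show what goes wrong when single-block inputs skip the outer layer; this is not the scheme from the question.

```python
import hashlib

BLOCK_SIZE = 64  # hypothetical block size, chosen to match the 64-byte example


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def two_layer(data: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """Always hash in two layers, even when the input fits in one block."""
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)] or [b""]
    return sha256(b"".join(sha256(block) for block in blocks))


def flawed_hash(data: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """Hypothetical flawed variant: inputs of at most one block use plain SHA-256."""
    if len(data) <= block_size:
        return sha256(data)
    return two_layer(data, block_size)


# A 128-byte message is split into two 64-byte blocks.
long_msg = bytes(128)
# The concatenation of its two inner digests is itself a 64-byte message.
short_msg = sha256(long_msg[:64]) + sha256(long_msg[64:])

# Both go through exactly one SHA-256 call on the same 64 bytes at the end.
collide_flawed = flawed_hash(long_msg) == flawed_hash(short_msg)
collide_two_layer = two_layer(long_msg) == two_layer(short_msg)
```

Under the flawed variant the two different messages hash to the same value, while the strict two-layer version keeps them apart.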

kodlu · Accepted Answer · 2024-11-13 19:45:42Z

Too long for a comment:

In a similar spirit to the other answer, $Y$ is as secure as $X$ in this sense: if you can demonstrate a first pre-image attack on $Y$, you can demonstrate a first pre-image attack on $X$ (SHA256). (This argument does not work for second pre-image attacks, as @poncho points out.)

Suppose we have a first pre-image attack on $Y$, that is, we can find some vector $(Data1, Data2, ..., DataN)$ such that

$$SHA256(DataI)=hI,\forall I\in \{1,\ldots,N\},$$ which means that for given $hI$ we can find $DataI$. This implies that for a given $X$ we can find $DataX$.

Note: A first pre-image attack is of course much harder than a collision attack. Generically, for SHA-256, it costs about $2^{256}$ hash evaluations versus about $2^{128}$, that is, the square of the collision complexity.

Note the odd fact that this argument doesn't work for a second preimage attack. Isn't crypto fun? — poncho
– poncho ♦, Commented Nov 13, 2024 at 16:15
