5
$\begingroup$

I was reading some guy's description of the SHA-256 algorithm and I noticed that the basic operations it uses appear trivial to reverse: addition, rotate bits, etc. When I say "reverse" I don't mean retrieving the exact input used to generate the hash, I mean finding some input which generates the same hash (finding a collision). I would like to understand why this is actually difficult.

Preimage resistance: For a given h in the output space of the hash function, it is hard to find any message x with H(x)=h. Source

Here is an general explanation for why a function can be easy to calculate in one way, but very difficult to reverse (calculate the other way). What in particular gives SHA-256 this property? If someone tried to generate a hash collision, which obstacles would they face?

SHA-256 diagram from Wikimedia commons

The initial values for A-H are predetermined in the algorithm. The actual input comes from W. In this sense the "initial state" can be thought of as a combination of user input and predetermined values. Does this contribute to the difficulty in finding collisions? If the initial values for A-H could be any arbitrary values, would it be trivial to find hash collisions?

$\endgroup$
2
  • 4
    $\begingroup$ $A \oplus B = C$. Say $C = 1$, what is the value of $A$ and $B$? Now you can find both combinations, but for much larger circuits executed multiple times (multiple rounds, in other terms) this gets much harder. $\endgroup$ Commented Aug 7, 2016 at 21:22
  • 1
    $\begingroup$ possibly related/interesting question/answers $\endgroup$ Commented Aug 8, 2016 at 2:25

1 Answer 1

8
$\begingroup$

You appear to be thinking "the output of the round function appears to be a simple function of the previous round, and a word from the message; why can't we select message words to steer the output to whatever we wanted".

Well, there's something missing in the simple description you gave; message expansion. The W inputs are not just directly taken from the input message; the first 16 are, but the other 48 are linear functions of the input message. If you modify one input word, you necessarily modify a number of the W inputs; this effectively stops this simple steering strategy, as tweaking one W input (to get a desired state) will necessarily modify others (which will modify the state you're trying to set up).

This doesn't prove the SHA-256 is secure (SHA-1 does the same sort of message expansion, and we know that's weak); however it does show that this approach doesn't appear to be fruitful.

$\endgroup$
1
  • 1
    $\begingroup$ Note: The message expansion function in SHA2 family, is not linear but is a non-linear transformation, which is reason why same attacks which work against SHA1 dont work for SHA2. $\endgroup$ Commented Jan 22, 2019 at 12:56

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.