26
\$\begingroup\$

Given a sequence of bytes, output the SHA-256 hash value of the sequence.

The SHA-256 Algorithm

The following pseudocode is taken from the Wikipedia page for SHA-2.

Note 1: All variables are 32 bit unsigned integers and addition is calculated modulo 2^32 Note 2: For each round, there is one round constant k[i] and one entry in the message schedule array w[i], 0 ≤ i ≤ 63 Note 3: The compression function uses 8 working variables, a through h Note 4: Big-endian convention is used when expressing the constants in this pseudocode, and when parsing message block data from bytes to words, for example, the first word of the input message "abc" after padding is 0x61626380 Initialize hash values: (first 32 bits of the fractional parts of the square roots of the first 8 primes 2..19): h0 := 0x6a09e667 h1 := 0xbb67ae85 h2 := 0x3c6ef372 h3 := 0xa54ff53a h4 := 0x510e527f h5 := 0x9b05688c h6 := 0x1f83d9ab h7 := 0x5be0cd19 Initialize array of round constants: (first 32 bits of the fractional parts of the cube roots of the first 64 primes 2..311): k[0..63] := 0x428a2f98, 0x71374491, 0xb5c0fbcf, 0xe9b5dba5, 0x3956c25b, 0x59f111f1, 0x923f82a4, 0xab1c5ed5, 0xd807aa98, 0x12835b01, 0x243185be, 0x550c7dc3, 0x72be5d74, 0x80deb1fe, 0x9bdc06a7, 0xc19bf174, 0xe49b69c1, 0xefbe4786, 0x0fc19dc6, 0x240ca1cc, 0x2de92c6f, 0x4a7484aa, 0x5cb0a9dc, 0x76f988da, 0x983e5152, 0xa831c66d, 0xb00327c8, 0xbf597fc7, 0xc6e00bf3, 0xd5a79147, 0x06ca6351, 0x14292967, 0x27b70a85, 0x2e1b2138, 0x4d2c6dfc, 0x53380d13, 0x650a7354, 0x766a0abb, 0x81c2c92e, 0x92722c85, 0xa2bfe8a1, 0xa81a664b, 0xc24b8b70, 0xc76c51a3, 0xd192e819, 0xd6990624, 0xf40e3585, 0x106aa070, 0x19a4c116, 0x1e376c08, 0x2748774c, 0x34b0bcb5, 0x391c0cb3, 0x4ed8aa4a, 0x5b9cca4f, 0x682e6ff3, 0x748f82ee, 0x78a5636f, 0x84c87814, 0x8cc70208, 0x90befffa, 0xa4506ceb, 0xbef9a3f7, 0xc67178f2 Pre-processing: append the bit '1' to the message append k bits '0', where k is the minimum number >= 0 such that the resulting message length (modulo 512 in bits) is 448. append length of message (without the '1' bit or padding), in bits, as 64-bit big-endian integer (this will make the entire post-processed length a multiple of 512 bits) Process the message in successive 512-bit chunks: break message into 512-bit chunks for each chunk create a 64-entry message schedule array w[0..63] of 32-bit words (The initial values in w[0..63] don't matter, so many implementations zero them here) copy chunk into first 16 words w[0..15] of the message schedule array Extend the first 16 words into the remaining 48 words w[16..63] of the message schedule array: for i from 16 to 63 s0 := (w[i-15] rightrotate 7) xor (w[i-15] rightrotate 18) xor (w[i-15] rightshift 3) s1 := (w[i-2] rightrotate 17) xor (w[i-2] rightrotate 19) xor (w[i-2] rightshift 10) w[i] := w[i-16] + s0 + w[i-7] + s1 Initialize working variables to current hash value: a := h0 b := h1 c := h2 d := h3 e := h4 f := h5 g := h6 h := h7 Compression function main loop: for i from 0 to 63 S1 := (e rightrotate 6) xor (e rightrotate 11) xor (e rightrotate 25) ch := (e and f) xor ((not e) and g) temp1 := h + S1 + ch + k[i] + w[i] S0 := (a rightrotate 2) xor (a rightrotate 13) xor (a rightrotate 22) maj := (a and b) xor (a and c) xor (b and c) temp2 := S0 + maj h := g g := f f := e e := d + temp1 d := c c := b b := a a := temp1 + temp2 Add the compressed chunk to the current hash value: h0 := h0 + a h1 := h1 + b h2 := h2 + c h3 := h3 + d h4 := h4 + e h5 := h5 + f h6 := h6 + g h7 := h7 + h Produce the final hash value (big-endian): digest := hash := h0 append h1 append h2 append h3 append h4 append h5 append h6 append h7 

Reference implementation

Here is a reference implementation, in Python 3:

#!/usr/bin/env python3 import sys # ror function modified from http://stackoverflow.com/a/27229191/2508324 def ror(val, r_bits): return (val >> r_bits) | (val << (32-r_bits)) % 2**32 h = [0x6a09e667, 0xbb67ae85, 0x3c6ef372, 0xa54ff53a, 0x510e527f, 0x9b05688c, 0x1f83d9ab, 0x5be0cd19] k = [0x428a2f98, 0x71374491, 0xb5c0fbcf, 0xe9b5dba5, 0x3956c25b, 0x59f111f1, 0x923f82a4, 0xab1c5ed5, 0xd807aa98, 0x12835b01, 0x243185be, 0x550c7dc3, 0x72be5d74, 0x80deb1fe, 0x9bdc06a7, 0xc19bf174, 0xe49b69c1, 0xefbe4786, 0x0fc19dc6, 0x240ca1cc, 0x2de92c6f, 0x4a7484aa, 0x5cb0a9dc, 0x76f988da, 0x983e5152, 0xa831c66d, 0xb00327c8, 0xbf597fc7, 0xc6e00bf3, 0xd5a79147, 0x06ca6351, 0x14292967, 0x27b70a85, 0x2e1b2138, 0x4d2c6dfc, 0x53380d13, 0x650a7354, 0x766a0abb, 0x81c2c92e, 0x92722c85, 0xa2bfe8a1, 0xa81a664b, 0xc24b8b70, 0xc76c51a3, 0xd192e819, 0xd6990624, 0xf40e3585, 0x106aa070, 0x19a4c116, 0x1e376c08, 0x2748774c, 0x34b0bcb5, 0x391c0cb3, 0x4ed8aa4a, 0x5b9cca4f, 0x682e6ff3, 0x748f82ee, 0x78a5636f, 0x84c87814, 0x8cc70208, 0x90befffa, 0xa4506ceb, 0xbef9a3f7, 0xc67178f2] s = sys.stdin.read().encode() msg = [int(x,2) for c in s for x in '{:08b}'.format(c)] msg.append(1) while len(msg) % 512 != 448: msg.append(0) msg.extend([int(x,2) for x in '{:064b}'.format(len(s) * 8)]) for i in range(len(msg)//512): chunk = msg[512*i:512*(i+1)] # sloth love chunk w = [0 for _ in range(64)] for j in range(16): w[j] = int(''.join(str(x) for x in chunk[32*j:32*(j+1)]),2) for j in range(16, 64): s0 = ror(w[j-15], 7) ^ ror(w[j-15], 18) ^ (w[j-15] >> 3) s1 = ror(w[j-2], 17) ^ ror(w[j-2], 19) ^ (w[j-2] >> 10) w[j] = (w[j-16] + s0 + w[j-7] + s1) % 2**32 work = h[:] for j in range(64): S1 = ror(work[4], 6) ^ ror(work[4], 11) ^ ror(work[4], 25) ch = (work[4] & work[5]) ^ (~work[4] & work[6]) temp1 = (work[7] + S1 + ch + k[j] + w[j]) % 2**32 S0 = ror(work[0], 2) ^ ror(work[0], 13) ^ ror(work[0], 22) maj = (work[0] & work[1]) ^ (work[0] & work[2]) ^ (work[1] & work[2]) temp2 = (S0 + maj) % 2**32 work = [(temp1 + temp2) % 2**32] + work[:-1] work[4] = (work[4] + temp1) % 2**32 h = [(H+W)%2**32 for H,W in zip(h,work)] print(''.join('{:08x}'.format(H) for H in h)) 

Restrictions

  • The usage of builtins which compute SHA-256 hashes or otherwise trivialize the challenge is forbidden
  • The input and output may be in any reasonable format (encoded in a single-byte encoding, base64, hexadecimal, etc.)

Test cases

<empty string> -> e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 abc -> ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad Hello, World! -> c98c24b677eff44860afea6f493bbaec5bb1c4cbb209c6fc2bbb47f66ff2ad31 1234 -> 03ac674216f3e15c761ee1a5e255f067953623c8b388b4459e13f978d7c846f4 
\$\endgroup\$

5 Answers 5

9
+200
\$\begingroup\$

J, 458 445 443 438 435 430 421 bytes

3 :0 B=.32#2 A=.B&#: P=.+&#. 'H K'=.A<.32*&2(-<.)2 3%:/p:i.64 for_m._512]\(,(1{.~512|448-#),(,~B)#:#)y do.w=.(,B#:(15&|.~:13&|.~:_10|.!.0])@(_2&{)P/@,(25&|.~:14&|.~:_3|.!.0])@(_15&{),_7 _16&{)^:48]_32]\m 'a b c d e f g h'=.H=.8{.H for_t.i.64 do.u=.A]P/((e<g)~:e*f),h,(~:/26 21 7|."{e),t{&>K;w v=.A(a*b)P(c*a~:b)P~:/30 19 10|."{a h=.g g=.f f=.e e=.A]d P u d=.c c=.b b=.a a=.A]u P v end. H=.A]H P a,b,c,d,e,f,g,:h end. ,H ) 

Try it online!

This is a monadic verb that takes a list of bits as input and outputs a list of bits. On TIO, conversion from string to list of bits for the input and list of bits to hexadecimal is implemented for convenience. To test other input, just modify the text in the input field.

\$\endgroup\$
5
  • \$\begingroup\$ This is beautiful \$\endgroup\$ Commented Jan 7, 2017 at 12:12
  • \$\begingroup\$ @Mego Thanks, there's still some redundant parts that could probably be golfed away to maybe save 10 or so bytes. \$\endgroup\$ Commented Jan 7, 2017 at 12:20
  • \$\begingroup\$ I'd love to see a tacit version :P \$\endgroup\$ Commented Jan 10, 2017 at 13:00
  • \$\begingroup\$ If only 13 supported multi-line definitions and assignments also. \$\endgroup\$ Commented Jan 10, 2017 at 15:37
  • \$\begingroup\$ As of J 8.06, there is now a builtin for using SHA-256 and other hashes, 128!:6. Example \$\endgroup\$ Commented Aug 23, 2018 at 10:55
9
\$\begingroup\$

Python 2, 633 bytes

n=range f=2**32 q=512 r=lambda v,b:v%f>>b|(v<<32-b)%f t=int g=lambda e:[t(x**e%1*f)for x in n(2,312)if 383**~-x%x==1] h=g(.5) k=g(1/3.) m=map(t,input()) l=len(m) m+=[1]+[0]*((447-l)%q)+map(t,'{:064b}'.format(l)) for i in n(l/q+1): c=m[q*i:][:q];w=[t(`c[j*32:][:32]`[1::3],2) for j in n(16)];x=h[:8] for j in n(48):a,o=w[j+1],w[j+14];w+=[(w[j]+(r(a,7)^r(a,18)^(a>>3))+w[j+9]+(r(o,17)^r(o,19)^(o>>10)))%f] for j in n(64):a,o=x[::4];d=x[7]+(r(o,6)^r(o,11)^r(o,25))+(o&x[5]^~o&x[6])+k[j]+w[j];e=(r(a,2)^r(a,13)^r(a,22))+(x[1]&a|x[2]&a|x[1]&x[2]);x=[d+e]+x[:7];x[4]+=d h=[(H+W)%f for H,W in zip(h,x)] print''.join('%08x'%H for H in h) 

This solution is the result of a collaboration between myself, Leaky Nun, and Mars Ultor. As such, I've made it community wiki out of fairness. It takes input as a binary string wrapped in quotes (e.g. '011000010110001001100011' for abc) and outputs a hex string.

\$\endgroup\$
1
  • 2
    \$\begingroup\$ At least it's clear how it works? :) \$\endgroup\$ Commented Jun 3, 2016 at 22:35
9
\$\begingroup\$

Python 2, 519 bytes

Q=2**32 G=lambda e:[int(x**e%1*Q)for x in range(2,312)if 383**~-x%x<2] H=G(.5)[:8] r=lambda v,b:v>>b|v<<32-b M=input() l=len(M) M+=bin(l|1<<(447-l)%512+64)[2:] while M:j=0;a,b,c,d,e,f,g,h=H;exec"H+=int(M[:32],2),;M=M[32:];"*16+"x=H[-15];y=H[-2];H+=(H[-16]+H[-7]+(r(y,17)^r(y,19)^y>>10)+(r(x,7)^r(x,18)^x/8))%Q,;"*48+"u=(r(e,6)^r(e,11)^r(e,25))+(e&f^~e&g)+h+G(1/3.)[j]+H[j+8];X=a,b,c,d,e,f,g,h=(u+(r(a,2)^r(a,13)^r(a,22))+(a&b^a&c^b&c))%Q,a,b,c,(d+u)%Q,e,f,g;j+=1;"*64;H=tuple(a+b&Q-1for a,b in zip(H,X)) print"%08x"*8%H 

I was working off the pseudocode, but some parts just ended up being the same as the golfed reference Mego posted since there's not much to golf (e.g. the constant tables, for which the only real golf was a <2 instead of ==1). Over 100 bytes down though, but I'm sure there's still more to be gotten.

Input/output is also string of bits to hex string.

\$\endgroup\$
2
  • \$\begingroup\$ I think you can alias int, which would save 2B. I'm not that familiar with golfing in Python, so there might still be a few more bytes you can save. Also, what does the x**e%1*Q do? I ran some tests with some random values for x and e, but it always returned 0... \$\endgroup\$ Commented Jan 7, 2017 at 18:28
  • \$\begingroup\$ @L.Serné int is only used twice, so aliasing wouldn't save anything. x**e%1 gives fractional part, so you'd have to test with fractional e for the intended effect. \$\endgroup\$ Commented Jan 8, 2017 at 9:11
7
\$\begingroup\$

C (clang) -lm, 1913 ... 839 bytes

#define q unsigned l #define I x-> #define D(c)I b[1]+=*I b>1<<32-c,*I b+=c #define R(a,b)(a>>b|a<<32-b) #define S(x,a,b)(R(x,a)^R(x,b)^x>> #define G(b);for(o=0;o<b;o++) #define V(a,b){q=0;for(n=2;k=l<=o;k/n?r=n,l++:0,n++)while(n%++k);b(1L<<32)*a(r);} #define B bzero(I d typedef struct{char d[64];q,b[2],s[8];}X;q,k,n,r,o,a[8],t,m[64];A(X*x){bcopy(I s,a,32)G(64){t=a[4];t=a[7]+(R(t,6)^t^S(t,11,25)0))+(t&a[5]^~t&a[6])+(m[o]=o<16?htonl(o[(int*)I d]):S(m[o-2],17,19)10)+m[o-7]+S(m[o-15],7,18)3)+m[o-16]),l=R(*a,2)^*a^S(*a,13,22)0),l+=*a&a[1]^(*a^a[1])&a[2],bcopy(a,a+1,28);V(cbrt,a[4]+=t+=)*a=t+l;}G(8)I s[o]+=a[o];}i(X*x){B,88)G(8)V(sqrt,I s[o]=)}p(X*x,char*w,q){G(l)I d[I l]=w[o],++I l>63?A(x),D(32),I l=0:0;}f(X*x,*h){I d[o=I l]=128;B+ ++o,56+o/57*8-o);o>56?A(x),B,56):0;D(I l*8)G(8)I d[63-o]=I b[o/4]>>o%4*8;A(x)G(8)o[h]=htonl(I s[o]);} 

Try it online!

Thanks ceilingcat for a lot of improvements, I didn't expect to get under 1Kb.

Old version - before ceilingcat's improvements

C, 1913 1822 bytes (just for fun)

#define q unsigned #define D(a,c)x->b[1]+=a>1<<33-1-c;a+=c; #define R(a,b)(a>>b|a<<32-b) #define S(x,a,b,c)(R(x,a)^R(x,b)^x>>c) #define W(i,a)i=x->s[a]; #define Y(i,a)x->s[a]+=i; #define Z(_,a)h[i+a*4]=x->s[a]>>(24-i*8); #define J(a)x->d[a] #define T(_,a)x->d[63-a]=x->b[a/4]>>8*(a%4); #define Q(_,a)x->s[a]=v[a]; #define A(F)F(a,0)F(b,1)F(c,2)F(d,3)F(e,4)F(f,5)F(g,6)F(h,7) #define G(a,b)for(i=a;i<b;++i) typedef struct{q char d[64];q l,b[2],s[8];}X;q k[]={0x428a2f98,0x71374491,0xb5c0fbcf,0xe9b5dba5,0x3956c25b,0x59f111f1,0x923f82a4,0xab1c5ed5,0xd807aa98,0x12835b01,0x243185be,0x550c7dc3,0x72be5d74,0x80deb1fe,0x9bdc06a7,0xc19bf174,0xe49b69c1,0xefbe4786,0x0fc19dc6,0x240ca1cc,0x2de92c6f,0x4a7484aa,0x5cb0a9dc,0x76f988da,0x983e5152,0xa831c66d,0xb00327c8,0xbf597fc7,0xc6e00bf3,0xd5a79147,0x06ca6351,0x14292967,0x27b70a85,0x2e1b2138,0x4d2c6dfc,0x53380d13,0x650a7354,0x766a0abb,0x81c2c92e,0x92722c85,0xa2bfe8a1,0xa81a664b,0xc24b8b70,0xc76c51a3,0xd192e819,0xd6990624,0xf40e3585,0x106aa070,0x19a4c116,0x1e376c08,0x2748774c,0x34b0bcb5,0x391c0cb3,0x4ed8aa4a,0x5b9cca4f,0x682e6ff3,0x748f82ee,0x78a5636f,0x84c87814,0x8cc70208,0x90befffa,0xa4506ceb,0xbef9a3f7,0xc67178f2},v[]={0x6a09e667,0xbb67ae85,0x3c6ef372,0xa54ff53a,0x510e527f,0x9b05688c,0x1f83d9ab,0x5be0cd19};a(X*x){q a,b,c,d,e,f,g,h,i,t,z,m[64];G(z=0,16)m[i]=J(z++)<<24|J(z++)<<16|J(z++)<<8|J(z++);G(i,64)m[i]=S(m[i-2],17,19,10)+m[i-7]+S(m[i-15],7,18,3)+m[i-16];A(W);G(0,64){t=h+(R(e,6)^R(e,11)^R(e,25))+(e&f^~e&g)+k[i]+m[i];z=(R(a,2)^R(a,13)^R(a,22))+(a&b^a&c^b&c);h=g;g=f;f=e;e=d+t;d=c;c=b;b=a;a=t+z;}A(Y)}i(X*x){x->l=*x->b=x->b[1]=0;A(Q)}p(X*x,char*w,q l){q t,i;G(0,l){J(x->l)=w[i];if(++x->l==64){a(x);D(*x->b,512)x->l=0;}}}f(X*x,char*h){q i=x->l;if(i<56){J(i++)=128;G(i,56)J(i)=0;}else{J(i++)=128;G(i,64)J(i)=0;a(x);G(0,56)J(i)=0;}D(*x->b,x->l*8)A(T)a(x);G(0,4){A(Z)}} 

I took the reference implementation and started golfing, my target was under 2k.

Can be improved if somebody knows how to generate the constants (cube root of primes, I can't think of any golf-friendly way).

Usage:

X ctx; unsigned char hash[32]; i(&ctx); // initialize context p(&ctx,text,strlen(text)); // hash string f(&ctx,hash); // get hash 
\$\endgroup\$
2
  • \$\begingroup\$ Golfed constant generator (still golfable though). Try it online! \$\endgroup\$ Commented Aug 23, 2018 at 14:43
  • \$\begingroup\$ Suggest (*R)()=L"\xd3ce8797쏈"; instead of #define R(a,b)(a>>b|a<<32-b). You'll need to add -Wl,-z,execstack to the compiler flags \$\endgroup\$ Commented Nov 17 at 5:56
5
\$\begingroup\$

Python 3.8+, 481 bytes

Q=2**32-1;_=Q+2;G=lambda e:[int(x**e%1*-~Q)for x in range(2,312)if 2**x%x==2%x];H=G(.5)[:8];l=len(M:=input());M+=f"{l|1<<(447-l)%512+64:b}" while M:a,b,c,d,e,f,g,h=H;exec("H+=int(M[:32],2),;M=M[32:]\n"*16+"for k in G(1/3):x,y=H[9::13];H+=H[17]+H[8]+(y*_>>17^y*_>>19^y>>10)+(x*_>>7^x*_>>18^x>>3)&Q,;u=(e*_>>6^e*_>>11^e*_>>25)+(e&f^~e&g)+h+k+H.pop(8);Y=a,b,c,d,e,f,g,h=u+(a*_>>2^a*_>>13^a*_>>22)+(a&b^a&c^b&c)&Q,a,b,c,d+u&Q,e,f,g");H=[Q&sum(p)for p in zip(H,Y)] print('%08x'*8%(*H,)) 

Takes binary string input, outputs hex.

Key optimizations:

  • Q=2**32-1: (Credit to @emanresu A) Replacing T=2**32 with the mask Q allows us to perform modular arithmetic using bitwise AND (&Q). Since & has lower precedence than +, we can remove the parentheses required by %T, saving multiple bytes in the exec loop.
  • _=Q+2: This reconstructs the rotation multiplier (2**32+1) relative to Q.
  • 2**x%x==2%x: A golfed Base-2 Fermat primality test used to generate constants algorithmically.
  • f"{...:b}": Replaces bin(...)[2:] for a shorter padding construction.
  • exec Loop Injection: Instead of multiplying the compression string by 64, a python for loop is injected directly into the exec string.
  • x,y=H[9::13]: (Credit to @emanresu A) Assigns the temporary variables using a list slice with a step.
  • H.pop(8): Uses the H list as a sliding window/queue for the message schedule.

Try it online!

\$\endgroup\$
9
  • \$\begingroup\$ You should be able to omit most of the %Ts in the assignments for Y and H in the loop, since you modulo them by T when combining H and Y. Also, I'm not sure you need the last a,b,c,d,e,f,g,h= at all, and you might want to add a TIO link so others can test your code. \$\endgroup\$ Commented Nov 19 at 10:43
  • \$\begingroup\$ @emanresu-a Ran your "optimizations" through the harness. They break the bitwise logic. 1. %T Removal: The *_>> rotation trick is brittle. It relies on inputs being clean 32-bit integers. Without the modulo, accumulator overflow bleeds into the high bits, corrupting the multiplication-based shift. 2. Y Assignment: That's not dead code; it captures the final register state (a through h) for the Davies-Meyer feed-forward (zip(H,Y)). You can't skip it without breaking the hash chain. Baseline holds at 486. \$\endgroup\$ Commented Nov 19 at 15:43
  • \$\begingroup\$ Ah, I missed how the loop in the exec works. In any case, you can save two bytes by replacing x=H[9];y=H[22] with x,y=H[9::13] (since the buffer length is limited, this won't error as far as I can tell) \$\endgroup\$ Commented Nov 19 at 18:24
  • \$\begingroup\$ And replacing 1<<(447-l)%512+64 with T*T<<(447-l)%512 saves another byte \$\endgroup\$ Commented Nov 19 at 18:42
  • \$\begingroup\$ @emanresu-a Brilliant catch on the slice and the padding. H[9::13] works perfectly within the buffer limits, and using T*T (aka 2**64) to handle the +64 offset is very clever. Implemented both for a drop to 483. Thanks! \$\endgroup\$ Commented Nov 19 at 19:32