If you take the quarter-round function from ChaCha/BLAKE (a 128-bit permutation with 32-bit words, or a 256-bit permutation with 64-bit words, with different rotation constants) and use BLAKE1/2/3's message-injection approach as the key schedule, would you have a sound 128/256-bit keyed permutation/block cipher?
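To make the construction concrete, here is a minimal sketch of the 32-bit variant. It uses the BLAKE2s `G` function (rotations 16, 12, 8, 7) with key words added where BLAKE adds message words. The round count `ROUNDS` and the order in which key words are injected are assumptions made up for illustration, not taken from any specification, and nothing here is a claim about security.

```python
MASK32 = 0xFFFFFFFF
ROUNDS = 10  # hypothetical round count, chosen only for illustration


def rotr32(x, n):
    return ((x >> n) | (x << (32 - n))) & MASK32


def rotl32(x, n):
    return ((x << n) | (x >> (32 - n))) & MASK32


def g(a, b, c, d, kx, ky):
    """BLAKE2s-style G on four 32-bit words; kx, ky take the place of message words."""
    a = (a + b + kx) & MASK32
    d = rotr32(d ^ a, 16)
    c = (c + d) & MASK32
    b = rotr32(b ^ c, 12)
    a = (a + b + ky) & MASK32
    d = rotr32(d ^ a, 8)
    c = (c + d) & MASK32
    b = rotr32(b ^ c, 7)
    return a, b, c, d


def g_inv(a, b, c, d, kx, ky):
    """Undo g step by step, in reverse order."""
    b = rotl32(b, 7) ^ c
    c = (c - d) & MASK32
    d = rotl32(d, 8) ^ a
    a = (a - b - ky) & MASK32
    b = rotl32(b, 12) ^ c
    c = (c - d) & MASK32
    d = rotl32(d, 16) ^ a
    a = (a - b - kx) & MASK32
    return a, b, c, d


def round_keys(key, r):
    # Hypothetical schedule: walk through the four key words two at a time.
    return key[(2 * r) % 4], key[(2 * r + 1) % 4]


def encrypt(block, key):
    for r in range(ROUNDS):
        block = g(*block, *round_keys(key, r))
    return block


def decrypt(block, key):
    for r in reversed(range(ROUNDS)):
        block = g_inv(*block, *round_keys(key, r))
    return block
```

Since every step of `g` is invertible for fixed key words, the whole thing is a permutation of the 128-bit block for any key; whether it is a *sound* one is exactly the question.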
Something faintly similar is done in Serpent, whose linear transformation mixes the four 32-bit words of the state. In Serpent this step is just for diffusion, but both constructions have the same 4-word state.
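For comparison, Serpent's linear transformation on its four 32-bit words can be sketched like this (rotations, shifts and XORs only, no additions). The function and helper names are my own.

```python
MASK32 = 0xFFFFFFFF


def rotl32(x, n):
    return ((x << n) | (x >> (32 - n))) & MASK32


def serpent_lt(x0, x1, x2, x3):
    """Serpent's linear transformation: mixes the four words with rotates, shifts, XORs."""
    x0 = rotl32(x0, 13)
    x2 = rotl32(x2, 3)
    x1 ^= x0 ^ x2
    x3 ^= x2 ^ ((x0 << 3) & MASK32)
    x1 = rotl32(x1, 1)
    x3 = rotl32(x3, 7)
    x0 ^= x1 ^ x3
    x2 ^= x3 ^ ((x1 << 7) & MASK32)
    x0 = rotl32(x0, 5)
    x2 = rotl32(x2, 22)
    return x0, x1, x2, x3
```

A single set bit in `x0` already reaches all four output words, e.g. `serpent_lt(1, 0, 0, 0)` has no zero word.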
Something I was never clear on is why Serpent went to so much trouble to linearly mix the 4 words when its S-boxes already did that. Linear diffusion could have been restricted to within each word, similar to how Ascon does it (https://ascon.isec.tugraz.at/specification.html).
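What I mean by diffusion restricted to inside the words is something like Ascon's linear layer, where each 64-bit word is XORed with two rotations of itself and no word touches another. A rough sketch, with names of my own choosing:

```python
MASK64 = (1 << 64) - 1

# Rotation pairs for Ascon's five 64-bit state words x0..x4.
ASCON_ROTATIONS = ((19, 28), (61, 39), (1, 6), (10, 17), (7, 41))


def rotr64(x, n):
    return ((x >> n) | (x << (64 - n))) & MASK64


def ascon_linear_layer(state):
    """Apply x ^ (x >>> r1) ^ (x >>> r2) to each word independently."""
    return tuple(
        x ^ rotr64(x, r1) ^ rotr64(x, r2)
        for x, (r1, r2) in zip(state, ASCON_ROTATIONS)
    )
```

Here a difference in one word stays in that word; the mixing across words is left entirely to the S-box layer.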

