@TimFelixBeyer (Contributor) commented Sep 6, 2024

For use_xpos=True:
The i-th 2x2 rotation block should be scaled by the i-th chi/scale value, so both dimensions of a block share the same scale.
Previously, the scales were interleaved incorrectly, so the results differed from the reference implementation.

One way to see the issue is to take a unit vector and plot its norm after being rotated:
[Image: plot of the norm of a unit vector after rotation]

With this change, the output is consistent with the reference implementation.
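
The norm check described above can be reproduced with a small standalone sketch. This is not the library's code: it is plain Python, it assumes each 2x2 block rotates the adjacent pair of dimensions (2i, 2i+1), and the helper names (`rotate_and_scale`, `per_block`, `mismatched`) and scale values are made up for illustration. The `mismatched` layout is one hypothetical way the scales can end up misaligned with the blocks.

```python
import math

def rotate_and_scale(x, angles, scales_per_dim):
    """Rotate each adjacent pair (x[2i], x[2i+1]) by angles[i],
    then multiply every output dimension by its entry in scales_per_dim."""
    out = []
    for i, theta in enumerate(angles):
        a, b = x[2 * i], x[2 * i + 1]
        c, s = math.cos(theta), math.sin(theta)
        out.append((a * c - b * s) * scales_per_dim[2 * i])
        out.append((a * s + b * c) * scales_per_dim[2 * i + 1])
    return out

def norm(v):
    return math.sqrt(sum(t * t for t in v))

# One scale per 2x2 block (illustrative values, not taken from the PR).
block_scales = [0.5, 2.0]

# Block-aligned expansion: both dimensions of block i get block_scales[i].
per_block = [s for s in block_scales for _ in range(2)]  # [0.5, 0.5, 2.0, 2.0]

# Hypothetical misaligned expansion: the two dimensions of a block
# receive scales that belong to different blocks.
mismatched = block_scales + block_scales                 # [0.5, 2.0, 0.5, 2.0]

x = [1.0, 0.0, 0.0, 0.0]  # unit vector living entirely in block 0

for theta in (0.0, math.pi / 4, math.pi / 2):
    angles = [theta, theta]
    good = norm(rotate_and_scale(x, angles, per_block))
    bad = norm(rotate_and_scale(x, angles, mismatched))
    print(f"theta={theta:.3f}  per_block={good:.4f}  mismatched={bad:.4f}")
```

With the block-aligned layout the norm stays at 0.5 for every rotation angle, because a rotation followed by a uniform scale only changes the length by that scale. With the misaligned layout the norm depends on the angle and moves between 0.5 and 2.0, which is the kind of discrepancy the plot makes visible.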

@lucidrains (Owner)

@TimFelixBeyer oh shoot! yes you are correct, thank you

@lucidrains lucidrains merged commit 744432b into lucidrains:main Sep 6, 2024