Primer
Is &[u8] a byte slice or an 8-bit unsigned integer slice? What about Vec<u8>?
In Rust/ink!, there isn't any semantic difference.
This can be seen in various <from/to/into/as>_bytes conversions (including from the standard library) that either take or return some u8 sequence/collection (e.g. String::as_bytes).
NOTE: Newtype wrappers are commonly used to add higher-level semantics to sequences/collections of bytes/u8s, but fundamentally, these are zero-cost abstractions that are cheap/free to convert back into the underlying sequences/collections (e.g. struct Bytes(pub Vec<u8>)).
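To make the zero-cost claim concrete, here is a minimal sketch of such a newtype. The `round_trip` helper is a hypothetical name introduced only for illustration; the conversions move the existing buffer rather than copying it.

```rust
/// Newtype that tags a `Vec<u8>` as "bytes" without changing its layout.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Bytes(pub Vec<u8>);

impl From<Vec<u8>> for Bytes {
    fn from(inner: Vec<u8>) -> Self {
        // Moves the vector into the wrapper; no allocation or copy.
        Bytes(inner)
    }
}

impl From<Bytes> for Vec<u8> {
    fn from(bytes: Bytes) -> Self {
        // Moves the vector back out of the wrapper.
        bytes.0
    }
}

/// Hypothetical helper: sends a string's bytes through the wrapper and back.
pub fn round_trip(s: String) -> Vec<u8> {
    let wrapped = Bytes::from(s.into_bytes());
    wrapped.into()
}
```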
However, unlike Rust/ink!, Solidity has primitives for both 8-bit unsigned integers (i.e. uint8), and bytes sequences (i.e. bytes and bytes1, bytes2 ... bytes32).
As such, there are meaningful differences between how bytes (and bytes<N>) sequences and uint8 sequences (i.e. fixed-size and dynamic uint8 arrays) are represented/encoded, both in calldata (i.e. Solidity ABI encoding) and in memory.
One meaningful difference for us is that in Solidity ABI encoding, bytes (i.e. dynamic) and bytes<N> (i.e. fixed-sized) arrays are packed, while uint8[] (i.e. dynamic) and uint8[N] (i.e. fixed-sized) are not.
As a concrete example, bytes32 is encoded/packed into a 32 byte sequence (i.e. a single word) in Solidity calldata (and memory), while uint8[32] is encoded into a 1024 byte sequence (i.e. 32 words, with 32 bytes used for each element).
NOTE: abi.encodePacked() is not part of the Solidity ABI spec, so it's not a transparent optimization (i.e. interacting contracts have to be aware of its usage).
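The size difference between bytes32 and uint8[32] can be reproduced with a hand-rolled encoder. This is only an illustrative sketch of the standard ABI layout for these two static types, not ink!'s actual codec; the function names are hypothetical.

```rust
/// Size of one Solidity ABI word in bytes.
const WORD: usize = 32;

/// Encodes a `bytes32` value: the 32 bytes fill exactly one word.
pub fn encode_bytes32(value: &[u8; 32]) -> Vec<u8> {
    value.to_vec()
}

/// Encodes a `uint8[32]` value: each element is left-padded to a full word.
pub fn encode_uint8_array_32(value: &[u8; 32]) -> Vec<u8> {
    let mut out = Vec::with_capacity(WORD * value.len());
    for &elem in value {
        let mut word = [0u8; WORD];
        // Integers are big-endian, so the value sits in the last byte.
        word[WORD - 1] = elem;
        out.extend_from_slice(&word);
    }
    out
}
```

For the same 32 input bytes, the first function yields 32 bytes and the second yields 32 * 32 = 1024 bytes.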
Takeaways
- For Rust/ink!, byte and 8-bit unsigned integer sequences are semantically equivalent
- Representation/encoding differences only matter at the Solidity ABI/interoperability boundary
Goals
- Because byte and 8-bit unsigned integer sequences are semantically equivalent in Rust/ink!, any abstractions over them should be zero-cost (i.e. it should be cheap/free to convert between any two semantically equivalent representations)
- It should be possible for ink! smart contract authors to keep these abstractions at the interface boundary (i.e. only deal with them close to the signature of an ink! message and mostly ignore them in the body)
Design
The default mappings will be as follows:
- u8 is mapped to uint8
- u8 sequences/collections are mapped to equivalent Solidity fixed-size arrays (i.e. uint8[N] where N is the array size) and dynamic arrays (i.e. uint8[])
We then introduce a newtype wrapper AsBytes<T> that:
- Can only be applied to u8 sequences/collections with equivalent Solidity byte types (enforced by a sealed trait bound on T)
- Encapsulates logic for encoding/decoding u8 sequences/collections as their Solidity bytes types (e.g. AsBytes<[u8; 32]> is mapped to bytes32)
- Implements core/standard traits for cheaply/freely using/passing the wrapper type in place of the underlying type (e.g. Deref, Borrow, AsRef, etc.)
This roughly translates to:
ink! <=> Solidity
AsBytes<[u8; 1]> == bytes1
AsBytes<[u8; 2]> == bytes2
...
AsBytes<[u8; 32]> == bytes32
AsBytes<Vec<u8>> == bytes

This allows ink! developers to largely keep the AsBytes<T> wrapper at the interface boundary (e.g. an ink! message with AsBytes<[u8; 32]> input and output types can immediately deref the input to [u8; 32] and deal with only the underlying type in the function body, and then cheaply wrap it back up to AsBytes<[u8; 32]> at the return place).
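A rough sketch of how such a wrapper could look is shown below. The trait name `ByteLike`, the `sealed` module, and the `flip` function are assumptions made for illustration, and only [u8; 32] and Vec<u8> are covered here; a full implementation would presumably cover every array size from 1 to 32 and include the encoding/decoding logic, which is omitted.

```rust
use core::ops::Deref;

mod sealed {
    /// Private marker so downstream crates cannot add new implementors.
    pub trait Sealed {}
    impl Sealed for Vec<u8> {}
    impl Sealed for [u8; 32] {}
}

/// Hypothetical bound: `u8` sequences with an equivalent Solidity bytes type.
pub trait ByteLike: sealed::Sealed {}
impl ByteLike for Vec<u8> {}
impl ByteLike for [u8; 32] {}

/// Sketch of the wrapper (assumed shape, not the final ink! type).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct AsBytes<T: ByteLike>(pub T);

impl<T: ByteLike> Deref for AsBytes<T> {
    type Target = T;
    fn deref(&self) -> &T {
        &self.0
    }
}

impl<T: ByteLike> AsRef<T> for AsBytes<T> {
    fn as_ref(&self) -> &T {
        &self.0
    }
}

/// Stand-in for an ink! message: the wrapper appears only in the signature.
pub fn flip(input: AsBytes<[u8; 32]>) -> AsBytes<[u8; 32]> {
    // Deref to the underlying array and work with plain `[u8; 32]`.
    let mut raw: [u8; 32] = *input;
    raw.reverse();
    // Wrap again at the return place.
    AsBytes(raw)
}
```

Because `AsBytes<T>` is a single-field tuple struct, wrapping and unwrapping are plain moves, which is what keeps the abstraction cheap.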
Follow ups/Updates
Alternatives
The following alternatives were considered and rejected/abandoned:
- Introducing a Byte type as a semantically distinct u8 equivalent was abandoned because conversions from Byte sequences/collections to u8 equivalents would be expensive
- Mapping u8 to uint8, and u8 sequences/collections to bytes and bytesN equivalents by default, and then providing two wrappers (e.g. AsBytes<T> and AsInt<T>) to override the preferred representation/encoding was abandoned due to:
  - Perceived higher user-level cognitive load (e.g. u8 would map to uint8 by default, while [u8; N] for 1 <= N <= 32 would map to bytes<N> by default, and [u8; N] for N > 32 would always map to uint8[N])
  - Generic implementation complexity due to Rust's limited support for specialization