3

I'm looking for a way to take a large object and break it into smaller mutable child objects, which can be processed in parallel.

Something like:

struct PixelBuffer { data:Vec<u32>, width:u32, height:u32 } struct PixelBlock { data:Vec<u32> } impl PixelBuffer { fn decompose(&'a mut self) -> Vec<Guard<'a, PixelBlock>>> { ... } } 

Where the resulting PixelBlock's can be processed in parallel, and the parent PixelBuffer will remain locked until all Guard<PixelBlock> are dropped.

This is effectively mutable pointer aliasing; the large data block in PixelBuffer will be directly modified via each PixelBlock.

However, each PixelBlock is non-overlapping segment from the internal data in PixelBuffer.

You can certainly do this in unsafe code (internal buffer is a raw pointer; generate a new external pointer for each PixelBlock); but is it possible to achieve the same result using safe code?

(NB. I'm open to using a data block allocated from libc::malloc if that'll help?)

5
  • 1
    If the memory is non-overlapping, is it really aliasing? As I understand it, aliasing means two different names referring to the same thing, like two variables pointing to the same memory. Commented Oct 21, 2015 at 3:21
  • @Shepmaster I'm not sure. Technically there's a pointer to the start of the data block, and a pointer to the sub-section of it. That's ... technically... aliasing? Maybe? Commented Oct 21, 2015 at 4:07
  • For example, if I have a Vec<Foo>, it's aliasing if I write to both the first and 3rd items in it at the same time, even though they don't overlap in memory, right? Commented Oct 21, 2015 at 4:15
  • 1
    No, it's definitely not aliasing. As I said in my answer, the mutable iterators already do all of this. Commented Oct 21, 2015 at 5:00
  • Wikipedia says: "modifying the data through one name implicitly modifies the values associated with all aliased names" Commented Oct 21, 2015 at 5:00

2 Answers 2

3

This works fine and is a natural consequence of how, e.g., iterators work: the next method hands out a sequence of values that are not lifetime-connected to the reference they come from, i.e. fn next(&mut self) -> Option<Self::Item>. This automatically means that any iterator that yields &mut pointers (like, slice.iter_mut()) is yielding pointers to non-overlapping memory, because anything else would be incorrect.

One way to use this in parallel is something like my simple_parallel library, e.g. Pool::for_.

(You'll need to give more details about the internals of PixelBuffer to be more specific about how to do it in this case.)

Sign up to request clarification or add additional context in comments.

3 Comments

Technically though, how would you safely hand the child items out though? For example, take the trivial example of the internal data being a Vec<u8>; you can't safely take two mutable slices from this right? The reason Vec can do it internally is that it uses an unsafe pointer internally, as I understand it?
You just need to do something that allows the compiler to understand that the pointers are disjoint. At the lowest level this involves unsafe, but you can also build on existing functionality, like split_at_mut and iterators over mutable references.
@Doug it's exposed through split_at_mut in the library. Sure, it's not implemented with language primitives (instead with a block of unsafe marked code), but theoretically it could be.
3

There is no way to completely avoid unsafe Rust, because the compiler cannot currently evaluate the safety of sub-slices. However, the standard library contains code that provides a safe wrapper that you can use.

Read up on std::slice::Chunks and std::slice::ChunksMut.

Sample code: https://play.rust-lang.org/?gist=ceec5be3e1530c0a6d3b&version=stable

However, your next problem is sending the slices to separate threads, because the best way to do that would be thread::scoped, which is currently deprecated due to some safety problems that were discovered this year...

Also, keep in mind that Vec<_> owns its contents, whereas slices are just a view. Generally, you want to write most functions in terms of slices, and keep only one "Vec" to hold the data.

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.