
I have multiple .tif files ranging from 500 MB to 5 GB. I need to convert them to zarr arrays and preferably write them to disk. I have an AWS EC2 Linux instance with 32 GB of RAM. I have searched a lot online but haven't found any way to do this. I looked into a library called pyvips but wasn't able to use it to convert an image to a zarr array. I also thought file = tifffile.imread(path_to_tif, aszarr=True) would do the trick, but that didn't work. Any help is appreciated!

  • Maybe you can split the processing into two steps: read a fixed-size buffer from the input file (into a NumPy array), then write it to the zarr array (see the sketch after these comments). zarr.readthedocs.io/en/stable/… Commented Apr 15, 2024 at 9:35
  • I'm sorry, I don't quite understand what you're trying to say. If I read the image in "buffers", say using pyvips, having to convert to NumPy arrays is what takes up all the RAM... even if I do it in stages, it ultimately adds up. Commented Apr 15, 2024 at 9:59
  • How did these files come about to begin with? Commented Apr 16, 2024 at 0:17
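
A minimal sketch of the buffered approach suggested in the first comment, assuming a plain 2-D, single-page TIFF (the filenames, chunk size, and slab height below are placeholders). tifffile can expose the TIFF as a read-only zarr store, so pixel data is only read from disk when a slab is sliced:

```python
import tifffile
import zarr

# Expose the TIFF as a read-only zarr store; no pixel data is read yet.
store = tifffile.imread("input.tif", aszarr=True)
src = zarr.open(store, mode="r")

# Create the on-disk destination with matching shape and dtype.
dst = zarr.open(
    "output.zarr", mode="w",
    shape=src.shape, dtype=src.dtype, chunks=(2048, 2048),
)

# Copy one slab of rows at a time so peak RAM stays bounded
# by the slab size, not the full image.
step = 2048
for y in range(0, src.shape[0], step):
    dst[y:y + step] = src[y:y + step]

store.close()
```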

1 Answer


I found a solution, which I'm linking here in case it helps others: https://gist.github.com/GenevieveBuckley/d94351adcc61cb5237a6c0a540c14cf6
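
For reference, a rough sketch of the kind of pipeline that gist describes, assuming dask is installed and the TIFF holds a single non-pyramidal array (filenames and chunk sizes are placeholders): wrap the zarr store that tifffile exposes in a dask array, then let dask stream it to a zarr directory on disk chunk by chunk.

```python
import dask.array as da
import tifffile
import zarr

# Expose the TIFF as a read-only zarr store (no data is loaded yet).
store = tifffile.imread("input.tif", aszarr=True)
src = zarr.open(store, mode="r")

# Wrap it in a dask array and pick chunks that fit comfortably in RAM.
arr = da.from_zarr(src).rechunk((2048, 2048))

# Stream the array to a zarr directory on disk, chunk by chunk.
arr.to_zarr("output.zarr")

store.close()
```

Because dask writes one chunk at a time, peak memory is governed by the chunk size rather than the full image, which is what keeps even a 5 GB TIFF well within a 32 GB instance.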
