When copying large files (1-2 GB per file) between file systems, file fragmentation can happen if the destination file system is nearly full.
Our C++ application code uses fallocate() to pre-allocate space when creating and writing data files, but I'm wondering how the Linux copy command /bin/cp handles that.
Does cp just copy bytes or chunks of data in a loop (and let the file system deal with it)? Or does cp first call fallocate() or posix_fallocate() with the size of the source file?
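To make the second alternative concrete, here is a minimal sketch of a copy that reserves the destination's space before writing. It is not how cp is implemented; copy_with_prealloc is a hypothetical helper name, and the 1 MiB buffer size is an arbitrary assumption.

```cpp
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

#include <cerrno>
#include <vector>

// Hypothetical helper (not part of cp or coreutils): copies src to dst and
// reserves the destination's blocks up front with posix_fallocate().
// Returns 0 on success or an errno-style error number on failure.
int copy_with_prealloc(const char* src, const char* dst) {
    int in = open(src, O_RDONLY);
    if (in < 0) return errno;

    struct stat st{};
    if (fstat(in, &st) != 0) { int e = errno; close(in); return e; }

    int out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (out < 0) { int e = errno; close(in); return e; }

    int err = 0;
    // posix_fallocate() rejects a zero length, so skip it for empty files.
    // It returns the error number directly instead of setting errno.
    if (st.st_size > 0) err = posix_fallocate(out, 0, st.st_size);

    std::vector<char> buf(1 << 20);  // assumed chunk size: 1 MiB
    off_t total = 0;
    while (err == 0) {
        ssize_t n = read(in, buf.data(), buf.size());
        if (n == 0) break;  // end of source file
        if (n < 0) { if (errno == EINTR) continue; err = errno; break; }
        for (ssize_t off = 0; off < n; ) {
            ssize_t w = write(out, buf.data() + off, n - off);
            if (w < 0) { if (errno == EINTR) continue; err = errno; break; }
            off += w;
            total += w;
        }
    }

    // Trim the reservation if the source turned out shorter than st_size.
    if (err == 0 && ftruncate(out, total) != 0) err = errno;

    close(in);
    if (close(out) != 0 && err == 0) err = errno;
    return err;
}
```

Whether the reserved space ends up contiguous is still up to the file system; the call only gives the allocator the full size in one request.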
I haven't found anything on this subject searching the internet.
The filesystem could be ext3, ext4, or xfs.
CentOS 8.1, kernel 4.18.0-147.el8.x86_64 #1 SMP
EDIT I
As background, the actual application reads a constant bit rate network stream and pre-allocates a file for N seconds of content. If the actual bitrate is higher, the file naturally grows. ftruncate() is called when the file is closed, which handles the case where the actual bitrate is lower. cp is only used to move those files between filesystems, hence my question.
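Roughly, the pattern described above looks like the sketch below. The function names (expected_bytes, open_preallocated, finish_recording) are made up for illustration, and posix_fallocate() stands in here for the fallocate() call the application actually uses.

```cpp
#include <fcntl.h>
#include <unistd.h>

#include <cerrno>
#include <cstdint>

// Bytes needed for `seconds` of a constant-bitrate stream.
constexpr std::uint64_t expected_bytes(std::uint64_t bits_per_second,
                                       std::uint64_t seconds) {
    return bits_per_second * seconds / 8;
}

// Hypothetical helper: creates the file and reserves `bytes` of space.
// Returns an open descriptor, or -1 with errno set on failure.
int open_preallocated(const char* path, std::uint64_t bytes) {
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) return -1;
    if (bytes > 0) {
        // posix_fallocate() returns the error number instead of setting errno.
        int err = posix_fallocate(fd, 0, static_cast<off_t>(bytes));
        if (err != 0) { close(fd); errno = err; return -1; }
    }
    return fd;
}

// Hypothetical helper: trims the file to what was actually written, then
// closes it. If more than the reservation was written, the file has already
// grown and the truncate leaves it unchanged.
int finish_recording(int fd, std::uint64_t bytes_written) {
    int rc = ftruncate(fd, static_cast<off_t>(bytes_written));
    if (close(fd) != 0) rc = -1;
    return rc;
}
```

For example, 2 seconds at 8 Mbit/s would reserve expected_bytes(8000000, 2) bytes, i.e. 2,000,000 bytes, before the first write.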
The reason for pre-allocating is to avoid fragmentation. Without fallocate(), the file system becomes increasingly fragmented over time. (Of course, fallocate() doesn't completely prevent the problem, but it certainly mitigates it.)
According to "Uninitialized blocks and unexpected flags", fallocate() results in "efficient" allocation of contiguous blocks (for most filesystems):
The fallocate() system call is meant to be a way for an application to request the efficient allocation of blocks for a file. Use of fallocate() allows a process to verify that the required disk space is available, helps the filesystem to allocate all of the space in a single, contiguous group, and avoids the overhead that block-by-block allocation would incur.
So I was wondering whether copying a large, heavily fragmented file ends up contiguous or fragmented at the destination. Since cp doesn't use fallocate() to pre-allocate space, the answer appears to be "possibly fragmented".
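One way to check the result after a copy is to count the destination file's extents. The sketch below uses the Linux FIEMAP ioctl, which as far as I know is also what the filefrag tool relies on. count_extents is a hypothetical helper name. It assumes the kernel headers are installed, and not every file system supports FIEMAP.

```cpp
#include <fcntl.h>
#include <linux/fiemap.h>
#include <linux/fs.h>
#include <sys/ioctl.h>
#include <unistd.h>

#include <cstring>

// Hypothetical helper: returns the number of extents backing `path`,
// or -1 if the file cannot be opened or the file system lacks FIEMAP support.
long count_extents(const char* path) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) return -1;

    struct fiemap fm;
    std::memset(&fm, 0, sizeof(fm));
    fm.fm_start = 0;
    fm.fm_length = FIEMAP_MAX_OFFSET;  // map the whole file
    fm.fm_flags = FIEMAP_FLAG_SYNC;    // flush pending writes before mapping
    fm.fm_extent_count = 0;            // 0 = only report how many extents exist

    long result = -1;
    if (ioctl(fd, FS_IOC_FIEMAP, &fm) == 0) {
        result = static_cast<long>(fm.fm_mapped_extents);
    }
    close(fd);
    return result;
}
```

A low extent count on a 1-2 GB file suggests the copy landed mostly contiguously, while a large count points to fragmentation at the destination.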
cp doesn't do preallocation, it doesn't have to deal with that situation - it just copies from the input file until there's no more data.
cp has been answered and I have offered you a working alternative with dd. You didn't react to any of the two answers, and only added the comment above. So what else are you expecting from us exactly?