TL;DR: What tool should I use instead of dd? Use cp, if your GNU coreutils is recent enough.
Actual file systems to the rescue
This is neither file-system- nor OS-independent, at all! How blocks are allocated to files is a detail of each individual file system, and file systems differ in exactly that. So, strictly speaking, no answer can be independent of the file system.
Ext4's delayed allocation to the rescue
Ext4 supports allocate-on-flush: blocks are allocated to files when the data is flushed to disk, not earlier. This means that on a system that uses file system buffers (and a modern Fedora most definitely does), files that are being sequentially extended get their blocks allocated in large chunks, so fragmentation stays very low. Once this is active, you don't have to do anything special afterwards — simply use cp or dd (I'd prefer cp, since that solves not only your first question but also your second!).
In ext4, that's called delayed allocation; it corresponds to the delalloc mount option, which current kernels enable by default (see man ext4).
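A minimal sketch of checking and setting this, assuming an ext4 file system mounted at /mnt/data (a hypothetical mount point):

```shell
# Show the active mount options of the file system
# (hypothetical mount point /mnt/data):
findmnt -no OPTIONS /mnt/data

# delalloc is the default; if it was disabled with nodelalloc,
# re-enable it with a remount:
sudo mount -o remount,delalloc /mnt/data

# Or make it explicit in /etc/fstab:
# UUID=...  /mnt/data  ext4  defaults,delalloc  0 2
```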
Using cp will (on a modern Fedora) result in a copy_file_range call, which the file system can then turn into a contiguous allocation when the data is flushed to disk.
XFS to the rescue
XFS does delayed allocation by default; see ext4.
same file system: reflink
When you use cp from Fedora >=34, or any GNU coreutils >=9.0 cp, it supports reflinks: instead of copying the data, it simply marks the blocks as used by both files, and an actual copy is only made when one of them changes. That's a pretty nice feature, but of course it only works if the source and target file are on the same file system (and that file system supports reflinks).
The effect is that the target file is exactly as (un-)fragmented as the source file, because it is literally the same blocks.
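A quick sketch of this on the command line (file names are made up; --reflink=always fails loudly if the file system can't reflink, while --reflink=auto falls back to a normal copy):

```shell
# Create a test file (hypothetical name):
dd if=/dev/urandom of=source.img bs=1M count=4 status=none

# Reflink copy: shares the source's blocks, fails if unsupported:
cp --reflink=always source.img clone.img || echo "no reflink support here"

# Fall back to a plain copy where reflinks aren't available:
cp --reflink=auto source.img copy.img

# The content is identical either way:
cmp source.img copy.img && echo "identical"
```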
different file system: XFS allocation groups
XFS doesn't manage free and used space as one unit for the whole file system; instead, it has multiple allocation groups. To cite man xfs:
The data section contains all the filesystem metadata (inodes, directories, indirect blocks) as well as the user file data for ordinary (non-realtime) files and the log area if the log is internal to the data section. The data section is divided into a number of allocation groups. The number and size of the allocation groups are chosen by mkfs.xfs(8) so that there is normally a small number of equal-sized groups. The number of allocation groups controls the amount of parallelism available in file and block allocation. It should be increased from the default if there is sufficient memory and a lot of allocation activity. The number of allocation groups should not be set very high, since this can cause large amounts of CPU time to be used by the filesystem, especially when the filesystem is nearly full. More allocation groups are added (of the original size) when xfs_growfs(8) is run.
So, to mitigate the "concurrent allocation leads to fragmentation" problem, you simply need enough allocation groups! A reasonably sized file system will already have a couple, but you can increase the count beyond the default (I use 5 on a 4-device striped LVM volume, and that performs well). Note, however, that to increase the number of allocation groups afterwards, you'd need to reformat or grow the file system onto more storage.
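As a sketch, the allocation group count is set at mkfs time (the device name below is a placeholder, and mkfs.xfs is destructive, so don't run it on a disk holding data):

```shell
# Inspect the allocation group count of a mounted XFS file system:
xfs_info /mount/point | grep agcount

# Create a new XFS file system with 8 allocation groups
# (DESTROYS all data on the device; /dev/sdX is a placeholder):
mkfs.xfs -d agcount=8 /dev/sdX

# Growing the file system later adds more groups of the original size:
xfs_growfs /mount/point
```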
Tools to the rescue
dd isn't the tool of choice for overwriting a file with another: cp is. Both its current version (9.0) and the version shipped with your Fedora (8.32 with Fedora patches) use copy_file_range, which tells the underlying file system how much data will be copied in the end — so the allocation can happen in one contiguous chunk.
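A minimal before/after sketch (file names are made up):

```shell
# Create some test data (hypothetical names):
dd if=/dev/urandom of=source.img bs=1M count=8 status=none

# Instead of: dd if=source.img of=target.img bs=4M
# simply do:
cp source.img target.img

# The result is byte-identical, and cp lets the file system
# allocate the target contiguously (copy_file_range/reflinks):
cmp source.img target.img && echo "identical"
```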