Recently, in a job interview, I was asked: "How do you create a zero-size file [I think that means an empty file] in all the folders of the file system?" I found the question a bit strange. I thought of looping over all the directories and using touch, or maybe going to the root directory and using touch with a recursive option. Do you have any ideas?
- Your ideas look good and are a good starting point! – vanadium, Apr 16, 2021 at 17:46
- That question is strange in that it's asking for something completely useless. That said, it's not at all strange for an interview question in that it's asking for something completely useless. – ilkkachu, Apr 17, 2021 at 15:58
- There are certainly more useful cases for this than @ilkkachu thinks. For instance, a git template directory structure. Git just handles files, not directories, which just come as kind of a "side-effect" of file paths. Therefore, having at least one file in each directory is a must to have otherwise empty directories make it into a git repo. There are certainly more real-life use cases for such a requirement. – Alain BECKER, Apr 18, 2021 at 19:50
4 Answers
That could be ...

    find . -type d -exec touch {}/emptyfile \;

- -type d means "directories"
- -exec touch executes the command touch and creates a file named "emptyfile"
- the {} substitutes what is found as a result from find. The / is to make it a valid path+filename, and the escaped ; is to close the command (otherwise it becomes "emptyfile;")
result ...

    rinzwind@schijfwereld:~/t$ mkdir 1 2 3 4 5 6 7 8 9 10
    rinzwind@schijfwereld:~/t$ ls -ltr */*
    ls: cannot access '*/*': No such file or directory
    rinzwind@schijfwereld:~/t$ find . -type d -exec touch {}/emptyfile \;
    rinzwind@schijfwereld:~/t$ ls -ltr */*
    -rw-rw-r-- 1 rinzwind rinzwind 0 apr 16 19:57 5/emptyfile
    -rw-rw-r-- 1 rinzwind rinzwind 0 apr 16 19:57 9/emptyfile
    -rw-rw-r-- 1 rinzwind rinzwind 0 apr 16 19:57 1/emptyfile
    -rw-rw-r-- 1 rinzwind rinzwind 0 apr 16 19:57 8/emptyfile
    -rw-rw-r-- 1 rinzwind rinzwind 0 apr 16 19:57 3/emptyfile
    -rw-rw-r-- 1 rinzwind rinzwind 0 apr 16 19:57 2/emptyfile
    -rw-rw-r-- 1 rinzwind rinzwind 0 apr 16 19:57 10/emptyfile
    -rw-rw-r-- 1 rinzwind rinzwind 0 apr 16 19:57 7/emptyfile
    -rw-rw-r-- 1 rinzwind rinzwind 0 apr 16 19:57 6/emptyfile
    -rw-rw-r-- 1 rinzwind rinzwind 0 apr 16 19:57 4/emptyfile

Mine works but the answer from Peter Cordes is better :)
- I was curious if we can speed this up with -exec ... + in GNU find. No, {}/emptyfile isn't supported with +, only ;. So you'd want find -print0 | xargs -0 to avoid starting a touch process for every directory in the FS (which is presumably a lot, and if you're lucky some of the IO will be contiguous and might not be a bottleneck.) – Peter Cordes, Apr 17, 2021 at 20:49
- (I posted an answer with find -printf | xargs -0 that only has to run touch once per ARG_MAX files created.) – Peter Cordes, Apr 18, 2021 at 8:17
- Awesome answer! Would you mind explaining what you mean by "the escaped ; is to close the command"? What if you leave the semicolon out? – The Unknown Dev, Apr 18, 2021 at 14:19
- @TheUnknownDev: Then find would complain that there was no end to the ;-terminated or +-terminated series of args that follow the -exec. Try find -exec echo '{}' in a test directory. You need to tell it whether to batch args onto one command, or run the command separately for each file, by using + or \;. – Peter Cordes, Apr 18, 2021 at 21:26
- @PeterCordes thanks :D Oh and good one on your answer. I will -not- alter mine as that would be me changing my answer to yours ;-) I assume TS will remove the accepted answer and accept yours. – Rinzwind, Apr 19, 2021 at 9:24
To do this efficiently, you want to avoid spawning a new touch process for every file you want to create.
This is part of what xargs is good for: batching args into chunks as large as possible while keeping each chunk small enough to fit on the command line of a single process. (Modern Linux has pretty huge limits, so xargs rarely has to run your command more than once to handle all the args, but it also lets you avoid having filenames subjected to shell word-splitting like in foo $(bar).)
We can use find's own -printf to format additional stuff onto each directory path, specifically the filename we want. The xargs -0 uses a '\0' byte as a separator so this is safe with arbitrary filenames, even including newline. If you weren't using a custom printf format, you could just use -print0 to print 0-separated paths.
    find . -xdev -type d -printf '%p/empty\0' | xargs -0 echo touch

(In a test directory, that prints touch ./empty ./2/empty ./1/empty, not touch ./empty, touch ./1/empty, etc., so it's running one touch for multiple files.)
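To make the batching visible, here is a minimal sketch that counts how many processes get started by each approach. It is an illustration only: the scratch directory, the directory names, and the count_args stand-in (used instead of touch so that each started process prints exactly one line) are assumptions, and -printf requires GNU find.

```bash
# Scratch tree so nothing outside it is touched (names are hypothetical).
demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT
mkdir -p "$demo/1" "$demo/2/sub" "$demo/with space"

# Stand-in for touch: each started process prints one line with its arg count.
count_args='echo "started with $# args"'

# One process per directory, as with find -exec ... \;
per_dir=$(find "$demo" -type d -exec sh -c "$count_args" sh {} \; | wc -l)

# All paths handed over in one batch, as with find -printf | xargs -0
batched=$(find "$demo" -type d -printf '%p/empty\0' \
    | xargs -0 sh -c "$count_args" sh | wc -l)

echo "per-directory runs: $per_dir, batched runs: $batched"
```

With only a handful of directories the difference is invisible in wall-clock time; the point is the number of process startups, which grows with the directory count in the first form but not in the second.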
mktemp only accepts a single template, but if we want some randomness in the naming to reduce the chance of just touching an existing file by accident, you could do this.
    find . -xdev -type d -printf "%p/empty.$RANDOM\0" | xargs -0 echo touch

Note that it's the same 15-bit random number in every directory, because "$RANDOM" is expanded once by bash before find starts. You could use $(date +%s).$RANDOM or whatever you want as part of the filename.
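The single expansion of $RANDOM can be checked in a scratch directory. This is a small sketch under assumptions: the directory names are made up, echo is dropped so that files are really created, and $RANDOM itself is a bash feature (other shells may expand it to nothing).

```bash
# Scratch tree with a few hypothetical directories.
demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT
mkdir -p "$demo/a" "$demo/b" "$demo/c"

# The shell expands $RANDOM while building the -printf argument,
# so find receives a fixed string and reuses it for every directory.
find "$demo" -type d -printf "%p/empty.$RANDOM\0" | xargs -0 touch

# Compare the number of files with the number of distinct base names.
total=$(find "$demo" -type f -name 'empty.*' | wc -l)
distinct=$(find "$demo" -type f -name 'empty.*' -printf '%f\n' | sort -u | wc -l)
echo "$total files, $distinct distinct name(s)"
```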
With an SSD or tmpfs, CPU might be the bottleneck here. Or if you're lucky and metadata I/O on a magnetic disk happens to be mostly contiguous because you're touching every directory (and allocating a bunch of new inodes), even a rotational disk could maybe keep up somewhat decently. Although you're probably not touching directories in the order they're laid out on disk.
And regardless, there's no need to waste lots of CPU time starting processes for something that should be I/O limited.
Ways that don't work:
- find -exec touch {} + batches args, but -exec touch {}/empty + refuses to work when {} isn't by itself.
- xargs -I {} echo touch {}/emptyfile implies -L 1 (only process one "line" of input for each invocation of the command, whether that's an actual line or a 0-separated string with xargs -0). So we can't use xargs to modify each arg if we want to take advantage of it for batching args.
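One possible way around the first limitation, which is not part of the answer above and is offered only as a sketch, is to keep {} by itself before + and let a small inline sh script append the file name to each path. The scratch directory and the file name empty are assumptions for illustration.

```bash
# Scratch tree with a few hypothetical directories.
demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT
mkdir -p "$demo/1" "$demo/2/sub" "$demo/with space"

# {} stands alone before +, so find can batch the directory paths.
# The inline script turns each path D into D/empty, then runs touch once.
find "$demo" -type d -exec sh -c '
    for d do
        set -- "$@" "$d/empty"   # append the new file path
        shift                    # drop the directory path it replaces
    done
    touch -- "$@"
' sh {} +

created=$(find "$demo" -type f -name empty | wc -l)
echo "created $created files"
```

This starts one sh and one touch per batch rather than one touch per directory, and it avoids relying on GNU find's -printf.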
    find /mountpoint -xdev -type d -exec mktemp -p {} \;

A quite obvious aspect is you may or may not need root access to actually create files under /mountpoint.

There are two non-obvious aspects:

- You said "in all the folders of the file system", so we start from a specific /mountpoint and do not enter other filesystems (-xdev).

  If there are other filesystems mounted deeper in the tree, e.g. in /mountpoint/foo/another/mntpoint, then -xdev will prevent us from entering them. Still, these filesystems may mask entire subtrees that belong to the filesystem in question. In the best case a filesystem mounted in /mountpoint/foo/another/mntpoint masks an empty mntpoint directory of the filesystem in question. So we cannot easily reach "all the folders of the file system".

  With root access we can mount --bind /mountpoint /somewhere/else beforehand. With --bind (as opposed to --rbind, see man 8 mount) mntpoint deep in /somewhere/else will not replicate the submount from /mountpoint/foo/another/mntpoint. This way we can access mntpoint that belongs to the filesystem in question.

  This is still not enough. If the filesystem in question is Btrfs then possibly /mountpoint gives access to some subvolume but not to the entire filesystem (compare this question).

  In general a subtree of any(?) mounted filesystem can be bind-mounted to another directory. After you unmount the original mountpoint, the other directory gives access to a fragment of the filesystem. Our /mountpoint may be "the other directory" in the first place and therefore it may not give access to the entire filesystem. You don't know this in advance.

  The conclusion is: if the phrase is strictly "all the folders of the file system" (as opposed to "all subdirectories" which is quite straightforward) then you need to make sure you don't miss any part of the filesystem. Only then use the find … command given at the beginning of this answer.

- Solutions with touch emptyfile or so do not necessarily "create a zero size file". What if the interviewer has already created a non-empty emptyfile in one of the directories? A trap! If a non-empty emptyfile exists then touch will neither create it, nor will the file be empty. Effectively you will strictly fail to "create a zero size file" in the directory with the trap. This is the reason I used mktemp. The tool will try hard to really create a new empty regular file.
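The "trap" from the second point can be reproduced in a scratch directory. This is a minimal sketch: the directory and the pre-existing file content are made up for illustration, and mktemp -p is the GNU coreutils option used in the command above.

```bash
# Scratch directory standing in for one directory of the filesystem.
demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT

# The trap: a non-empty file already sits where touch would create one.
printf 'not empty\n' > "$demo/emptyfile"

# touch only updates timestamps; the existing content stays.
touch "$demo/emptyfile"
size_after_touch=$(wc -c < "$demo/emptyfile")

# mktemp picks an unused name and creates a new, empty regular file.
new_file=$(mktemp -p "$demo")
size_new=$(wc -c < "$new_file")

echo "after touch: $size_after_touch bytes; new file $new_file: $size_new bytes"
```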
This will create a uniquely named empty file in each directory, starting with and descending from the current directory, including hidden directories if you want.
First, get the directories list with tree.
Then, pass them to xargs like so:
    tree --noreport -dfi | xargs -L 1 -I {} echo touch {}/emptyfile_"$(date +%s)"

Or, to a while loop like so:

    tree --noreport -dfi | \
    while read -r d; do echo touch "$d"/emptyfile_"$(date +%s)"; done

Or, even to a for loop (if the directory names contain no spaces) like so:

    for d in $(tree --noreport -dfi); do echo touch "$d"/emptyfile_"$(date +%s)"; done

- echo is there to prevent unintentional creation of files while testing. When satisfied with the output, remove echo to create files.
- --noreport omits printing of the file and directory report at the end of the tree listing.
- -dfi lists directories only, prints the full path prefix for each directory and makes tree not print the indentation lines. Use -dfia instead of -dfi to include hidden directories as well.
- "$(date +%s)" appends the current timestamp to the filename, like this: emptyfile_1618679443, making it unlikely to collide with existing files in each directory. Notice you can change this to a random number like 67639871206723 if you need a fixed file name.
- xargs -L 1 -I {} reads the input one line at a time and assigns it to {}.
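The difference between the while loop and the for loop shows up as soon as a directory name contains a space. The sketch below is an illustration under assumptions: find -type d stands in for tree --noreport -dfi (both print one directory path per line) so that it runs where tree is not installed, the directory names are made up, and echo is dropped so files are really created.

```bash
# Scratch tree with one plain name and one name containing a space.
demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT
mkdir -p "$demo/plain" "$demo/with space"

# Reading line by line keeps "with space" as one path.
# IFS= and -r preserve leading blanks and backslashes in each line.
find "$demo" -type d | while IFS= read -r d; do
    touch "$d/emptyfile_$(date +%s)"
done

created=$(find "$demo" -type f -name 'emptyfile_*' | wc -l)
echo "created $created files"
```

Line-based reading still breaks on directory names that contain a newline; the 0-separated find | xargs -0 form discussed in the other answers avoids that.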
- Can't you use find -type d -print0 | xargs -0 ... to let it pass multiple args to each invocation of touch? Seems to me if you're going to use xargs instead of find -exec, avoiding one process startup per directory would be good. – Peter Cordes, Apr 17, 2021 at 21:00
- @PeterCordes That is possible like so find -type d -print0 | xargs -0 -I {} echo touch {}/emptyfil and there may be other possible ways as well... but find is already discussed in the other answers. I wanted to shed some light on tree which is a remarkable tool with great potential... Just for diversity's sake. Thank you for your comment. – Raffa, Apr 17, 2021 at 21:12
- The xargs -I option apparently implies -L 1, unfortunately, running touch once per arg. (Apparently a 0-separated string counts as a "line"). And -I {} -L 1000 echo touch {}/emptyfile doesn't work either, it prints touch {}/emptyfile . ./2 ./1 if run in a test directory. – Peter Cordes, Apr 17, 2021 at 21:17
- @PeterCordes I got what you mean the first time :) ... It would be nice if it works that way... but, ARG MAX might be reached easily for the OP's purpose and touch needs some format like touch {1/,2/,3/}emptyfile to address multiple directories and then ARG MAX awaits – Raffa, Apr 17, 2021 at 21:27
- Part of xargs's job is to break up input into arg-max chunks. But apparently only if we don't need xargs to modify each arg. Finally came up with a solution using find -printf to add stuff to each pathname: find . -type d -printf '%p/emptyfile\0' | xargs -0 echo touch – Peter Cordes, Apr 17, 2021 at 21:34