Sometimes I only want to get the directory structure of a website; the files themselves are not important. I just want their names. Sort of like a mirror where every entry is just an empty dummy file.
Of course, doing a wget -r and afterwards running a script to empty all the files works fine, but it feels wasteful because it is kind to neither the server nor my bandwidth. A more efficient but even less elegant way is to manually stop and restart the process every time I hit a large file, or to set a very short time-out. At least that significantly reduces the amount of data I have to download.
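For reference, here is a minimal sketch of that wasteful workaround (assuming GNU coreutils and http://example.com as a stand-in URL): mirror everything, then truncate every downloaded file to zero bytes.

    # download the whole site, then empty every file afterwards
    wget -r http://example.com/
    find example.com -type f -exec truncate -s 0 {} +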
My question is: Can I make wget only create a file, but not download its content? Or am I using the wrong tool for the job?
You could use the --spider option. For example: wget -r -nv --spider http://example.com, then parse the output. That way you can see what example.html links to without downloading it first. There is no such thing as an "ls -R over HTTP", so spidering is your best option. And I believe you do save some bandwidth with --spider; for instance, I don't think image files and the like are downloaded.
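To turn the spider log into a tree of empty dummy files, something like the following sketch could work. The grep pattern and the 2>&1 redirection are assumptions on my part: wget logs to stderr and its spider output format differs between versions, so the URL-extraction step may need adjusting for your setup.

    # spider the site, pull the discovered URLs out of the log,
    # and recreate the structure locally as empty files
    wget -r -nv --spider http://example.com 2>&1 \
      | grep -o 'http://example\.com[^ ]*' \
      | sort -u \
      | while read -r url; do
          path="${url#http://}"                # e.g. example.com/dir/file.html
          case "$path" in
            */) mkdir -p "$path" ;;            # URL points at a directory
            *)  mkdir -p "$(dirname "$path")"  # recreate parent directories
                touch "$path" ;;               # empty dummy file in place of the real one
          esac
        done

This still makes one HEAD-style request per URL, but it avoids transferring the file bodies, which is the part that hurts the server and your bandwidth.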