I want to download all files named ${n}x${n} from a directory on a website with wget2 on zsh, where n is the same number value both times, with n from 1 to 6000.

I've found that specifying all the download URLs as arguments to a single call to wget2 works much faster than a for loop calling wget2 once per URL.

I assume I could use brace expansion to populate an array, then modify the array elements to produce the correct 6,000 URLs with the number repeated, then pass the array as the arguments to wget2, but I'd prefer a more concise methodology, if possible.
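For reference, the array methodology I have in mind would look roughly like this (example.com stands in for the real host, and the loop body is just one way to do the "modify the elements" step):

```shell
# Build the 6000 repeated-number URLs in an array, then pass them
# all to a single wget2 invocation.
urls=()
for n in {1..6000}; do
  urls+=( "https://example.com/${n}x${n}" )
done
wget2 -q $urls    # zsh: an unquoted array expands to its elements
```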

Maybe a one-liner with brace expansion could work. I don't know, however, how to repeat a brace expansion value.

e.g.:

wget2 -q https://example.com/{1..6000}x<repeated-brace-expansion-value>
# like:
# wget2 -q https://example.com/1x1
# wget2 -q https://example.com/2x2
# …

How can I repeat a brace expansion value? Is there some other way to do this efficiently without brace expansions or without having to go through all the steps of creating an array?

2 Answers

Create a function which accepts <brace-expansion-value> as an argument. Then your function can re-use that argument as many times as you like:

get_one() {
    # $1 is a number 1..6000
    wget2 -q https://example.com/"$1"x"$1"
}

Now call that function once for each integer you want to iterate over.
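For instance, a plain loop over the integers might look like this (a sketch; get_one is the function defined above, repeated here so the snippet stands alone):

```shell
get_one() {
  # $1 is a number 1..6000
  wget2 -q https://example.com/"$1"x"$1"
}

# One call per integer; swap wget2 for echo inside get_one to dry-run.
for n in {1..6000}; do
  get_one "$n"
done
```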

Another alternative would be to avoid brace expansion and construct the wget2 command lines directly, using jot:

jot -w 'wget2 -q https://example.com/%dx%1$d' 6000 | ( head -5; echo ...; tail -5 )
wget2 -q https://example.com/1x1
wget2 -q https://example.com/2x2
wget2 -q https://example.com/3x3
wget2 -q https://example.com/4x4
wget2 -q https://example.com/5x5
...
wget2 -q https://example.com/5996x5996
wget2 -q https://example.com/5997x5997
wget2 -q https://example.com/5998x5998
wget2 -q https://example.com/5999x5999
wget2 -q https://example.com/6000x6000

Then pipe that to a shell.
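Concretely, that might look like this (assuming jot is available, as above):

```shell
# Generate one wget2 command per URL and hand the lot to a shell to run.
jot -w 'wget2 -q https://example.com/%dx%1$d' 6000 | sh
```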

If multiple calls to wget2 aren't your bag, then use jot solely for creating the URLs and pipe the URLs to xargs(1):

jot -w 'https://example.com/%dx%1$d' 6000 | xargs wget2 -q 

And indeed, as @muru points out, you could reduce that to:

jot -w 'https://example.com/%dx%1$d' 6000 | wget2 -qi- 
  • Calling wget2 once per value is much slower than calling it only once with multiple URLs. To call with multiple URLs, I could just build an array, then use it as the arguments for one call to wget2, but I want to see if there's any more concise way to do it. Commented Aug 26, 2024 at 22:26
  • That wasn't part of the original question, though. Commented Aug 26, 2024 at 22:32
  • @XDR that might work for 6000 URLs, if they're as short as your example shows. It's possible though that too many URLs that add up to a large enough combined length may trigger a "Too many arguments" or other sort of command buffer overrun. In which case xargs would be your friend. Commented Aug 26, 2024 at 23:10
  • @tink, it was, just phrased differently Commented Aug 26, 2024 at 23:34
  • You can just ask wget2 to read the URLs from stdin instead of using xargs to then run wget2: ... | wget2 -qi - Commented Aug 27, 2024 at 2:44

You can't use just brace expansion for that.

One approach would be to do the equivalent of perl's:

exec "wget2", map {"https://example.com/${_}x$_"} 1..6000 

In zsh, to do that mapping, there are a few options:

set -o extendedglob
(){wget ${@/(#m)*/https:\/\/example.com\/${MATCH}x$MATCH}} {1..6000}

Or the equivalent with the csh-style :s modifier which avoids having to escape the / (but is currently quite buggy when quoting is involved):

set -o extendedglob -o histsubstpattern
(){wget $@:s@(#m)*@https://example.com/${MATCH}x$MATCH} {1..6000}

Or:

urls=()
printf -v urls 'https://example.com/%1$sx%1$s' {1..6000}
wget2 $urls

Though building that same list with a loop is about as short:

urls=()
for ((i=1;i<=6000;i++)) urls+=https://example.com/${i}x${i}
wget2 $urls

(It's even shorter with for i ({1..6000}), though that's less flexible and means building the whole list of numbers in memory and then iterating over it.)

On most systems, those 6000 URLs are likely to make the command line too long for the execve() system call, so you'd need to split it, either with xargs or zsh's zargs; here, combined with printf, xargs is the handier of the two:

printf 'https://example.com/%1$sx%1$s\0' {1..6000} | xargs -0 wget2 

(Essentially the same as @JimL's approach, except it uses the builtin printf instead of BSD jot, and, by using NUL as the delimiter, it allows the arguments to contain anything.)
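The zargs route mentioned above can be sketched like this (zargs is an autoloadable function shipped with zsh; the URL list is built with a loop here purely for illustration):

```shell
# zsh only: zargs splits the argument list into execve()-sized batches,
# calling wget2 as few times as the size limits allow.
autoload -Uz zargs
urls=()
for i in {1..6000}; do urls+=( "https://example.com/${i}x${i}" ); done
zargs -- $urls -- wget2 -q
```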

Or here using wget2's -i which expects a newline-delimited list (OK for a list of URLs which can't contain newlines):

printf 'https://example.com/%1$sx%1$s\n' {1..6000} | wget2 -i- 
