Python 2, 157 bytes
def f(s,o=0,d=0,D={}):T=s,o,d;x=D[T]=D[T]if T in D else~o and 0**o+sum(f(s[1:],cmp(c,"[")%-3-~o,d or cmp(c,s[0]))for c in"+,-.<>[]")if s else~d<0==o;return+x
Still looks pretty golfable, but I'm posting this for now. It uses recursion with a bit of caching. Annoyingly, D.get doesn't short circuit for the caching, so I can't save 9 bytes that way...
The mapping prioritises length first, then lexicographical order over the ordering "][><.-,+" (see output examples below). The main idea is to compare prefixes.
The variable o keeps track of the number of [ brackets still open for the current prefix, while the variable d takes one of three values indicating:
d = 1: The current prefix is lexicographically earlier than s. Add all programs with this prefix and length <= s, d = -1: The current prefix is lexicographically later than s. Add all programs with this prefix and length < s. d = 0: The current prefix is a prefix of s, so we might change d to 1 or -1 later.
For example, if we have s = "[-]" and our current prefix is p = "+", since p is later than s lexicographically we know only to add the programs starting with p which are strictly shorter than s.
To give a more detailed example, suppose we have an input program s = "-[]". The first recursive expansion does this:
(o == 0) # Adds a program shorter than s if it's valid # For the first expansion, this is 1 for the empty program + f(s[1:], o=-1, d=1) # ']', o goes down by one due to closing bracket + f(s[1:], o=1, d=1) # '[', o goes up by one due to opening bracket + f(s[1:], o=0, d=1) # '>' + f(s[1:], o=0, d=1) # '<' + f(s[1:], o=0, d=1) # '.', d is set to 1 for this and the previous branches # since they are lexicographically earlier than s's first char + f(s[1:], o=0, d=0) # '-', d is still 0 since this is equal to s's first char + f(s[1:], o=0, d=-1) # ',', d is set to -1 for this and the later branches # since they are lexicographically later than s's first char + f(s[1:], o=0, d=-1) # '+'
Note how we don't actually use the prefixes in the recursion - all we care about them is captured through the variables d, o and the shrinking input program s. You'll notice a lot of repetition above - this is where caching comes in, allowing us to process 100-char programs well within the time limit.
When s is empty, we look at (d>=0 and o==0), which decides whether to return 1 (count this program because it's lexicographically early/equal and the program is valid), or 0 (don't count this program).
Any situtation with o < 0 immediately returns 0, since any programs with this prefix have more ]s than [, and are thus invalid.
The first 20 outputs are:
1 > 2 < 3 . 4 - 5 , 6 + 7 [] 8 >> 9 >< 10 >. 11 >- 12 >, 13 >+ 14 <> 15 << 16 <. 17 <- 18 <, 19 <+ 20
Using the same Hello World example as @TheNumberOne's answer:
>>> f("++++++++[>++++[>++>+++>+++>+<<<<-]>+>+>->>+[<]<-]>>.>---.+++++++..+++.>>.<-.<.+++.------.--------.>>+.>++.") 3465145076881283052460228065290888888678172704871007535700516169748342312215139431629577335423L