GAP, 416 bytes
Won't win on code size, and it's far from constant time, but it uses math to speed things up a lot!
    x:=X(Integers);
    z:=CoefficientsOfUnivariatePolynomial;
    s:=Size;
    f:=function(n)
      local r,c,p,d,l,u,t;
      t:=0;
      for r in [1..Int((n+1)/2)] do
        for c in [r..n-r+1] do
          l:=z(Sum([1..26],i->x^i)^(n-c));
          for p in Partitions(c,r) do
            d:=x;
            for u in List(p,k->z(Sum([0..9],i->x^i)^k)) do
              d:=Sum([2..s(u)],i->u[i]*Value(d,x^(i-1))mod x^s(l));
            od;
            d:=z(d);
            t:=t+Binomial(n-c+1,r)*NrArrangements(p,r)*
              Sum([2..s(d)],i->d[i]*l[i]);
          od;
        od;
      od;
      return t;
    end;
To squeeze out the unnecessary whitespace and get one line with 416 bytes, pipe through this:
    sed -e 's/^ *//' -e 's/in \[/in[/' -e 's/ do/do /' | tr -d \\n
My old "designed for Windows XP" laptop can calculate f(10) in less than one minute and go much further in under an hour:
    gap> for i in [2..15] do Print(i,": ",f(i),"\n");od;
    2: 18
    3: 355
    4: 8012
    5: 218153
    6: 6580075
    7: 203255386
    8: 6264526999
    9: 194290723825
    10: 6116413503390
    11: 194934846864269
    12: 6243848646446924
    13: 199935073535438637
    14: 6388304296115023687
    15: 203727592114009839797
How it works
Suppose that we first only want to know the number of perfect license plates fitting the pattern LDDLLDL, where L denotes a letter and D denotes a digit. Assume we have a list l of numbers such that l[i] gives the number of ways the letters can give the value i, and a similar list d for the values we get from the digits. Then the number of perfect license plates with common value i is just l[i]*d[i], and we get the number of all perfect license plates with our pattern by summing this over all i. Let's denote the operation of getting this sum by l@d.
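In C++, @ is just a dot product over the common indices; here is a minimal sketch (the name at and the 64-bit counters are my choices, not part of the real program):

    #include<algorithm>
    #include<cstdint>
    #include<vector>

    // l[i] = number of ways the letters give value i,
    // d[i] = number of ways the digits give value i.
    // l@d = number of perfect plates with one fixed pattern.
    // (l[0] is 0 as soon as the pattern contains a letter.)
    uint64_t at(const std::vector<uint64_t> &l, const std::vector<uint64_t> &d){
      uint64_t t = 0;
      for(std::size_t i = 0; i < std::min(l.size(), d.size()); ++i)
        t += l[i] * d[i];
      return t;
    }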
Now even if the best way to get these lists was to try all combinations and count, we can do this independently for the letters and the digits, looking at 26^4+10^3 cases instead of 26^4*10^3 cases when we just run through all plates fitting the pattern. But we can do much better: l is just the list of coefficients of (x+x^2+...+x^26)^k where k is the number of letters, here 4.
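For instance, the list l for a run of letters can be computed like this; a sketch with 64-bit counters (they overflow quickly, which is why the real program below uses GMP):

    #include<cstdint>
    #include<vector>
    using poly = std::vector<uint64_t>;

    // Ordinary polynomial product: exponents (letter values) add.
    poly mul(const poly &a, const poly &b){
      poly r(a.size()+b.size()-1, 0);
      for(std::size_t i=0; i<a.size(); ++i)
        for(std::size_t j=0; j<b.size(); ++j)
          r[i+j] += a[i]*b[j];
      return r;
    }

    // Coefficients of (x+x^2+...+x^26)^k: entry [v] counts the ways
    // k letters (A=1,...,Z=26) can sum to v.
    poly letter_counts(int k){
      poly letter(27, 1);
      letter[0] = 0;          // no letter has value 0
      poly r { 1 };           // empty product
      while(k-- > 0) r = mul(r, letter);
      return r;
    }

Here letter_counts(4) gives the list l for our four letters, ready for the @ operation above.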
Similarly, we get the numbers of ways to get a sum of digits in a run of k digits as the coefficients of (1+x+...+x^9)^k. If there is more than one run of digits, we need to combine the corresponding lists with an operation d1#d2 that at position i has as value the sum of all d1[i1]*d2[i2] where i1*i2=i. This is the Dirichlet convolution, which is just the product if we interpret the lists as coefficients of Dirichlet series. But we have already used them as polynomials (finite power series), and there is no good way to interpret the operation for them. I think this mismatch is part of what makes it hard to find a simple formula. Let's use it on polynomials anyway and use the same notation #. It is easy to compute when one operand is a monomial: we have p(x) # x^k = p(x^k). Together with the fact that it is bilinear, this gives a nice (but not very efficient) way to compute it.
Note that k letters give a value of at most 26k, while k runs of single digits can give a value of up to 9^k. So we will often get unneeded high powers in the d polynomial. To get rid of them, we can compute modulo x^(maxlettervalue+1). This gives a big speed-up and, though I didn't immediately notice, even helps golfing, because we now know that the degree of d is not bigger than that of l, which simplifies the upper limit in the final Sum. We get an even better speed-up by doing a mod calculation in the first argument of Value (see comments), and doing the whole # computation at a lower level gives an incredible speed-up. But we're still trying to be a legitimate answer to a golfing problem.
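Here is a sketch of # on coefficient lists with that truncation built in (names and 64-bit counters are again mine):

    #include<cstdint>
    #include<vector>
    using poly = std::vector<uint64_t>;

    // Dirichlet convolution truncated at exponent cap (the maximum letter
    // value): by bilinearity and p(x) # x^k = p(x^k), each pair of
    // exponents (i,j) contributes a[i]*b[j] to exponent i*j.
    poly dirichlet(const poly &a, const poly &b, std::size_t cap){
      poly r(cap+1, 0);
      for(std::size_t i=1; i<a.size() && i<=cap; ++i)
        for(std::size_t j=1; j<b.size() && i*j<=cap; ++j)
          r[i*j] += a[i]*b[j];
      return r;
    }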
So we have got our l and d and can use them to compute the number of perfect license plates with pattern LDDLLDL. That is the same number as for the pattern LDLLDDL. In general, we can change the order of the runs of digits of different lengths as we like; NrArrangements gives the number of possibilities. And while there must be at least one letter between the runs of digits, the other letters are not fixed: the n-c letters leave n-c+1 slots (counting both ends), and we choose r of them to hold the runs. The Binomial counts these possibilities.
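In case it helps, here is what NrArrangements(p,r) amounts to in this situation (nr_arrangements is my name for it; r is the length of p): r! divided by m! for every run length that occurs m times.

    #include<cstdint>
    #include<map>
    #include<vector>

    // Distinct orderings of the runs whose lengths form the multiset p:
    // r! divided by the factorial of each length's multiplicity.
    uint64_t nr_arrangements(const std::vector<int> &p){
      std::map<int,int> mult;
      for(int len : p) ++mult[len];
      uint64_t r = 1;
      for(std::size_t i=2; i<=p.size(); ++i) r *= i;   // r!
      for(const auto &e : mult)
        for(int i=2; i<=e.second; ++i) r /= i;         // divide by m! per length
      return r;
    }

For p = [2,1,1], for example, this gives 3!/2! = 3 possible orders of the runs.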
Now it remains to run through all possible ways of choosing the lengths of the runs of digits: r runs through all numbers of runs, c through all total numbers of digits, and p through all partitions of c with r summands.
The total number of partitions we look at is two less than the number of partitions of n+1, and the partition function grows like exp(c*sqrt(n)). So while there are still easy ways to improve the running time by reusing results (running through the partitions in a different order), for a fundamental improvement we need to avoid looking at each partition separately.
Computing it fast
Note that (p+q)@r = p@r + q@r. On its own, this just helps avoid some multiplications. But together with (p+q)#r = p#r + q#r it means that we can combine polynomials corresponding to different partitions by simple addition. We can't just add them all, because we still need to know with which l we have to @-combine them, which factor we have to use, and which #-extensions are still possible.
Let's combine all polynomials corresponding to partitions with the same sum and length, and account right away for the multiple ways of distributing the lengths of the runs of digits. Contrary to what I speculated in the comments, I don't need to care about the smallest used value or how often it is used, as long as I make sure that I won't extend with that value.
Here is my C++ code:
    #include<vector>
    #include<algorithm>
    #include<iostream>
    #include<gmpxx.h>

    using bignum = mpz_class;
    using poly = std::vector<bignum>;

    // Ordinary polynomial multiplication.
    poly mult(const poly &a, const poly &b){
      poly res ( a.size()+b.size()-1 );
      for(int i=0; i<a.size(); ++i)
        for(int j=0; j<b.size(); ++j)
          res[i+j]+=a[i]*b[j];
      return res;
    }

    // One #-extension of d by the run polynomial e, truncated at degree
    // 26*ml (the maximum letter value). The factor l/m accounts for the
    // arrangements; the result is accumulated into a.
    poly extend(const poly &d, const poly &e, int ml, poly &a, int l, int m){
      poly res ( 26*ml+1 );
      for(int i=1; i<std::min<int>(1+26*ml,e.size()); ++i)
        for(int j=1; j<std::min<int>(1+26*ml/i,d.size()); ++j)
          res[i*j] += e[i]*d[j];
      for(int i=1; i<res.size(); ++i)
        res[i]=res[i]*l/m;
      if(a.empty())
        a = poly { res };
      else
        for(int i=1; i<a.size(); ++i)
          a[i]+=res[i];
      return res;
    }

    bignum f(int n){
      // dp[i]: coefficients of (1+x+...+x^9)^i, truncated.
      std::vector<poly> dp;
      poly digits (10,1);
      poly dd { 1 };
      dp.push_back( dd );
      for(int i=1; i<n; ++i){
        dd=mult(dd,digits);
        int l=1+26*(n-i);
        if(dd.size()>l) dd.resize(l);
        dp.push_back(dd);
      }
      // a[sum][len]: combined polynomial for all partitions with this
      // total number of digits and this number of runs.
      std::vector<std::vector<poly>> a;
      a.reserve(n);
      a.push_back( std::vector<poly> { poly { 0, 1 } } );
      for(int i=1; i<n; ++i)
        a.push_back( std::vector<poly> (1+std::min(i,n+1-i)));
      for(int m=n-1; m>0; --m){
        // std::cout << "m=" << m << "\n";
        for(int sum=n-m; sum>=0; --sum)
          for(int len=0; len<=std::min(sum,n+1-sum); ++len){
            poly d {a[sum][len]} ;
            if(!d.empty())
              for(int sumn=sum+m, lenn=len+1, e=1; sumn+lenn-1<=n;
                  sumn+=m, ++lenn, ++e)
                d=extend(d,dp[m],n-sumn,a[sumn][lenn],lenn,e);
          }
      }
      // Combine with the letter polynomials and the binomial factors.
      poly let (27,1);
      let[0]=0;
      poly lp { 1 };
      bignum t { 0 };
      for(int sum=n-1; sum>0; --sum){
        lp=mult(lp,let);
        for(int len=1; len<=std::min(sum,n+1-sum); ++len){
          poly &a0 = a[sum][len];
          bignum s {0};
          for(int i=1; i<std::min(a0.size(),lp.size()); ++i)
            s+=a0[i]*lp[i];
          bignum bin;
          mpz_bin_uiui( bin.get_mpz_t(), n-sum+1, len );
          t+=bin*s;
        }
      }
      return t;
    }

    int main(){
      int n;
      std::cin >> n;
      std::cout << f(n) << "\n" ;
    }
This uses the GNU MP library. On Debian, install libgmp-dev. Compile with g++ -std=c++11 -O3 -o pl pl.cpp -lgmp -lgmpxx. The program takes its argument from stdin. For timing, use echo 100 | time ./pl.
At the end, a[sum][length][i] gives the number of ways in which sum digits in length runs can give the number i. During the computation, at the beginning of the m loop, it gives the number of ways that can be done with runs longer than m. It all starts with a[0][0][1]=1. Note that this is a superset of the numbers we need to compute the function for smaller values, so in almost the same time, we could compute all values up to n.
There is no recursion, so we have a fixed number of nested loops (the deepest nesting level is 6). Each loop goes through a number of values that is linear in n in the worst case, so we need only polynomial time. If we look closer at the nested i and j loops in extend, we find an upper limit for j of the form N/i; summing N/i over i gives O(N log N), so the j loop should only contribute a logarithmic factor. The innermost loop in f (the one with sumn) is similar. Also keep in mind that we compute with numbers that grow fast.
Note also that we store O(n^3) of these numbers.
Experimentally, I get these results on reasonable hardware (i5-4590S): f(50) needs one second and 23 MB, f(100) needs 21 seconds and 166 MB, f(200) needs 10 minutes and 1.5 GB, and f(300) needs one hour and 5.6 GB. This suggests a time complexity better than O(n^5).