Skip to main content
11 of 14
added 54 characters in body; Post Made Community Wiki
chris
  • 23.4k
  • 5
  • 63
  • 154

I did some computation of formal derivatives a while back which might be of interest in this context (though keep in mind that this is anything but bullet proof! it does work for the cases I bothered to check though).

Clear[a]; Format[a[k_]] = Subscript[a, k] 

Let us say we have an objective function which is formally a function of the vector a[i]

Q = Sum[Log[Sum[a[r] Subscript[B, r][Subscript[x, i]], {r, 1, p}]/ Sum[a[r] , {r, 1, p}]], {i, 1, n}] 

Mathematica graphics

Let us define a couple of rules for formal differentiation

Clear[d]; d[Log[x_], a[k_]] := 1/x d[x, a[k]] d[Sum[x_, y__], a[k_]] := Sum[d[x, a[k]], y] d[ a[k_] b_., a[k_]] := b /; FreeQ[b, a] d[ a[q_] b_., a[k_]] := b Subscript[δ, k, q] /; FreeQ[b, a] d[ c_ b_, a[k_]] := d[c, a[k]] b + d[b, a[k]] c d[ b_ + c_, a[k_]] := d[c, a[k]] + d[b, a[k]] d[Subscript[δ, r_, q_], a[k_]] := 0 d[x_, a[k_]] := 0 /; FreeQ[x, a] d[G_^n_, a[k_]] := n G^(n - 1) d[G , a[k]] Unprotect[Sum]; Attributes[Sum] = {ReadProtected};Protect[Sum]; 

And a rule to deal with Kroneckers

ds = {Sum[a_ + b_, {s_, 1, p_}] :> Sum[a, {s, 1, p}] + Sum[b, {s, 1, p}], Sum[ y_ Subscript[δ, r_, s_], {s_, 1, p_}] :> (y /. s -> r), Sum[ y_ Subscript[δ, s_, r_], {s_, 1, p_}] :> (y /. s -> r), Sum[ Subscript[δ, s_, r_], {r_, 1, p_}] :> 1, Sum[δ[i_, k_] δ[j_, k_] y_. , {k_, n_}] -> δ[i, j] (y /. k -> i), Sum[a_ b_, {r_, 1, p_}] :> a Sum[b, {r, 1, p}] /; NumberQ[a]} 

Then the gradient of Q with respect to one of the a[k] reads

grad = d[Q, a[k]] /. ds // Simplify; 

Mathematica graphics

Similarly the tensor of second derivatives w.r.t. a[k] and a[s] is

hess = d[d[Q, a[k]], a[s]] /. ds // Simplify 

Mathematica graphics

As a last less trivial example let us consider the 4th order derivatives of Q

 d[d[d[d[Q, a[k]], a[s]], a[m]], a[t]]; /. ds // Simplify 

Mathematica graphics

For the problem at hand we check easily that

 Q = Sum[r a[r] , {r, 1, p}]; grad = d[Q, a[k]] // Simplify; grad //. ds 

returns k as it should

EDIT

This process can be made a bit more general, say, on this Objective function

 Q = 1/2 Sum[(Sum[a[r] Subscript[B, r, i][a[q]], {r, 1, p}] - Subscript[y, i])^2, {i, 1, n}] 

Mathematica graphics

which depends non linearly on a[k] via B.

All we need is to add a new rule for d

d[H_[a[q_]], a[k_]] := (D[H[x] , x] /. x -> a[k] ) Subscript[δ, k, q] 

Now we readily have

grad = d[Q, a[k]] // Simplify; hess = d[d[Q, a[k]], a[s]]; grad //. ds 

Mathematica graphics

hess /. ds // Simplify 

Mathematica graphics

As a final example, let us look at a parametrized entropy distance,

 Q = -Sum[(Sum[a[r] Subscript[B, r, i], {r, 1, p}]/ Subscript[y, i]) Log[(Sum[a[r] Subscript[B, r, i], {r, 1, p}]/ Subscript[y, i])], {i, 1, n}] 

Mathematica graphics

we can compute its Hessian while mapping twice the sum rule

 Map[# /. ds &, d[d[Q, a[k]], a[s]] /. ds] 

Mathematica graphics

chris
  • 23.4k
  • 5
  • 63
  • 154