In Cover and Thomas, "Elements of Information Theory", second edition, I encountered this question:
Let $K$ and $K_0$ be symmetric positive definite matrices of the same size (s.p.d.m). Show that $$f(K)=\log\left(\frac{\det(K+K_0)}{\det(K)}\right)$$ is a convex function of $K$ (here $\det$ denotes the determinant).
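Before looking for a proof I sanity-checked the claim numerically. This is only a sketch (random instances, not a proof); the helper `random_spd`, the problem size, and the tolerance are just my own choices:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_spd(n):
    """A random symmetric positive definite n x n matrix."""
    A = rng.standard_normal((n, n))
    return A @ A.T + n * np.eye(n)

def f(K, K0):
    """f(K) = log(det(K + K0) / det(K)), via slogdet for numerical stability."""
    return np.linalg.slogdet(K + K0)[1] - np.linalg.slogdet(K)[1]

n = 4
for _ in range(1000):
    K0, K1, K2 = random_spd(n), random_spd(n), random_spd(n)
    lam = rng.uniform(0, 1)
    lhs = f(lam * K1 + (1 - lam) * K2, K0)
    rhs = lam * f(K1, K0) + (1 - lam) * f(K2, K0)
    assert lhs <= rhs + 1e-9  # convexity inequality holds on these instances
```

No counterexamples turned up, so the statement at least looks plausible numerically.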
As a side note, one of the best things about information theory is that it can be used to prove unusual determinant identities and inequalities, such as Hadamard's inequality. I am only looking for an "information-theoretic" proof, not any other kind. As an example of the sort of argument I mean:
Prove that if $K_1$ and $K_2$ are s.p.d.m, then $$\det (K_1 + K_2) \geq \max \{\det(K_1),\det(K_2)\}$$ Proof:
Let $X \sim N(0,K_1)$ and $Y \sim N(0,K_2)$ be independent of each other. Then (here $n$ is the dimension of the matrices and $h(\cdot)$ is the differential entropy) $$h(X+Y) \geq h(X+Y|X) = h(Y),$$ where the inequality holds because conditioning cannot increase entropy and the equality follows from independence. Since $X+Y \sim N(0,K_1+K_2)$, this reads $$\frac{1}{2}\log((2\pi e)^n\det(K_1+K_2)) \geq \frac{1}{2}\log((2\pi e)^n\det(K_2)),$$ i.e. $\det(K_1+K_2) \geq \det(K_2)$. Swapping the roles of $K_1$ and $K_2$ gives $\det(K_1+K_2) \geq \det(K_1)$, which proves the claim.
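(The same kind of quick numpy check as above also confirms this inequality on random instances; again just a sketch, not part of the proof:)

```python
import numpy as np

rng = np.random.default_rng(1)

def random_spd(n):
    """A random symmetric positive definite n x n matrix."""
    A = rng.standard_normal((n, n))
    return A @ A.T + n * np.eye(n)

n = 4
for _ in range(1000):
    K1, K2 = random_spd(n), random_spd(n)
    lhs = np.linalg.det(K1 + K2)
    rhs = max(np.linalg.det(K1), np.linalg.det(K2))
    assert lhs >= rhs * (1 - 1e-9)  # det(K1 + K2) >= max{det(K1), det(K2)}
```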
Here is my attempt at the original problem:
We need to show that for $\lambda \in (0,1)$, $$\log\left(\frac{\det(\lambda K_1 + (1-\lambda)K_2 +K_0 )}{\det(\lambda K_1 + (1-\lambda)K_2)}\right) \leq \lambda \log\left(\frac{\det(K_1+K_0)}{\det(K_1)}\right) + (1-\lambda)\log\left(\frac{\det(K_2+K_0)}{\det(K_2)}\right)$$ Let $X_1 \sim N(0,K_1)$, $X_2 \sim N(0,K_2)$ and $Y\sim N(0,K_0)$ be mutually independent. Let $v$ be a random variable, independent of $(X_1,X_2,Y)$, which equals $1$ w.p. $\lambda$ and $2$ w.p. $1-\lambda$. Let $K_v =\lambda K_1 + (1-\lambda)K_2$. Note that $E[X_vX_v^T]= K_v$. Let $\widetilde{X}\sim N(0,K_v)$ be independent of $Y$. Note that $X_v$ is not necessarily multivariate Gaussian. Hence it boils down to showing $$h(\widetilde{X}+Y) - h(\widetilde{X}) \leq h(X_v+Y|v) - h(X_v|v)$$
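To spell out why that last inequality would suffice: $\widetilde{X}$ and $\widetilde{X}+Y$ are Gaussian with covariances $K_v$ and $K_v+K_0$, and conditioning on $v$ splits the entropies of $X_v$ and $X_v+Y$ into their Gaussian components, so $$h(\widetilde{X}+Y) - h(\widetilde{X}) = \frac{1}{2}\log\left(\frac{\det(K_v+K_0)}{\det(K_v)}\right),$$ $$h(X_v+Y|v) - h(X_v|v) = \frac{\lambda}{2}\log\left(\frac{\det(K_1+K_0)}{\det(K_1)}\right) + \frac{1-\lambda}{2}\log\left(\frac{\det(K_2+K_0)}{\det(K_2)}\right),$$ and multiplying the claimed entropy inequality by $2$ gives exactly the convexity inequality above.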
I am not sure whether the inequality above is even true; I only know that it would imply the result. In any case, I was able to get $$h(\widetilde{X}) \geq h(X_v) \geq h(X_v|v),$$ since $X_v$ and $\widetilde{X}$ have the same covariance matrix and the Gaussian maximizes differential entropy for a given covariance, while conditioning cannot increase entropy.
But I couldn't get a similar chain for $h(\widetilde{X}+Y)$; see the note below on why it goes the wrong way. I tried to use the entropy power inequality here, but it didn't seem to work. My guess is that this approach won't work and that I should instead try to manipulate mutual information terms or something.
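To make precise what goes wrong: $X_v+Y$ also has covariance $K_v+K_0$, so the same two facts only give $$h(\widetilde{X}+Y) \geq h(X_v+Y) \geq h(X_v+Y|v),$$ which bounds $h(\widetilde{X}+Y)$ from below rather than from above, i.e. the opposite of the direction needed for the sum term.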
I'd appreciate any ideas on this. As always, if something is unclear, let me know.