0
$\begingroup$

I have calculated 26 MFCCs for two sample speech data. My mfcc matrices thus contain 26 columns and 120 rows each, where 120 is the number of frames. Now I want to apply DTW on them and I am doing this (on MATLAB):

mfcc1=mfcc1'; mfcc2=mfcc2'; M=simmx(mfcc1,mfcc2); [p,q,c]=dp(1-M); v=c(size(c,1),size(c,2)) 

which I have taken from this post

But I don't quite understand why the similarity matrix and why it has to be 1-M ?

Also if I exclude the first co-efficient, the result becomes totally invalid i.e it seems the first co-efficient is mandatory for DTW.

Is there something wrong with my approach? If it is, then how can I make it right?

dp.m

function [p,q,D] = dp(M) % [p,q] = dp(M) % Use dynamic programming to find a min-cost path through matrix M. % Return state sequence in p,q % 2003-03-15 [email protected] % Copyright (c) 2003 Dan Ellis <[email protected]> % released under GPL - see file COPYRIGHT [r,c] = size(M); % costs D = zeros(r+1, c+1); D(1,:) = NaN; D(:,1) = NaN; D(1,1) = 0; D(2:(r+1), 2:(c+1)) = M; % traceback phi = zeros(r,c); for i = 1:r; for j = 1:c; [dmax, tb] = min([D(i, j), D(i, j+1), D(i+1, j)]); D(i+1,j+1) = D(i+1,j+1)+dmax; phi(i,j) = tb; end end % Traceback from top left i = r; j = c; p = i; q = j; while i > 1 & j > 1 tb = phi(i,j); if (tb == 1) i = i-1; j = j-1; elseif (tb == 2) i = i-1; elseif (tb == 3) j = j-1; else error; end p = [i,p]; q = [j,q]; end % Strip off the edges of the D matrix before returning D = D(2:(r+1),2:(c+1)); 

simmx.m

function M = simmx(A,B) % M = simmx(A,B) % calculate a sim matrix between specgram-like feature matrices A and B. % size(M) = [size(A,2) size(B,2)]; A and B have same #rows. % 2003-03-15 [email protected] % Copyright (c) 2003 Dan Ellis <[email protected]> % released under GPL - see file COPYRIGHT EA = sqrt(sum(A.^2)); EB = sqrt(sum(B.^2)); %ncA = size(A,2); %ncB = size(B,2); %M = zeros(ncA, ncB); %for i = 1:ncA % for j = 1:ncB % % normalized inner product i.e. cos(angle between vectors) % M(i,j) = (A(:,i)'*B(:,j))/(EA(i)*EB(j)); % end %end % this is 10x faster M = (A'*B)./(EA'*EB); 
$\endgroup$
2
  • $\begingroup$ What is [p,q,c]=dp(1-M); Can you provide the matlab reference for the dp() function, I googled but could not find it quickly. This will help me understand what is going on with 1-M $\endgroup$ Commented Apr 4, 2017 at 16:24
  • $\begingroup$ @ruohoruotsi : Added the code.. $\endgroup$ Commented Apr 5, 2017 at 5:23

1 Answer 1

2
$\begingroup$

Let's unpack this code:

1 mfcc1=mfcc1'; 2 mfcc2=mfcc2'; 3 M=simmx(mfcc1,mfcc2); 4 [p,q,c]=dp(1-M); 5 v=c(size(c,1),size(c,2)) 

in line 3, M=simmx(mfcc1,mfcc2); is computing the Cosine Similarity, i.e. normalized inner product or the cosine of the angle between them. Here are my comments about why the authors used 1-M instead of M.

When computing dtw, you want to find the lowest-cost path through the "cost" matrix. Dan Ellis's implementation uses dynamic programming to find the lowest-cost path between the opposite corners of the cost matrix.

However Cosine Similarity (whose values are on the range [-1,1]) is not a proper distance metric (i.e. not strictly positive), but its complement, the Cosine Distance is more appropriate as a cost measure (i.e. strictly positive). The complement is computed by simply taking 1-M.

The Wikipage https://en.wikipedia.org/wiki/Cosine_similarity, elaborates a bit better: enter image description here

enter image description here

Finally, have a look at the author's (Dan Ellis) original code: https://labrosa.ee.columbia.edu/matlab/dtw/ You can see the comments about 1-M as well as the matrix he uses to illustrate the the lowest cost path. Good luck!

$\endgroup$

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.