I have a problem which I have been able to solve using Stata, but now my data has grown to a size that I cannot process it from memory any more. I hope to do this in MySQL instead. I am trying to calculate the Manhattan distance of items between n groups. I have manipulated the data so far that I hope it is ready to work for the computation:
SELECT * FROM exampleshares; +----------+-------------+-------------+ | item | group | share | +----------+-------------+-------------+ | A | group1 | .3 | | B | group1 | .7 | | A | group2 | .2 | | B | group2 | .6 | | C | group2 | .2 | | A | group3 | .3 | | C | group3 | .6 | +----------+-------------+-------------+ The Manhattan Distance of this example would be :
+----------+-------------+-------------+ | groupX | groupY | M distance | +----------+-------------+-------------+ | group1 | group1 | 0 | | group1 | group2 | .4 | | group1 | group3 | 1.3 | | group2 | group1 | .4 | | group2 | group2 | 0 | | group2 | group3 | 1.1 | | group3 | group1 | 1.3 | | group3 | group2 | 1.1 | | group3 | group3 | 0 | +----------+-------------+-------------+ For example, the distance between group1 and group2 is computed as |.3-.2|+|.7-.6|+|0-.2|=0.4, ie. the sum of the absolute difference in shares. How do I do this in MySQL?
During my search if found a couple of solutions to calculating the difference to a previous row by group, but nothing to what I am looking for specifically.