2

I have a problem which I have been able to solve using Stata, but now my data has grown to a size that I cannot process it from memory any more. I hope to do this in MySQL instead. I am trying to calculate the Manhattan distance of items between n groups. I have manipulated the data so far that I hope it is ready to work for the computation:

SELECT * FROM exampleshares; +----------+-------------+-------------+ | item | group | share | +----------+-------------+-------------+ | A | group1 | .3 | | B | group1 | .7 | | A | group2 | .2 | | B | group2 | .6 | | C | group2 | .2 | | A | group3 | .3 | | C | group3 | .6 | +----------+-------------+-------------+ 

The Manhattan Distance of this example would be :

+----------+-------------+-------------+ | groupX | groupY | M distance | +----------+-------------+-------------+ | group1 | group1 | 0 | | group1 | group2 | .4 | | group1 | group3 | 1.3 | | group2 | group1 | .4 | | group2 | group2 | 0 | | group2 | group3 | 1.1 | | group3 | group1 | 1.3 | | group3 | group2 | 1.1 | | group3 | group3 | 0 | +----------+-------------+-------------+ 

For example, the distance between group1 and group2 is computed as |.3-.2|+|.7-.6|+|0-.2|=0.4, ie. the sum of the absolute difference in shares. How do I do this in MySQL?

During my search if found a couple of solutions to calculating the difference to a previous row by group, but nothing to what I am looking for specifically.

1 Answer 1

0

I believe you will have to use a stored routine or some other script to accomplish this. Here is a stored routine that will do it:

delimiter // drop procedure if exists manhattanDistance// create procedure manhattanDistance (in startGroup char(32), in endGroup char(32), out manhattanDistance decimal(2,1)) not deterministic reads sql data begin drop table if exists tmp_items; create temporary table tmp_items as select distinct item from exampleshares; select sum(abs(ifnull(es1.share, 0) - ifnull(es2.share, 0))) into manhattanDistance from tmp_items ti left join exampleshares es1 on es1.item = ti.item and es1.group = startGroup left join exampleshares es2 on es2.item = ti.item and es2.group = endGroup; end// delimiter ; call manhattanDistance('group1', 'group2', @distanceBetweenGroup1And2); select @distanceBetweenGroup1And2; 
Sign up to request clarification or add additional context in comments.

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.