1

The goal is to find the optimal description of a subset of countries, using a combination of individual countries and groups of countries that are predefined.

Set of countries is comprised of all 2 letter ISO country codes, (US/DE/NL/etc)

Groups of countries are fixed, for example:

  • GROUP1: NL,BE,LU
  • GROUP2: NL,BE,LU,CZ,US,FI,NO,GI,FR,DK

I want to have a representation of this list of countries:

  • CZ,US,FI,NO,GI,FR,DK,DE

I would like to shorten this using the defined groups, to the representation with the least amount of elements. For this example that would be:

  • +GROUP2 +DE -GROUP1

Because if we expand this back out we get:

  • Add GROUP2: NL,BE,LU,CZ,US,FI,NO,GI,FR,DK
  • Add DE
  • Remove GROUP1: NL,BE,LU Which brings us back to CZ,US,FI,NO,GI,FR,DK,DE

Any pointers to libraries or examples are welcome in any programming language

I'm at a loss what a good approach would be. I've tried looking into existing algorithms, comparable problems, but I can't seem to wrap my head around the problem and find a good fit.

Just to illustrate a simple brute force solution that will obviously not scale:

let countries = ['C1', 'C2', 'C3', 'C4', 'C5', 'C6'] let groups = { G1: ['C1', 'C2', 'C3', 'C4', 'C5'], G2: ['C1', 'C4'], } let output = getBest(['C2', 'C3', 'C5', 'C6']) // output == ["+C6", "+G1", "-G2"] function getBest(input) { const ids = countries.concat(Object.keys(groups)) let best = input for (const t of combinations(ids)) { if (expand(t, groups).sort().toString() == input.toString()) { if (t.length < best.length) best = [...t] } } return best } // Expands short form to long form function expand(s, groups) { return Array.from( s.sort().reduce((acc, curr) => { let symbol = curr[0] let id = curr.slice(1) if (groups[id]) { curr = groups[id] } else { curr = [id] } if (symbol == '+') { return new Set([...acc, ...curr]) } else { return new Set([...acc].filter((a) => !curr.includes(a))) } }, new Set()) ) } // Yields all possible short form options function* combinations(array) { array = ['+', '-'].reduce((acc, curr) => { return acc.concat(array.map((s) => curr + s)) }, []) for (const subset of subsets(array)) { yield subset } } // Creates powerset of array function* subsets(array, offset = 0) { while (offset < array.length) { let first = array[offset++] for (let subset of subsets(array, offset)) { subset.push(first) yield subset } } yield [] } 
2
  • 1
    Could you further elaborate on how you define the shortest possible description? Tbh, your example is not really clear (at least to me). That said, one approach is probably to model (and solve) your problem as an mixed-integer optimization problem. Commented Jan 4, 2023 at 14:10
  • @joni I tried to clarify my question a little bit. I'd like to describe a set of countries by substituting groups with their names. Or, if it's advantageous, remove a country or smaller group from a bigger group. Commented Jan 4, 2023 at 14:59

1 Answer 1

1

To me, the problem does not sound like there exists a classical model for it with a well-known polynomial-time algorithm.

A reasonable approach looks as follows.

Consider formulas in a formal language, like (G1 subtract G2) unite (G3 intersect G4), where Gi are predefined groups (and perhaps individual countries, but that will slow the solution down a lot).

Each formula's score is its length plus the size of difference with the desired answer (so that we add or subtract individual elements as the last step of the formula).

Now, for formulas of lengths 0, 1, 2, ..., (iterative deepening), recursively generate all possible formulas of such length and consider their score. Stop when the length reaches the score of the best answer so far.

There's some room to optimize (for example, prune clearly stupid branches like Gi symdiff Gi, or perhaps memoize the shortest formula for each set we can obtain), but the solution is nevertheless exponential in the number of items and groups.

Sign up to request clarification or add additional context in comments.

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.