In my application, I perform a topological sort of nodes where an integer represents a node's ID using DFS and white/gray/black coloring.
See https://yuminlee2.medium.com/topological-sort-cf9f8e43af6a and the section titled "Method 2" for the technique.
As you can see in the example cited above (and typically for all samples of depth-first search I can find) the topological sort occurs with integers and results in an ordered list of integers representing nodes in the graph.
This worked great until I needed to represent a non-node as part of the graph. In my case, I needed the concept of a "system" of nodes.
This is a small tangent to explain the concept of "system". Consider a motor vehicle comprised of "parts." Each part is a node in the graph. For example, a "bolt' is a part. a "rubber hose" is a part.
Now consider an "engine". The engine itself is not a "part". Rather it is a "system" of parts that are related so intimately that we can call them an "engine". The "engine" is dependent on multiple parts (camshaft, block, pistons, sparkplugs, etc). It is an abstract concept since it has no exact "part" itself but nevertheless it has a definable set of parts that it depends on.
Because it has a directed, acyclic relationship to those specific parts that belong in the engine "system", the abstract concept of engine can be sorted.
BUT we arrive at the problem, as we now need an integer to represent not a physical part but an abstract system.
And in reality, we have multiple abstract overlapping systems; for example, an "engine" belongs to a "drivetrain" system; a "master cylinder" belongs to a "hydraulic" system and a "electrical" system; An "alternator" belongs to both an "engine" system and an "electrical" system.
These "systems" are all comprised of valid acyclic relationships that overlap in may ways but are nevertheless acyclic (ignore cyclic conditions for now)
As explained in the above vehicle example, and using the logic of graphs, we can always determine a sort order for both the parts and the systems of parts even if the systems overlap.
Previously in my app design, integers represented nodes/objects (i.e. parts) and always mapped directly to the physical object (the allocated class instance); now the graph sort is a mix of objects and systems.
Speaking in terms of software engineering (using C# language), in the old design, we could allocate Int32.MaxValue numbers of parts/nodes to store in a List<Part> where their integer ID represents an index in the list and Part is an abstract class instance/allocation.
But now, in order to perform a DFS topological sort, we need to assign integers to objects and arbitrary systems to perform the sort. Intuitively, if the integer id used for the sort was previously also an index into an array of allocations, this does not correspond cleanly with the introduction of systems.
Assuming the allocated nodes might occupy the range of integers from zero to N, I can only see two approaches to assigning integer id's to the non-nodes:
- The app logic needs to reserve an integer range to limit the number of nodes, let's call it
Nodes.MaxCount. By assigning an arbitrary upper bound, the range ofInt32.MaxValue - Nodes.MaxCountbecomes the available range of integers for non-nodes (i.e. systems). Then a system can get an id from the reserved range. - If I could perform a topological sort on a two-dimensional ID like
(int domain, int id)represented here as a tuple, it would enable a sort onInt32.MaxValuenumber of systems withInt32.MaxValuenumbers of elements in each system. Then a node in the graph could be of type(Node,id)or(System,id)and the integers will be able to handle overlaps (i.e. we can have both a node with id '1' and also a System with id '1').
The second clearly has more expressiveness and available range, but I don't know how to refactor a typical white/black topological sort to use this two-dimensional id (or if it is even possible).
EDIT:
Let's consider a hash map using the motor vehicle example. Some pseudocode:
enum SystemType { Default, Engine, Drivetrain, Hydraulic, Electric } Dictionary<(SystemType,guid),List<???>> allparts = new(); // is a hash map in C# Bolt b = new Bolt(); b.ID = Guid.NewGuid(); allparts.Add((SystemType.Default,b.ID),b.GetDependentParts()); allparts.Add((SystemType.Engine,b.ID),b.GetDependentEngineSpecificParts()); allparts.Add((SystemType.Drivetrain,b.ID),b.GetDependentDriveTrainSpecificParts()); This is some minimal pseudocode that attempts to store system-specific dependencies in a hash map with key type of (SystemType,guid). This tuple is easy to hash into a hashcode integer. Therefore, the "drivetrain" system can perform a lookup in the hash map to get a set of relevant parts to be used in sorting. The problem is that this logic is awful on space complexity and appears to be storing a logarithmic number of hash map entries for the elements. O(N_elements*N_systems). Also, The List<???> represents an array of dependent "somethings" since we are trying to solve this without integers ??? must be a type of something-else other than integers.
I'm not opposed to the hash map route, but this code is not optimal.

(node, list of nodes)which represents "parent -> children" relationship.