Correct me if I'm wrong, but having read a lot about the ECS pattern, both in general and in the context of current game engines, I am beginning to wonder whether the ECS pattern at its core is little more than an object-oriented design in disguise, with added reusability.
To clarify my point: first of all, there are good answers here describing how an ECS approach generally works (e.g. here, though I would argue about the HashMap approach). Furthermore, Unity added a lot of functionality in this regard with DOTS in 2019.
In this Unity blog post describing the changes, the differences to traditional "gameobjects", and the rough implementation Unity uses, one paragraph raised some questions:
[...] In an earlier draft of this post I wrote “we store entities in chunks”, and later changed it to “we store component data for entities in chunks”. It’s an important distinction to make, to realize that an Entity is just a 32-bit integer. There is nothing to store or allocate for it, other than the data of its components. Because they’re so cheap, you can use them for scenarios that game objects weren’t suitable for. Like using an entity for each individual particle in a particle system.
There is what I'd call hype around the ECS pattern and why it is so much better than a traditional approach that uses objects as entities and components, with their classes playing the part of the systems. Having done my fair share of projects on different topics and in different programming languages, I'd say an instance of a class without any variables or functions is, on an abstract level, nothing more than a statement of its existence, most likely an address in memory.
So it would also satisfy this part
to realize that an Entity is just a 32-bit integer. There is nothing to store or allocate for it
of the quote (apart from the name).
Moreover, a class is a collection of variables (or fields/members or whatever word you prefer), which could also be called data components. Together with the class's functions .. is it an ECS-based design if you group the objects by their class when processing multiple objects?
If you compare the ECS pattern to a traditional particle system, which usually consists of arrays of variables so it can store different data for each of its particles, the similarities are even more obvious.
    public class ParticleSystem {
        int numParticles;
        float[] x;
        float[] y;
        float[] size;

        public void update() {
            move();
            resize();
        }

        public void move() {
            for (int i = 0; i < numParticles; ++i) {
                x[i] += 0.2f;
                y[i] -= 0.1f;
            }
        }

        public void resize() {
            for (int i = 0; i < numParticles; ++i) {
                size[i] *= 0.95f;
            }
        }
    }

Take this over-simplified particle system as an example. It would be treated as a traditional "gameobject". Now you could write it differently ..
    public class Position {
        float x, y;
    }

    public class Size {
        float size;
    }

    public class ParticleSystem {
        int numParticles;
        Position[] positions;
        Size[] sizes;
        MoveSystem moveSystem;
        ResizeSystem resizeSystem;

        public void update() {
            moveSystem.execute();
            resizeSystem.execute();
        }
    }

    public class MoveSystem {
        Position[] positions;

        public MoveSystem(Position[] pos) {
            positions = pos;
        }

        public void execute() {
            for (int i = 0; i < positions.length; ++i) {
                positions[i].x += 0.2f;
                positions[i].y -= 0.1f;
            }
        }
    }

    public class ResizeSystem {
        Size[] sizes;

        public ResizeSystem(Size[] s) {
            sizes = s;
        }

        public void execute() {
            for (int i = 0; i < sizes.length; ++i) {
                sizes[i].size *= 0.95f;
            }
        }
    }

.. and it looks like an ECS-based particle system, with the main class as the "manager" of the entities. Every particle (entity) is just a 32-bit integer (its index). There is nothing to store or allocate for it, other than the data of its [two] components.
(I do know this is neither elegant nor complete code)
You could write every class that inherits from a basic "gameobject" as such a collection of instances instead. Imagine a virtual meadow with sheep roaming around. They need various components for logic, physics, rendering and AI, but you could easily design them like a classic particle system.
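For instance, a deliberately crude sketch of what I mean (the fields are made up, of course):

    // Crude sketch of a "sheep system" written exactly like the particle system above:
    // one array per piece of data instead of one Sheep object per animal.
    public class SheepFlock {
        int numSheep;
        float[] x, y;       // physics / position data
        float[] hunger;     // logic / AI data
        int[] meshId;       // rendering data

        public void update() {
            for (int i = 0; i < numSheep; ++i) {
                hunger[i] += 0.01f;   // "ai system"
                x[i] += 0.05f;        // "movement system"
            }
        }
    }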
And if you read the whole post, it describes an "archetype" concept with "chunks" of up to 16k, each holding only entities of the same kind, and systems running in a linear loop over all the chunks of all archetypes that feature the required components, and over all entities in those chunks. Which sounds a lot like a "particle system" for each archetype of entities. It just hides the archetype class from the user ..?
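To illustrate how I read that, here is a rough sketch (the class names and the capacity calculation are mine, not Unity's):

    // One chunk stores the component arrays of entities that all share the same
    // archetype (here: Position + Size). The capacity is purely illustrative.
    public class PositionSizeChunk {
        static final int CAPACITY = 16 * 1024 / (3 * Float.BYTES);
        int count;                                // entities currently in this chunk
        float[] posX = new float[CAPACITY];
        float[] posY = new float[CAPACITY];
        float[] size = new float[CAPACITY];
    }

    // A system then loops linearly over every chunk of every archetype that has the
    // components it needs - structurally the same as the particle loop above.
    public class ChunkMoveSystem {
        public void execute(Iterable<PositionSizeChunk> chunks) {
            for (PositionSizeChunk chunk : chunks) {
                for (int i = 0; i < chunk.count; ++i) {
                    chunk.posX[i] += 0.2f;
                    chunk.posY[i] -= 0.1f;
                }
            }
        }
    }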
Edit1:
I am not struggling to understand the specifics of how to implement an ECS pattern or how it works on a functional level, but rather the aspects and reasons why it is considered by many to be vastly superior to an OOP design. This requires a view of the entire system, from the software layer right down to the hardware considerations.
One common argument in favor of ECS is memory layout and the improvement in cache-miss rates - and therefore performance - for both data and instructions, as component-based grouping of data and system-based processing of those components allow for more efficient access to consecutive data.
Example: x entities with components A through Z. Why is an ECS design like

    A[x]; B[x]; ... Z[x];

considered more efficient than

    (AB..Z)[x];

..?
In the first case a single read could cache one component for (potentially) multiple entities at once, but depending on your system you need multiple reads from different locations in RAM. In the second case a single read could (potentially) cache a whole object (more likely a single object requires multiple consecutive reads). And although RAM is designed for random access, different sources claim sequential access is still a lot faster.
I do get that a system doesn't need all the data of a single object, which, as I see it, is the strongest argument on this level. This article from Intel's developer zone includes an illustration of the problem of fragmented RAM under a class-based view of the system. But if you change from a class-based to a per-byte view, ECS would have the same problem: the components a system requires are stored in separate arrays, so the system regularly needs to access non-consecutive memory. Isn't that the same flaw of an OOP-based memory design that is pointed out in the article, since the prefetcher needs to learn what to load next?
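To make the two layouts from the example above concrete (the names are mine):

    // "(AB..Z)[x]" - array of structures: one object per entity, all components together.
    class EntityAoS {
        float a, b, c; // ... up to z
    }

    // "A[x]; B[x]; ... Z[x]" - structure of arrays: one array per component.
    class WorldSoA {
        float[] a, b, c; // ... up to z
    }

    class SystemOverA {
        // AoS: touching only 'a' still drags b..z through the cache alongside it.
        void executeAoS(EntityAoS[] entities) {
            for (EntityAoS e : entities) {
                e.a += 0.2f;
            }
        }

        // SoA: only the 'a' array is read - one dense, prefetch-friendly stream -
        // but a system needing several components reads from several such streams.
        void executeSoA(WorldSoA world, int count) {
            for (int i = 0; i < count; ++i) {
                world.a[i] += 0.2f;
            }
        }
    }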
Edit2: addressing the comment from Quentin
Again, I am not concerned with the functional design of how a system discovers all relevant entities, but with the "decision-making" process that determines which objects are relevant.
Imagine you want to model two kinds of entities. Returning to the example from this Q&A, the first kind of entity is a ball, which simply falls in the direction of the global gravity:
    spd += global.gravity * global.deltaTime;
    pos += spd * global.deltaTime;

The second kind of entity shall be an alien ball, which is affected negatively by gravity:

    spd += -global.gravity * global.deltaTime;
    pos += spd * global.deltaTime;

Notice the -global.gravity this time. Disregarding how the example's system discovers all eligible objects: as long as the system operates on all objects that feature a speed and a position component - i.e. all that are eligible - the question boils down to what decides whether an object is eligible.
As systems are not exclusive - i.e. a system that operates on components A and B and a system that operates on components A, B and C both process the same object (if the object in question contains all three components) - how would you differentiate between different behaviours without adding additional components to all entities?
If you were to add a gravity component to the alien ball entities to handle the different gravity, the "Fall" system would still process the alien balls, as they still have the speed and position components. An "AlienFall" system, which takes speed, position and gravity components, would then process only the alien balls, applying the correct gravity, but both systems would negate each other (in terms of the speed calculation), and the result would vary based on the order in which the systems are executed.
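In code, the overlap I am worried about would look roughly like this (component and system names are mine):

    // Components
    class Speed    { float spd; }
    class Position { float pos; }
    class Gravity  { float g; }   // only alien ball entities carry this one

    // "Fall" matches every entity with Speed + Position - including the alien balls.
    class FallSystem {
        void execute(Speed s, Position p, float globalGravity, float deltaTime) {
            s.spd += globalGravity * deltaTime;
            p.pos += s.spd * deltaTime;
        }
    }

    // "AlienFall" matches Speed + Position + Gravity, so it also runs on the alien
    // balls and negates what FallSystem did to their speed; the net result depends
    // on the order in which the two systems execute.
    class AlienFallSystem {
        void execute(Speed s, Position p, Gravity g, float deltaTime) {
            s.spd += -g.g * deltaTime;
            p.pos += s.spd * deltaTime;
        }
    }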
Unifying both systems by adding a gravity component to every entity would rather contradict the ECS efficiency paradigm, as you effectively add a lot of data compared to the information it represents.
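The alternative I mean would be something like this, reusing the components from the sketch above - it works, but an entire float per entity now encodes what is really just one bit of information (normal vs. alien):

    // One unified system: every entity gets a Gravity component, set e.g. to +9.81f
    // for normal balls and -9.81f for alien balls.
    class UnifiedFallSystem {
        void execute(Speed s, Position p, Gravity g, float deltaTime) {
            s.spd += g.g * deltaTime;
            p.pos += s.spd * deltaTime;
        }
    }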