Revisions to How does Terraria generate such large worlds initially?

Correcting a misplaced "MB" where it should be counting tiles, not megabytes

edited Sep 9, 2021 at 13:19

A couple of additions to Steven's informative answer, at a technical / processing level. I'm speaking here in a general sense about the techniques used to generate large worlds, not about Terraria specifically.

Generating the initial space

Firstly, there are fast block fills and copies (memset / memcpy in C) used to initialise the memory (a 2D array) representing the world. The entire initial world can consist of zeroes, or of structures all of whose members are set to zero. This could represent either sky or ground. Let's assume the whole map starts as sky, so zeroes represent empty space. This is a very fast operation. If the entire 20 million tile map uses 4 bytes per tile, that is 80MB: large, yes, but not ridiculously so by any modern measure. Expect zeroing that much main memory (RAM) to take on the order of milliseconds.
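As a rough illustration of that first step, here is a minimal sketch in C. The 8400 x 2400 dimensions, the 4-byte `Tile` type and the function names are my own assumptions for the example, not anything taken from Terraria.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed dimensions: 8400 x 2400 is roughly the 20 million tiles discussed. */
enum { WORLD_W = 8400, WORLD_H = 2400 };

/* Assumed layout: 4 bytes per tile, where 0 means "sky / empty". */
typedef uint32_t Tile;

/* Allocate the whole map zero-filled (about 80MB). calloc returns zeroed
   memory, so no separate memset pass is needed. Returns NULL on failure. */
Tile *world_alloc(void)
{
    return calloc((size_t)WORLD_W * WORLD_H, sizeof(Tile));
}

/* Tiles are stored row by row, so whole rows are contiguous in memory. */
static size_t tile_index(int x, int y)
{
    return (size_t)y * WORLD_W + (size_t)x;
}

Tile world_get(const Tile *world, int x, int y)
{
    return world[tile_index(x, y)];
}

/* Fill rows [y0, y1) with one tile value. memset can only write a repeated
   byte, so for a non-zero 4-byte tile we use a plain loop over the
   contiguous range instead. */
void world_fill_rows(Tile *world, int y0, int y1, Tile value)
{
    assert(0 <= y0 && y0 <= y1 && y1 <= WORLD_H);
    size_t begin = tile_index(0, y0);
    size_t end = tile_index(0, y1);
    for (size_t i = begin; i < end; ++i)
        world[i] = value;
}
```

With this, turning the lower half of the map into a hypothetical stone id 1 would be `world_fill_rows(world, 1200, WORLD_H, 1)`.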

Terrain can be rapidly written (& read) in a couple of ways.

A memset, or a series of memsets, on a 2D array, most of which resides in main memory (RAM), plus regular array access to read; you could even chunk this into many smaller arrays to operate on smaller areas. Still, this is not particularly efficient; see below.
Instead, the whole world may be represented as RLE (run-length encoded) columns, running from top to bottom. This makes bulk access much faster, since an entire column 2400 tiles high can be represented as nothing but a min value (2 bytes), a max value (2 bytes), and the tile type (typically 1 byte) = 5 bytes total. Less data = less time to access it. If you have earth (basic stone) and sky in the same column, that's 2 such entries = 10 bytes. For the whole map then, before you have started making elaborate details, that's 10 bytes * 8400 columns = 84000 bytes, or about 84kB. That's minuscule: a little larger than a typical L1 data cache (usually tens of kB), but it fits comfortably into L2 cache on pretty much all systems, making access super fast. Now let's imagine you have at least 50 different terrain variations (types) per column; then that is 50 * 84kB = roughly 4.2MB. Still very small.

Data access speeds improve enormously when a dataset becomes small enough to fit into the CPU caches, as with RLE, rather than living in RAM like our 80MB 2D array representation.

Generating detail

We could read from our RLE-encoded world map and temporarily copy an area into a small 2D array (for example 100x100) to create a pyramid, a small dungeon, a village etc. Once the work is done, we copy the area back into the RLE structure using an optimised function we've built for that purpose. What this means is that 99%+ of the entire world space is ignored while we elaborate in just that small area. As the world is gradually generated, we probably touch less than 20% of the entire world space.
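The read / edit / write-back cycle could be sketched like this in C. It is a deliberately simple version under my own assumptions (fixed-capacity columns, hypothetical names, whole-column decode and re-encode), not the optimised function a real generator would use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { WORLD_H = 2400, MAX_RUNS = 64, PATCH = 100 };  /* assumed limits */

typedef struct { uint16_t min_y, max_y; uint8_t type; } Run;
typedef struct { Run runs[MAX_RUNS]; size_t count; } Column;

/* Expand one column into a flat per-tile buffer; uncovered tiles stay 0. */
static void column_decode(const Column *col, uint8_t out[WORLD_H])
{
    memset(out, 0, WORLD_H);
    for (size_t i = 0; i < col->count; ++i)
        for (int y = col->runs[i].min_y; y <= col->runs[i].max_y; ++y)
            out[y] = col->runs[i].type;
}

/* Compress a flat buffer back into runs. Returns 0 and leaves the column
   unchanged if it would need more than MAX_RUNS runs. */
static int column_encode(Column *col, const uint8_t in[WORLD_H])
{
    Column tmp;
    size_t n = 0;
    int start = 0;
    for (int y = 1; y <= WORLD_H; ++y) {
        if (y == WORLD_H || in[y] != in[start]) {
            if (n == MAX_RUNS)
                return 0;
            tmp.runs[n++] = (Run){ (uint16_t)start, (uint16_t)(y - 1), in[start] };
            start = y;
        }
    }
    tmp.count = n;
    *col = tmp;
    return 1;
}

/* Copy a PATCH x PATCH window (columns x0.., heights y0..) into
   scratch[dx][dy]. The caller ensures x0 + PATCH columns exist. */
void patch_read(const Column *world, int x0, int y0,
                uint8_t scratch[PATCH][PATCH])
{
    uint8_t flat[WORLD_H];
    assert(y0 >= 0 && y0 + PATCH <= WORLD_H);
    for (int dx = 0; dx < PATCH; ++dx) {
        column_decode(&world[x0 + dx], flat);
        memcpy(scratch[dx], &flat[y0], PATCH);
    }
}

/* Write the edited window back, re-encoding only the columns it spans. */
int patch_write(Column *world, int x0, int y0,
                uint8_t scratch[PATCH][PATCH])
{
    uint8_t flat[WORLD_H];
    assert(y0 >= 0 && y0 + PATCH <= WORLD_H);
    for (int dx = 0; dx < PATCH; ++dx) {
        column_decode(&world[x0 + dx], flat);
        memcpy(&flat[y0], scratch[dx], PATCH);
        if (!column_encode(&world[x0 + dx], flat))
            return 0;
    }
    return 1;
}
```

Only the 100 columns under the window are ever decoded; every other column in the world is left alone, which is the point of working on a small scratch area.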

The general principle in such games / worlds is...

"Don't explicitly specify (in memory) what can be safely assumed."

This is the principle which RLE (and other spatial subdivision / compression techniques) rely on.

added 1 character in body

edited Sep 9, 2021 at 12:34

Engineer

added 16 characters in body

edited Sep 3, 2021 at 15:42

Engineer
