Raffzahn

I asked in Limiting factor on sprite sizes what the limiting resource is, and there were some good explanations about how memory bandwidth and the size of on-chip memory to store sprite data are both issues.

And these issues stay the same. In fact, your concept suffers even more from limited bandwidth. Storing character data (like on the C64) in a buffer just mitigates that issue - and even with a buffer the time isn't sufficient to read both line data and the resulting pixel data, thus the CPU gets cut off every 8th line (to fill the buffer).

One scan line is ~64 µs with a visible length of ~52 µs. A system using 1980s RAM chips - at least the kind that made sense in home computers - had a cycle time of ~350-450 ns, which means the maximum access rate is two per microsecond. In a 6502 system that translates to one access for the CPU and one for video. Or, if we lock out the CPU, we get two per microsecond for video.

So in a system where we let the CPU operate in parallel, this allows 52 bytes to be read per scan line, or a maximum resolution of 416 b&w pixels. Or less. The amount of data read per line is fixed, as timing is set by the memory clock. So even if we are somehow able to load a transition list for each line, a transition can only happen at byte borders. So either the resolution is reduced to 52 pixels horizontally, or sprites can only be positioned on multiples of 8. Not exactly allowing smooth horizontal movement. Further, sprite width can only be sized in multiples of 8.
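The numbers above can be redone as a small back-of-the-envelope sketch. The constants are just the rounded figures from the text, and the function names are made up for illustration; it is not a model of any particular video chip.

```python
# Rough video fetch budget for one scan line.
# All figures are the rounded ones from the text, not measurements.

VISIBLE_US = 52        # visible part of a ~64 µs scan line
ACCESSES_PER_US = 2    # ~350-450 ns RAM cycle, rounded to two accesses per µs
BITS_PER_BYTE = 8


def video_bytes_per_line(cpu_runs_in_parallel: bool) -> int:
    """Bytes the video logic can fetch during the visible part of a line."""
    # With the CPU interleaved, video only gets every other memory slot.
    video_share = 1 if cpu_runs_in_parallel else ACCESSES_PER_US
    return VISIBLE_US * video_share


def max_mono_pixels(cpu_runs_in_parallel: bool) -> int:
    """Upper bound on b&w pixels per line at 1 bit per pixel."""
    return video_bytes_per_line(cpu_runs_in_parallel) * BITS_PER_BYTE


def snap_to_byte_border(x: int) -> int:
    """A transition can only happen at a byte border, so X snaps down to a multiple of 8."""
    return x - (x % BITS_PER_BYTE)


if __name__ == "__main__":
    print(video_bytes_per_line(True), max_mono_pixels(True))    # CPU in parallel
    print(video_bytes_per_line(False), max_mono_pixels(False))  # CPU locked out
    print(snap_to_byte_border(13))                              # requested X = 13
```

A sprite asked to sit at X = 13 ends up at X = 8 here, which is the "multiples of 8" problem in a nutshell.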

Using the whole memory bandwidth for video would also just double the data rate, but not solve the basic issue - and result in either a separate video memory with slow access (see TI's TMS9918) or some ZX81-like slow mode.

To make this work on a pixel level, you need way faster memory. Roughly 8 times faster, which would be like 60 ns RAM if the CPU should run in parallel (could be an 8 MHz 6502 now), or 120 ns if the CPU gets stopped/fenced out. Neither is really a solution for that time frame.
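As a rough check of those two figures, here is a sketch that assumes one memory access per pixel position and 8 pixels per microsecond (416 pixels over ~52 µs). Both the per-pixel assumption and the function name are mine, for illustration only.

```python
# Memory cycle time needed if a transition may happen at every pixel.
# Assumption: one memory access per pixel position.

PIXELS_PER_US = 8  # 416 pixels over ~52 µs of visible line


def required_cycle_ns(cpu_runs_in_parallel: bool) -> float:
    """Cycle time in ns so that video gets one access per pixel."""
    # With the CPU interleaved, every other slot belongs to the CPU,
    # so the memory has to run twice as fast as the pixel rate.
    accesses_per_us = PIXELS_PER_US * (2 if cpu_runs_in_parallel else 1)
    return 1000 / accesses_per_us


if __name__ == "__main__":
    print(required_cycle_ns(True))   # CPU in parallel
    print(required_cycle_ns(False))  # CPU fenced out
```

This lands at 62.5 ns and 125 ns, which is where the rounded 60 ns and 120 ns come from.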

I'd say the shortfall of the idea is that you might have been thinking in pixels, whereas access is in bytes - and access time is quite limited.

No matter at what end you start to look at the idea, it all comes down to access rate. Just think, each entry in your transition table must be like 4 bytes at least (2 bytes X position and 2 bytes for sprite data start address). That's the same time as accessing data for 32 contiguous pixels. Which, of course, now can't be displayed, as the table entry has to be loaded. Having long blank areas in front of and after each sprite doesn't sound cool, does it?

And no, there is no time to load it before each line - after all, there are only ~12 µs outside the visible part, or 12 byte loads. That's barely enough to load the first transition data.
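To put numbers on it, here is a minimal sketch of what the table entries cost. It assumes the 4-byte entry from above, one video fetch per microsecond (CPU running in parallel), and that entries are fetched during the visible part of the line. The names are hypothetical.

```python
# Cost of transition table entries, in the rounded figures from the text.

VISIBLE_BYTES = 52   # video fetches in the visible part of a line
BLANKING_BYTES = 12  # ~64 µs line minus ~52 µs visible, one fetch per µs
ENTRY_BYTES = 4      # assumed entry: 2 bytes X position + 2 bytes data address


def pixels_lost_per_entry(entry_bytes: int = ENTRY_BYTES) -> int:
    """B&w pixels that could have been fetched instead of one table entry."""
    return entry_bytes * 8


def pixel_bytes_left(transitions: int, entry_bytes: int = ENTRY_BYTES) -> int:
    """Visible-line fetches left for actual pixel data after loading entries."""
    return max(0, VISIBLE_BYTES - transitions * entry_bytes)


if __name__ == "__main__":
    print(pixels_lost_per_entry())       # pixels per entry
    for n in (0, 1, 4, 13):
        print(n, pixel_bytes_left(n))    # transitions vs. bytes left for pixels
```

With 13 transitions on a line, the whole visible budget is spent on table entries and nothing is left for pixel data.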

And so on.


Grandpa story: I did design a somewhat similar video system. The focus wasn't on sprites, but on merging display buffers of arbitrary size in real time ... well, thinking of it, it sounds exactly the same :))

I solved it with really wide memory (32 bit) and special fast RAM for the transition tables and a microprogrammed engine doing the data selection and blending (well, plus some bit blitting). In the end the project got scrapped, as the hardware got way too expensive without really solving the problem.


Such a system will very soon use more memory bandwidth for management data and data manipulation than the whole picture data needs for display - which in turn is the lower time limit it would take to compose it into a buffer (as it's the memory bandwidth). And that's also the solution to the problem - and why modern systems no longer use sprites - it's more versatile to have two buffers, one where the next picture is composed and one from which the last composed one gets displayed.
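The two-buffer idea can be sketched in a few lines. This is only an illustration with a tiny made-up frame at one byte per pixel; the class and method names are hypothetical and it says nothing about how real hardware does the flip.

```python
# Minimal double-buffering sketch: compose into the back buffer,
# then swap so the finished picture becomes the displayed one.

class DoubleBuffer:
    def __init__(self, width: int, height: int) -> None:
        self.width, self.height = width, height
        self.front = bytearray(width * height)  # currently displayed
        self.back = bytearray(width * height)   # next picture being composed

    def blit(self, sprite_rows, x: int, y: int) -> None:
        """Copy sprite rows into the back buffer at any pixel position, clipped."""
        for dy, row in enumerate(sprite_rows):
            py = y + dy
            if not 0 <= py < self.height:
                continue
            for dx, value in enumerate(row):
                px = x + dx
                if 0 <= px < self.width:
                    self.back[py * self.width + px] = value

    def flip(self) -> None:
        """Show the composed picture and start the next one from a blank buffer."""
        self.front, self.back = self.back, self.front
        self.back[:] = bytes(len(self.back))


if __name__ == "__main__":
    db = DoubleBuffer(16, 4)
    db.blit([[1, 1], [1, 1]], 3, 1)  # X = 3 is not a multiple of 8, no problem here
    db.flip()
    print(list(db.front[16:32]))     # second row of the displayed picture
```

Position and size are free here because composing is no longer tied to the scan line timing; the price is the extra memory and the bandwidth for composing.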