How slow was the 6502 BASIC compared to Assembly

Question

Imagine a modern computer, where let's say Python is a high level programming language and needs to be interpreted in order to execute a piece of code. You could write some code in C, compile it, which will be much closer to the actual hardware and therefore runs faster.

Now, if we take a C64, if I write a piece of code in BASIC, would it be much slower than Assembly? If so, is there any comparison I could make? Is it even possible to quantify it this way?

What do you consider "close to the hardware?" On a C64 or Apple ][ you could directly PEEK at and POKE into any address in RAM. You were basically doing pointer arithmetic and writing to addresses directly. But your programs were stored as strings which were interpreted, so it was much slower than assembly.
– user1118321, Commented Sep 28, 2019 at 23:03
What made MS-Basic (both Apple II and C64) exceptionally slow was that all the handling of numbers was done through its floating point routines. So, as soon as you had a loop running over a counter incrementing or decrementing by 1 each iteration (the most common case), the equivalent assembler code was about 1000 times faster.
– Janka, Commented Sep 29, 2019 at 0:03
@user1118321: Slight correction: It was common for 1980s BASIC implementations to store programs in memory in a binary "tokenized" format rather than raw ASCII strings. Even so, any such intermediate language was still interpreted, slowly.
– dan04, Commented Sep 29, 2019 at 1:31
@Janka And that's why you had integer variables (with % suffix) in Applesoft (Apple II Basic); after all, the first Apple II Basic ("Integer Basic") only used integer variables in the first place. Of course you had to use them, or your loops would still be slow.
– dirkt, Commented Sep 29, 2019 at 4:52
C64 BASIC isn't a particularly stellar implementation. While interpreted BASIC will always be slower than assembly language, other systems, particularly the BBC Micro, had far more efficient interpreters.
– scruss, Commented Sep 30, 2019 at 14:45

cjs · Accepted Answer · 2019-09-30 08:18:39Z

Yes, BASIC is much slower than assembly for many operations. For an easy example, try out this program on a Commodore 64 or emulator:

for i = 1024 to 2023 : poke i,peek(i) or 128 : next

You will see each character on the screen reverse, row by row, over the course of ten seconds. By contrast, the exact same routine in machine language inverts the entire screen in a fraction of a second; there's almost no perceptible gap between the first character and last character being inverted. (The source and a BASIC loader for it are appended below, if you want to see how it works or run it yourself.)

The two main issues that make it much slower are that each line of BASIC is read and interpreted before it's executed, and the data formats used by BASIC often have much higher overhead than the wider variety of formats one can use in machine language.

In some cases the latter is due to BASIC not using the most efficient formats it has available. For example, BASIC always uses floating point for the index of a for loop rather than having extra code to determine whether it could use integer variables instead. Thus, adding one to i in the code above ends up executing machine-language procedures to copy several bytes of data to the FAC (floating point accumulator), do the floating point addition, and copy it back out. This is many dozens of instructions, whereas a loop that meets the restrictions that allow integers to be used (as in the machine-language routine below) can do its math in a small handful of instructions.
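To make the "several bytes of data" concrete, here is a minimal Python sketch of the packed 5-byte floating point layout that, as far as I know, Commodore BASIC uses for numeric variables: one exponent byte (excess 128) followed by a 32-bit mantissa whose always-set top bit is replaced by the sign. The function name `to_mbf5` is made up for illustration, and the sketch truncates rather than rounds and does no range checking, so treat it as a picture of the format and not as the ROM's actual conversion code.

```python
import math


def to_mbf5(x):
    """Pack x into a 5-byte float (assumed layout: exponent, then mantissa)."""
    if x == 0:
        return bytes(5)                      # zero is stored as exponent 0
    m, e = math.frexp(abs(x))                # abs(x) == m * 2**e, 0.5 <= m < 1
    exponent = e + 128                       # excess-128 exponent byte
    mantissa = int(m * (1 << 32))            # 32-bit mantissa, top bit always 1
    mantissa &= 0x7FFFFFFF                   # drop the implied top bit...
    if x < 0:
        mantissa |= 0x80000000               # ...and reuse it as the sign bit
    return bytes([exponent]) + mantissa.to_bytes(4, "big")


if __name__ == "__main__":
    # The loop counter in the BASIC one-liner starts at 1024 and steps by 1.
    for value in (1, 1024, 10, -1):
        print(value, to_mbf5(value).hex(" "))
```

Every pass through the for loop has to unpack, add and repack two values of this shape, whereas the machine-language routine gets by with an 8-bit register and a single 8-bit increment.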

In other cases, BASIC just doesn't support at all the kind of techniques and formats you can use in assembler. As Harper points out in a comment below, unrolling the loop in the following assembly routine would save some arithmetic and several memory lookups, probably doubling the speed of the routine. That kind of optimization is something that assembler programmers can do in the right circumstances, and you can't really work at that level at all in BASIC.

Appendix

The following is a machine language routine to invert the screen on a Commodore 64 in a way similar to how it was done in BASIC above. Note that this is deliberately not optimized; it's written instead with an eye towards clarity and generality. (For example, a simple change could make this update 32 KB, rather than just 1 KB.)

All numbers in the listing are in hexadecimal (base 16). The # in front of some of them means to load that actual number itself into the A or Y register; otherwise it's loading data from the address in memory specified by that number. In the case of the [addr],Y references, it's loading a 16-bit address from addr and addr+1, adding the Y register to that value, and using the result as the memory location of the load or store. We need to do this because the Y register is only 8 bits, holding values up to only FF (255 decimal), so we need to count through 256 values four times to read and write all 1024 screen addresses. (Actually, only 1000 of them are displayed on the screen, but we do 4×256 to keep the code simple.)

00FC          addr      .equ 00fc       ; unused zero-page location
C000 A9 00    invscr:   lda #00         ; screen RAM start low byte
C002 85 FC              sta addr        ; unused zero-page location
C004 A9 04              lda #04         ; screen RAM start high byte
C006 85 FD              sta addr+1      ; unused zero-page location
C008 A0 00    nextpage: ldy #00         ; set 8-bit register Y to 0
C00A B1 FC    nextchar: lda [addr],Y    ; load character from addr + Y
C00C 09 80              ora #80         ; set bit 7 to make it inverse
C00E 91 FC              sta [addr],y    ; store modified character
C010 C8                 iny             ; increment Y
C011 D0 F7              bne nextchar    ; branch back if y != 0
C013 E6 FD              inc addr+1      ; increment 16-bit screen address by 256
C015 A5 FD              lda addr+1
C017 C9 08              cmp #08         ; reached end of screen?
C019 D0 ED              bne nextpage
C01B 60                 rts
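For anyone who wants a number instead of "a fraction of a second", the listing is short enough to count cycles by hand. The Python sketch below adds up the standard documented 6502 cycle counts for each instruction in the routine. Two things are assumptions on my part: the round 1 MHz clock (a real C64 runs slightly above or below that depending on the video standard), and ignoring the cycles the video chip steals from the CPU, so the real routine will be a little slower than this estimate.

```python
# Standard 6502 cycle counts for the instructions in the listing.
# No page-crossing penalties apply: the screen base is page-aligned
# and both branches stay inside one page.
INNER_BODY = {"lda [addr],y": 5, "ora #80": 2, "sta [addr],y": 6, "iny": 2}
BNE_TAKEN, BNE_NOT_TAKEN = 3, 2
SETUP = 2 + 3 + 2 + 3            # lda #00, sta addr, lda #04, sta addr+1
PAGE_OVERHEAD = 2 + 5 + 3 + 2    # ldy #00, inc addr+1, lda addr+1, cmp #08
RTS = 6

CLOCK_HZ = 1_000_000             # assumption: round figure for a C64
BASIC_SECONDS = 10.0             # the BASIC one-liner, as quoted in the answer


def routine_cycles(pages=4):
    """Total cycles for invscr when it processes `pages` 256-byte pages."""
    body = sum(INNER_BODY.values())
    # 256 passes through the inner loop; the last bne falls through.
    inner = 256 * body + 255 * BNE_TAKEN + BNE_NOT_TAKEN
    outer_branches = (pages - 1) * BNE_TAKEN + BNE_NOT_TAKEN
    return SETUP + pages * (PAGE_OVERHEAD + inner) + outer_branches + RTS


if __name__ == "__main__":
    cycles = routine_cycles()
    seconds = cycles / CLOCK_HZ
    print(f"{cycles} cycles, about {seconds * 1000:.1f} ms")
    print(f"BASIC is roughly {BASIC_SECONDS / seconds:.0f} times slower")
```

Under those assumptions the routine takes about 18,500 cycles, or somewhere under 20 ms, which puts the ten-second BASIC loop at several hundred times slower for this particular task.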

And here's a BASIC program that will load the routine; you can run it after that with sys 49152.

10 loc=49152 : rem store the routine at $c000
20 read v: if v = -1 then end
30 poke loc,v : loc = loc + 1 : goto 20
50 data 169,0,133,252,169,4,133,253
60 data 160,0,177,252,9,128,145,252,200,208,247
70 data 230,253,165,253,201,8,208,237,96
90 data -1
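If you don't have an emulator handy, the loader and the routine can be checked with a toy model. The Python sketch below copies the DATA values into a 64 KB byte array the same way lines 20-30 do, then steps through them with a tiny interpreter that understands only the ten opcodes this routine uses. It is in no way a general 6502 emulator: it tracks only the zero flag, and the names `load` and `run` are my own.

```python
# Decimal bytes exactly as they appear in the loader's DATA lines.
DATA = [169, 0, 133, 252, 169, 4, 133, 253,
        160, 0, 177, 252, 9, 128, 145, 252, 200, 208, 247,
        230, 253, 165, 253, 201, 8, 208, 237, 96]
LOAD_ADDR = 49152                      # $C000, the value of loc in line 10


def load(mem):
    """Mimic lines 20-30: POKE each value, then advance loc."""
    loc = LOAD_ADDR
    for v in DATA:
        mem[loc] = v
        loc += 1


def run(mem, pc):
    """Execute the routine's ten opcodes until RTS (zero flag only)."""
    a = y = 0
    zero = False
    while True:
        op, arg = mem[pc], mem[pc + 1]
        pc += 1 if op in (0xC8, 0x60) else 2
        if op == 0xA9:                 # lda #imm
            a = arg; zero = a == 0
        elif op == 0xA0:               # ldy #imm
            y = arg; zero = y == 0
        elif op == 0x85:               # sta zp
            mem[arg] = a
        elif op == 0xA5:               # lda zp
            a = mem[arg]; zero = a == 0
        elif op == 0xE6:               # inc zp
            mem[arg] = (mem[arg] + 1) & 0xFF; zero = mem[arg] == 0
        elif op in (0xB1, 0x91):       # lda [addr],y / sta [addr],y
            target = ((mem[arg] | (mem[(arg + 1) & 0xFF] << 8)) + y) & 0xFFFF
            if op == 0xB1:
                a = mem[target]; zero = a == 0
            else:
                mem[target] = a
        elif op == 0x09:               # ora #imm
            a |= arg; zero = a == 0
        elif op == 0xC9:               # cmp #imm (only equality matters here)
            zero = a == arg
        elif op == 0xC8:               # iny
            y = (y + 1) & 0xFF; zero = y == 0
        elif op == 0xD0:               # bne rel (signed 8-bit offset)
            if not zero:
                pc += arg - 256 if arg >= 128 else arg
        elif op == 0x60:               # rts
            return
        else:
            raise ValueError(f"unsupported opcode {op:#04x}")


if __name__ == "__main__":
    memory = bytearray(65536)
    load(memory)
    run(memory, LOAD_ADDR)             # the equivalent of sys 49152
    print(all(b & 0x80 for b in memory[0x0400:0x0800]))
```

Reading the DATA values next to the hex column of the listing also shows that they are the same bytes, just written in decimal: 169 is A9, 133 is 85, 252 is FC, and so on.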

And that's not even a particularly fast way to invert the screen. I can see at least a couple of optimisations.
– JeremyP, Commented Sep 29, 2019 at 17:55
@JeremyP Oh yes, there's a lot, saving clocks as well as size, starting with the elimination of unneeded execution. But then again, this isn't a programming context, but an example to prove a point (maybe a bit too excessive by including a loader). It serves its purpose.
– Raffzahn, Commented Sep 29, 2019 at 18:17
Yeah, given that there's only four pages, I make one pass 00-FF and repeat the LDA/ORA/STA code 4x, with absolute,y instead of (indirect),y. But any implementation will be beyond seeing.
– Harper - Reinstate Monica, Commented Sep 29, 2019 at 18:27
@CurtJ.Sampson I don't think including the loader is excessive, it means that anybody with a C64 emulator can type it in and try it.
– JeremyP, Commented Sep 30, 2019 at 9:27
I find it curious in retrospect that nobody published a short machine-language routine equivalent to the Macintosh "StuffHex" toolbox call, thus allowing a VIC-20 programmer to write e.g. SYSSQ,7168,"3C42A581A599423C" to store a smiley-face character to RAM at address 7168. A lot of programs were slow to start up as a result of READing and POKEing bytes separately; a stuffhex routine would have made code faster and more compact.
– supercat, Commented Sep 30, 2019 at 14:59

Chromatix · Accepted Answer · 2019-09-29 00:36:37Z

Most implementations of BASIC for 8-bit home computers were interpreters, and in that sense they're similar to the standard versions of Python. You could typically expect simple programs to run 100 times slower in BASIC than in assembly of ordinary quality.

However, it would normally take much less time to write that program in BASIC than in assembly. For that reason, some commercial games were still written in BASIC, if the full performance of the machine wasn't needed and thus the cost of production mattered more.

Another benefit of BASIC was that the bytecode was pretty compact. Most programs would be 2x-5x smaller when written in BASIC vs. assembler, which was quite a big difference with the small RAMs back then.
– jpa, Commented Sep 29, 2019 at 15:45
@jpa But one could also use bytecode with assembly, by writing a little bytecode interpreter in assembly. This technique was regularly used when compact size was more important than speed. I can see some BASIC programs being 2-5x more compact, but not most of them, especially since the BASIC routines were usually available for use by assembler programs as well. (E.g., JSR $ABF9 to use the BASIC INPUT command from assembly.)
– cjs, Commented Sep 29, 2019 at 17:08
@CurtJ.Sampson True enough. I guess my assembly programs were always on the beginner level.
– jpa, Commented Sep 29, 2019 at 19:21
And note that for most programs performance is of little concern for most of the program. Back in the old days I wrote various programs that used assembly for small things done frequently even though the rest of the program was in BASIC.
– Loren Pechtel, Commented Oct 1, 2019 at 4:31
@Loren Which funnily enough is how much of Python code works today as well. Write most of the code in Python and call out to C code for the performance sensitive parts (i.e. usually the Python runtime or specific libraries such as numpy).
– Voo, Commented Oct 1, 2019 at 12:12


Raffzahn · Accepted Answer · 2019-09-29 11:20:51Z

if I write a piece of code in Basic, would it be much slower than Assembly?

Well, it's interpreted. So even though it's a simple language, it'll never reach native speed - not even coming close.

If so, is there any comparison I could make?

For the most part, it's like Python vs. assembler on a PC (*1). Except of course, BASIC is a far less comfortable language than Python, with far less built-in functionality, so it usually ends up needing more source code, all of which has to be interpreted, to do the same job. And it's the interpretation that makes it slow (*2).

Is it even possible to quantify it this way?

Simply no. Any quantification can only be done in relation to a concrete task, as it depends heavily on the

Interpreter used
Algorithms used
Functions used
Task selected
Implementation thereof.

Real world examples will range from BASIC being 100 times slower (e.g. when doing bit level graphics) to almost as fast as Assembly (e.g. when doing nothing but floating point math). Trying to tie it to single constructs (and examples) in either language will be like judging a natural language by a single word: useless for a generalized observation. Not to mention 'good' BASIC coding vs. 'bad' Assembly.

Just picking a CPU or its assembler won't give any relation - except that BASIC will never be faster.

*1 - Assembler should be at least as fast as a C binary for the same problem.

*2 - Of course, this assumes the functions/libraries used are sufficiently well implemented.

Just between versions of BASIC there were huge differences. BBC Basic on the BBC B was in places 10-20 times faster than MS Basic on a Commodore or Apple.
– tofro, Commented Sep 29, 2019 at 10:30
Additional note for "*1": Assembler written to be well understood by humans has a good chance to be slower than the same algorithm written in C and compiled by a good compiler. Even if the C source is written in a good style.
– the busybee, Commented Sep 29, 2019 at 13:29
@thebusybee That, I'd say, still needs some support added, as I can not see any reason why Assembly code would be slowed down due to being readable.
– Raffzahn, Commented Sep 29, 2019 at 14:38
Well, because assembly programmers tend to write less heavily optimized code while trying to keep it understandable. OTOH a compiler can analyze the life time of variables, fold and extract constants, unroll loops, inline functions, and so much more. The resulting code still does the same but is not straightforwardly understandable for humans. In this sense a compiler is an extremely trained expert with a perfect memory, most disciplined, and with the knowledge of many years of professional work. I still have to see such a human programmer. ;-)
– the busybee, Commented Sep 29, 2019 at 15:59
Well, I was thinking for example about re-use of registers, selection of the best combination (variable <-> register), including application-wide "register colouring", re-ordering of instructions, and so on. Brought to the max this will lower the understandability of code, in my experience. Introducing a lot of macros will generate a steeper learning curve for new team members. All of this is hidden behind the scene with a good compiler. But we are discussing opinions, based on experience. And I have my share by 30+ years in assembler and C for embedded and safety critical systems.
– the busybee, Commented Sep 29, 2019 at 18:31

Community · Accepted Answer · 2020-06-18 08:29:59Z

[Modern Python compared to C; C64 BASIC compared to assembly.]

is there any comparison I could make? Is it even possible to quantify it this way?

Yes, you have the right idea. That is exactly the comparison you can make.

BASIC was easier to write (don't underestimate the value of that), but "slower" to "dreadfully slower", depending on what you were doing.

Speed: BASIC vs assembler

Obviously, everything is worlds slower. Especially operations the CPU just can't do, like divide or compute a cosine. But there are more gotchas.

The unpredictability of the duration of certain operations. For instance sin(x) was very quick if x=90, otherwise not.
The dreaded "garbage collection", where the system runs out of clear memory space to allocate for variable-length records like strings, and "defrags" RAM by repacking all existing strings to the bottom of free RAM. I have seen garbage collections take 3 seconds. That's an eternity in game time.
Even with tokenization (go Woz!), the language isn't very compact. That mattered on ROM cartridges, where space was money.
Outside of Apple Integer Basic, BASICs had bugs. And a future revision of the system could add bugs. Assembler had precious few and they were well-known.
If the CPU had to do extremely timing-sensitive tasks, like soft-listen to serial ports, modulate the cassette deck, make sound without a sound chip, or especially beam-riding for sprite manipulation such as in games, it was impossible in BASIC and you had to go assembler. Even the entry/exit from assembler to BASIC was too time-consuming to use in action games, unless you did it once per field.

Our Speed metaphor: modern vs RC

Because modern machines have loads of RAM and code space, code optimization in interpreted languages is much, much better today. They're still never as fast as C, but they do much better than you'd expect, whereas you could always count on BASIC to be balefully slower than assembly. Advantage: modern.
Modern processors include complex math. This means that all the gory complexities of, say, a double-wide floating point divide are done by hardware at the same speed, whether we're calling from Python or C. Older processors did complex math in software, which was a heck of a motivation for assembler programmers to bypass the need for complex math. In BASIC you would ask for a cosine and wait, wait, wait... In assembler you wouldn't even try cosine, you'd just find a way to make a lookup table work. Advantage: RC.
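For readers who haven't met the lookup-table trick, here is a rough Python sketch of the idea. The specifics are my own choices for illustration and not something taken from a particular game: angles are measured in 256 steps per full circle so that an 8-bit index wraps around for free, results are scaled to the range -127..127 so each entry fits in one byte, and cosine is read from the sine table with a quarter-turn offset. On the real machine the table would be precomputed and stored as 256 bytes of data.

```python
import math

SCALE = 127          # assumption: results scaled to fit a signed byte
STEPS = 256          # assumption: 256 angle steps per full circle

# Built once, up front. The 8-bit program only ever indexes into it.
SINE_TABLE = [round(SCALE * math.sin(2 * math.pi * i / STEPS))
              for i in range(STEPS)]


def sin8(angle):
    """Sine of an angle given in 1/256ths of a circle, scaled by 127."""
    return SINE_TABLE[angle & 0xFF]


def cos8(angle):
    """Cosine is the same table read a quarter turn (64 steps) ahead."""
    return SINE_TABLE[(angle + 64) & 0xFF]


if __name__ == "__main__":
    for angle in (0, 32, 64, 128, 192):
        print(angle, sin8(angle), cos8(angle))
```

Each lookup is one indexed load in assembler, compared with the long software floating point routine that BASIC's COS has to run every time.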

As for quantifying, that is a wet noodle.

A safe bet is it will be 10-100 times slower. It's very difficult to quantify more precisely, unless you have a lot of experience timing both sides' operations.

And in the pro world, that's exactly what we did. We put a quickie version up in BASIC and assembler, wrapped iteration code around it, and got out the stopwatch. When we were serious, we also timed how long the iteration code itself took with no payload.

hotpaw2 · Accepted Answer · 2019-10-02 05:47:53Z

The Byte Sieve benchmark in Applesoft Basic took 2806 seconds, according to Byte Magazine, September 1981 issue, page 192. Byte Sieve in 6502 Assembly language took 13.9 seconds, according to Byte Magazine, January 1983 issue, page 292.

That's a factor of 200X between a tokenizing Basic interpreter and hand-coded assembly for the 6502.
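The arithmetic behind that factor, as a short Python check using only the two timings quoted above (the constant names are mine):

```python
APPLESOFT_SECONDS = 2806.0   # Byte Sieve in Applesoft Basic, as quoted
ASSEMBLY_SECONDS = 13.9      # Byte Sieve in 6502 assembly, as quoted

ratio = APPLESOFT_SECONDS / ASSEMBLY_SECONDS

if __name__ == "__main__":
    print(f"Applesoft is about {ratio:.0f} times slower on this benchmark")
```

That works out to a little over 200, so "200X" is a fair rounding.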

200X is in about the right ballpark, as various other Basic interpreters that I've benchmarked (including my own Chipmunk Basic) on a bunch of different CPUs range from 40X to 500X slower than Asm or C code running the equivalent algorithm. Modern REPL language systems run faster by including a JIT compiler to machine code, or tokens for a fast VM.

I read the article and program in BASIC and 6502 for C64 using VICE emulator.
– alvalongo, Commented Oct 29, 2019 at 22:20

trash-80 · Accepted Answer · 2019-09-30 03:24:54Z

I've had experience with the TRS-80, and there were three programs I wanted to write for which I simply could not get good performance in BASIC. All three programs dealt with the screen.

The first program was to fill the screen with a single arbitrary character (if you used space, it's the same as clear screen, else I could fill it with whatever character I wanted). The naïve way is to just make a for loop for all 1024 characters and print them. This actually took around 6-8 seconds to fill the screen.

A faster way is possibly to generate a fairly long string and print it several times in quick succession. I think this still would take 2 seconds to fill.

This was not acceptable, so I decided to program them in assembly/machine code. The resultant code I recall poking into a string variable (very common at the time to prevent memory clashes as there is no mmu) and the screen filled so fast that I could not time it. From memory I did do some repeated filling and timed those, and recall them to be around 1/20 of a second or faster to fill the display.

The next problem I wanted to do was fill the screen with RANDOM characters. This took almost a minute to fill the display, if I remember correctly and the string optimization doesn't help. However I did cheat in machine code and used the refresh counter as the RNG which may not be as "random" as the PRNG, but the machine code was fast enough to make my screen look like snow when called in quick succession, probably also around 1/20 of a second to fill.

The third problem is that despite the poor graphics of the TRS-80 with its 6-block, I wanted to save/restore "bitmap" graphics. If I recall correctly, the BASIC program that peek/poked into video memory and wrote/read from disk took upwards of 5 minutes to save 1KB of memory handled a byte at a time. OS calls were expensive on the TRS-80 I suppose. I did not try string optimization at the time to save on OS calls, but did write the code in assembly... which took about 5-10 seconds to complete.

Today, these code sequences look kind of childish and "simplistic". Code sequences to hash a key to access a piece of dynamic data in memory would take eons to code in assembly, not to mention how error prone it was (I've crashed my TRS-80 many times getting the machine code right.) Most of the time coding in assembly simply would not give you the time to market, and sometimes won't even give you performance: keep in mind that a well coded program in an interpreted language, with proper support libraries that implement the functions you need well, will give you fairly good performance with no risk of buffer overflows.

Note that the function library bonus didn't exist as much then as it does today. BASIC pretty much maps one to one and you don't gain much speed improvement calling its functions - though one can say that if you had a program that had to continually run floating point transcendental functions (sin, cos, x^y, etc.) you may not see as much of a difference between assembly and BASIC as computing floating point natural logarithms in assembly is not much faster than calling ln() in BASIC.

Now imagine if BASIC back then had also had matrix multiplies, hash table functionality, etc. as built-in functions. You can see how the performance gap between BASIC and assembly would shrink as more compute time is spent in the complex functions instead of parsing code.

And this issue still applies today when writing Python, Perl or Java. If you want performance, there's no excuse not to use the dedicated functions; otherwise you'll be back to BASIC speed problems, just as if you were to write x^y with a for loop instead of just using the builtin.
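The x^y example is easy to try in Python. The sketch below is only an illustration with a made-up function name: `power_loop` does the multiplication one interpreted step at a time, while `x ** y` hands the whole job to the interpreter's built-in implementation. Actual timings depend on the machine and the Python version, so none are quoted here.

```python
import timeit


def power_loop(x, y):
    """Raise x to a non-negative integer power y, one multiply per step."""
    result = 1
    for _ in range(y):
        result *= x
    return result


if __name__ == "__main__":
    x, y = 3, 200
    assert power_loop(x, y) == x ** y
    loop_time = timeit.timeit(lambda: power_loop(x, y), number=10_000)
    builtin_time = timeit.timeit(lambda: x ** y, number=10_000)
    print(f"loop: {loop_time:.4f} s, builtin: {builtin_time:.4f} s")
```

Both give the same answer; the difference is how many times the interpreter has to go around its own dispatch loop to get there.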
