While viewing the file AS.EXE, I stumbled upon the following list of instruction mnemonics for the 8086 processor:

ja jb jc je jg jl in jo jp js jz aaa daa aad adc add dec bge aam jae jbe lea clc bhi ble cmc cld bne aas jge sbb beq das cli jna neg jnb inc jhi jnc jle esc cbw blo bgt jne cwd jpe jng jeq sal cmp rcl blt nil lds div les jnl jlo jgt sar rep rcr jno jmp jnp hlt jpo jlt sub stc std ret jns int rol nop mul pop sti mov jnz ror out lahf call jnae jnbe sahf jnge bhis lock jnle xchg scas repe idiv jhis blos lods cmps iret wait popf imul jlos xlat loop into jcxz test push repz movs stos buncd scasb juncd lodsb cmpsb repne xlatb loope scasw pushf movsb lodsw cmpsw stosb repnz loopz movsw stosw loopne loopnz 

What immediately struck me is the large number of mnemonics.
After close inspection I compiled the following table for the supplemental mnemonics:

Enc   Intel mnemonics   Manx mnemonics   Description
70h   jo                -                -
71h   jno               -                -
72h   jb, jnae, jc      blo, jlo         lower
73h   jnb, jae, jnc     bhis, jhis       higher or same
74h   je, jz            beq, jeq         equal
75h   jne, jnz          bne              not equal
76h   jbe, jna          blos, jlos       lower or same
77h   jnbe, ja          bhi, jhi         higher
78h   js                -                -
79h   jns               -                -
7Ah   jp, jpe           -                -
7Bh   jnp, jpo          -                -
7Ch   jl, jnge          blt, jlt         less than
7Dh   jnl, jge          bge              greater or equal
7Eh   jle, jng          ble              less or equal
7Fh   jnle, jg          bgt, jgt         greater than
EBh   jmp               buncd, juncd     unconditionally
-     -                 nil              -
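
To make the aliasing in the table easier to check, here is a minimal Python sketch that restates it as a lookup from mnemonic to short-jump opcode. The names `ALIASES`, `OPCODE_OF` and `same_encoding` are made up for illustration and are not part of AS.EXE; the data is only what the table above records for the short (8-bit displacement) forms.

```python
# Short-jump opcode -> all mnemonics observed for it (Intel and Manx names),
# transcribed from the table above. Names here are illustrative only.
ALIASES = {
    0x70: ["jo"],
    0x71: ["jno"],
    0x72: ["jb", "jnae", "jc", "blo", "jlo"],
    0x73: ["jnb", "jae", "jnc", "bhis", "jhis"],
    0x74: ["je", "jz", "beq", "jeq"],
    0x75: ["jne", "jnz", "bne"],
    0x76: ["jbe", "jna", "blos", "jlos"],
    0x77: ["jnbe", "ja", "bhi", "jhi"],
    0x78: ["js"],
    0x79: ["jns"],
    0x7A: ["jp", "jpe"],
    0x7B: ["jnp", "jpo"],
    0x7C: ["jl", "jnge", "blt", "jlt"],
    0x7D: ["jnl", "jge", "bge"],
    0x7E: ["jle", "jng", "ble"],
    0x7F: ["jnle", "jg", "bgt", "jgt"],
    0xEB: ["jmp", "buncd", "juncd"],  # EBh is the short form of jmp only
}

# Reverse lookup: mnemonic -> opcode byte.
OPCODE_OF = {name: op for op, names in ALIASES.items() for name in names}


def same_encoding(a: str, b: str) -> bool:
    """True if both mnemonics map to the same short-jump opcode."""
    return OPCODE_OF[a] == OPCODE_OF[b]
```

For instance, `same_encoding("bhis", "jae")` holds because both names sit in the 73h row.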

Now I wonder why the author(s) added so many extra mnemonics. I understand that one would do this to ease the transition for people coming from another architecture, such as the 68k, which does use beq, bne, bhi, blt, bge, ble, and bgt.

  • What I don't quite get is the addition of the longer-than-necessary buncd and juncd. (They could easily have picked bra instead.) Have there ever been assemblers around that already used buncd and juncd?

  • What is nil supposed to accomplish? It sits among the instructions but the assembler doesn't insert any code for it. Isn't prefixing a line with a semicolon much easier if we want to omit what is on a line or suppress anything from getting encoded?

The AS.EXE file (44448 bytes) bears the following copyright notice:

Copyright (C) 1984 by Manx Software Systems.
8086 Assembler Vers. 1.06D

The file came with an 8088 computer some 35 years ago and I never had a manual for it. Not surprisingly, I barely used the software, except today, to verify the above-mentioned table.

2 Answers

The mysterious AS.EXE is apparently the assembler distributed with the Aztec C compiler toolchain. Most of these extra mnemonics are actually pseudo-instructions providing the ability to perform conditional jumps with word displacement, as encodings for such conditional jumps did not exist before the 386. I am much less sure of the motivation for nil and buncd. The former might presumably be useful as a placeholder for situations where an instruction is expected by the grammar, but none is actually intended by the programmer (something like the Python pass keyword); but then I am somewhat doubtful that any such situations ever actually arise. The latter mnemonic, though, seems entirely redundant.

This is what the manual for version 4.10c of the toolchain has to say (page 31 in the assembler manual, numbered 206 in the PDF):

Most of the special instructions supported by as are conditional branch instructions, whose target location can be anywhere in the current code segment. The standard conditional jump instructions require that the target address be inside a small interval of code centered around the jump instruction.

When a conditional branch instruction is assembled, the equivalent jump instruction will be generated if the target of the branch can be reached by the jump instruction. Otherwise, the assembler will generate two hardware instructions for the branch: an unconditional jump to the target (which can access any location in the code segment), preceded by a conditional jump around the unconditional jump. This preceding conditional jump tests for a condition that is the opposite of the one specified by the branch instruction.

The special branch instructions and their corresponding jump instructions are:

branch jump
beq je
bne jne
blt jl
ble jle
bgt jg
bge jge
blo jb
blos jbe
bhi ja
bhis jae

The other special instructions supported by as are nil, which does nothing and which generates no code; and xlatb, which is the same as the standard xlat instruction, but which doesn’t require an operand, and which assumes that the translate table is in the segment pointed at by the DS segment register (ie, it won’t automatically output a segment override prefix).
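
The expansion the manual describes for the branch instructions can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `assemble_branch` is hypothetical, the exact byte layout (inverted short jump with displacement 3, followed by a near `jmp` with opcode E9h) is the natural 8086 encoding of the manual's description rather than something verified against AS.EXE, and a real assembler has to iterate because growing one branch shifts later addresses.

```python
import struct

# Short conditional jump opcode for each branch pseudo-instruction,
# following the branch/jump pairs listed in the manual excerpt.
JCC = {
    "beq": 0x74, "bne": 0x75, "blt": 0x7C, "ble": 0x7E, "bgt": 0x7F,
    "bge": 0x7D, "blo": 0x72, "blos": 0x76, "bhi": 0x77, "bhis": 0x73,
}


def assemble_branch(mnemonic: str, address: int, target: int) -> bytes:
    """Encode a branch located at `address` that should reach `target`."""
    opcode = JCC[mnemonic]
    # A short jump's displacement is relative to the end of its 2 bytes.
    disp8 = target - (address + 2)
    if -128 <= disp8 <= 127:
        return bytes([opcode, disp8 & 0xFF])
    # Out of range: emit the opposite condition (low opcode bit flipped)
    # jumping over a 3-byte near jmp, then the near jmp to the target.
    disp16 = (target - (address + 5)) & 0xFFFF
    return bytes([opcode ^ 1, 0x03, 0xE9]) + struct.pack("<H", disp16)
```

With these assumptions, `assemble_branch("beq", 0x100, 0x110)` stays a two-byte `je`, while `assemble_branch("beq", 0x100, 0x300)` becomes a `jne` over a near `jmp`, five bytes in total. Flipping the low opcode bit works because each 7xh condition and its negation are adjacent in the encoding.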

I was not able to locate the manual for the 1.06d version; not for the 8086 target at least.

  • Nice find! It makes sense to have that branch lot escape the short distance. Commented Sep 28, 2024 at 23:25
  • This assembler is probably the backend for the C compiler, and so an instruction like nil makes sense. The compiler might need a keyword in place of any special character where no machine code shall be generated. Commented Sep 29, 2024 at 7:03
  • @user3840170 Because in some stage the output to be generated is optimized to nothing, but the next stage requires something. Compilers can be astonishingly simple or even primitive. Commented Sep 29, 2024 at 10:41
  • @user3840170 you may need something to put a label on Commented Sep 29, 2024 at 12:41
  • @user3840170 Such old compilers go strange routes. Perhaps it cannot handle an empty instruction, not even as a comment string, just because it needs an actual mnemonic. -- Anyway, we are all speculating here, so let's stop the discussion. Without a developer of the software, or at least the source code, we will never know. -- It could as well be an equivalent to nop, simply on the pseudo-mnemonic meta level. But again, this is speculative. Commented Sep 29, 2024 at 13:47

Well, Manx Software Systems, you say? Wouldn't that be the assembler used as part of Manx's Aztec C? Aztec C was not only one of the very early microcomputer C compilers but also a cross-platform tool: it supported the 6502 (Apple II and others), the 8080 and Z80 (CP/M) and, a bit later, the 8086 and 68000. The 8086 version was available for CP/M-86 as well as PC/MS-DOS. They were quite successful until MS also offered a C compiler and Apple went for other suppliers.

Now an Assembler intended as backend for a C-compiler can gain a lot from having synonyms matching what the compiler likes to produce. Especially as the names Intel used for their mnemonics aren't as great as they may have thought (*1). Also, coming from 6502 and x80 style branch/jump names may have helped porting parts of either - not to mention the hassles of conditional jumps past +/-128 bytes.

NIL may simply be a placeholder for code where no code is needed. Like purely to mark labels, or when it got optimized away. Depending on how the C compiler handles its intermediate objects, a token that simply does nothing might be the way to distinguish it from a non-existent element, so attributes for it (like address or sequence) can still be managed.

There is even an "Official Aztec C Museum" holding much of the original documentation and many of the software versions. You might want to have a look there.


*1 - I just recently discovered a systematic issue (my own fault) with JL vs. JB in a 40 year old program after I finally understood what Intel did there :(
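
The JL vs. JB confusion mentioned in the footnote comes down to signed versus unsigned comparison of the same bit patterns: after `cmp a, b`, JB is taken when a is below b as unsigned numbers, JL when a is less than b as signed numbers. A small Python sketch for 16-bit operands (the helper names are made up for illustration and model only the comparison outcome, not the individual flags):

```python
def to_signed16(x: int) -> int:
    """Reinterpret a 16-bit pattern as a two's-complement signed value."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def jb_taken(a: int, b: int) -> bool:
    """After `cmp a, b`: JB/BLO jumps if a < b as unsigned 16-bit values."""
    return (a & 0xFFFF) < (b & 0xFFFF)


def jl_taken(a: int, b: int) -> bool:
    """After `cmp a, b`: JL/BLT jumps if a < b as signed 16-bit values."""
    return to_signed16(a) < to_signed16(b)
```

For a = FFFFh and b = 0001h the two disagree: unsigned, 65535 is not below 1, so JB falls through; signed, -1 is less than 1, so JL jumps. The Manx names "lower"/"higher" (unsigned) and "less"/"greater" (signed) spell out the same distinction.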

  • The insight that this assembler is intended to receive source generated by a compiler, and not a human, is important. These instructions are exactly what’s needed to abstract away the CPU’s limitations until enough is known to actually emit the proper machine code. Kneejerk reaction would be to implement them as macros - this is probably the most efficient way to do it 🙂 Commented Sep 29, 2024 at 7:49
