As far as I know, the 8086 instruction set has three kinds of data-movement instructions:

  1. memory to register
  2. register to memory
  3. register to register

However, yesterday I found instructions such as movsb and outsb.
With those instructions, memory-to-memory (M2M) operations are possible!

Now I'm curious why M2M instructions exist.
I found that there are many restrictions on using them.

  • Those instructions consume a lot of CPU cycles,
  • Those instructions require using segment registers as operands.

And those M2M operations can also be performed by combining instructions of types 1, 2 and 3 above.

Question:
I find it hard to justify the existence of those M2M instructions.
Do they exist only to make assembly code shorter?

  • The architects reasoned that some common higher-level functions could be supported in hardware. So movsb is there to implement memcpy and friends, and outsb is very helpful when doing I/O to dumb devices lacking sufficient helper hardware (which was the usual case in those days). Of course, it was not an original idea: before the advent of the RISC movement, this sort of instruction was rather common everywhere (apart from really small 8-bit "micros"). Commented Apr 8, 2019 at 7:34
  • Not only does the 8088/8086 have a memory move, the 8237 DMA chip from the days of the early PCs includes a memory-to-memory transfer feature, using channel 0 for the source and channel 1 for the destination. Commented Apr 8, 2019 at 10:41
  • @rcgldr Oh wow, I didn't know that. Did anybody actually use this feature? Commented Apr 8, 2019 at 15:17
  • @fuz - I didn't use it. I seem to recall it being tested at a company I worked for, maybe benchmarking DMA memory moves and/or CPU memory moves on various desktops back in the 1980s, to get an idea of memory bandwidth. Commented Apr 8, 2019 at 15:55
  • All memory operands have a segment register, whether explicitly specified or implicit. Commented Apr 9, 2019 at 0:00

1 Answer

The movs* and cmps* instructions are quite handy as they let you perform such common tasks as copying data and comparing data.

The ins* and outs* instructions are similar in nature to movs*: they simply move data between memory and I/O devices. They are especially helpful for reading or writing a disk in complete sectors (typically 512 bytes). Of course, DMA obliterates these, since DMA-based I/O is even more efficient, but back in the day DMA controllers weren't as common as they are today.

Simulating these instructions, especially their repeated forms (look up the rep prefix), would've required more code and would've been slower. Hence their existence.
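To make the comparison concrete, here is a minimal C sketch that models what a single rep movsb does and what the equivalent sequence of separate loads and stores looks like. It is only a model of the semantics within one 64 KiB segment, not real 8086 code; the StringRegs struct and both function names are made up for illustration, and the assembly in the comments is an approximate hand translation.

```c
#include <stdint.h>

/* Hypothetical model of the registers that rep movsb uses implicitly. */
typedef struct {
    uint16_t si;  /* source index */
    uint16_t di;  /* destination index */
    uint16_t cx;  /* repeat count */
    int df;       /* direction flag: 0 = ascending, 1 = descending */
} StringRegs;

/* Model of `rep movsb`; mem must point to a 64 KiB buffer (one segment). */
void rep_movsb(uint8_t *mem, StringRegs *r)
{
    while (r->cx != 0) {
        mem[r->di] = mem[r->si];  /* movsb: one byte, memory to memory */
        if (r->df) {
            r->si--;
            r->di--;
        } else {
            r->si++;
            r->di++;
        }
        r->cx--;
    }
}

/* The same forward copy using only "memory to register" and
   "register to memory" moves, one explicit step per instruction. */
void copy_with_loads_and_stores(uint8_t *mem, uint16_t src,
                                uint16_t dst, uint16_t count)
{
    while (count != 0) {
        uint8_t al = mem[src];  /* mov al, [si] */
        mem[dst] = al;          /* mov [di], al */
        src++;                  /* inc si */
        dst++;                  /* inc di */
        count--;                /* dec cx, then jump back if not zero */
    }
}
```

The second function spells out as separate steps everything the single repeated string instruction does on its own, which is the extra code (and extra instruction fetches) referred to above.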

Btw, the xchg instruction and any other read-modify-write instruction (e.g. add) with the destination in memory are also effectively memory-to-memory instructions. Not all CPUs have these; many mainly offer instructions that either read from memory or write to memory but not both (the exception being the instructions used to implement exclusive/atomic access to memory, think xchg, xadd, cmpxchg8b/16b). CPUs with such instruction sets belong to the so-called load-store architectures.

Also, the push and pop instructions may have their explicit operand designate a memory location. Since the stack itself lives in memory, that's another form of memory-to-memory instruction.

As for segments, nearly all instructions that read or write memory involve segments (some system instructions work differently), so segment management and its overhead are not something you could avoid by dropping the instructions you mention and using other instructions instead.
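As an illustration of why segments are unavoidable, here is a small C sketch of the 8086 real-mode address calculation that applies to every memory operand: the 16-bit segment is shifted left by four bits and added to the 16-bit offset, and the result is limited to the 8086's 20 address lines. The function name is hypothetical.

```c
#include <stdint.h>

/* 8086 real-mode physical address: segment * 16 + offset,
   truncated to 20 bits because the 8086 has 20 address lines. */
uint32_t physical_address(uint16_t segment, uint16_t offset)
{
    return (((uint32_t)segment << 4) + offset) & 0xFFFFFu;
}
```

A plain mov al, [si] goes through this calculation with DS:SI by default, just as movsb does for its DS:SI source and ES:DI destination.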

4 Comments

Related: What x86 instructions take two (or more) memory operands? lists instructions with 2 separate memory operands, not RMW of the same operand.
A few load-store ISAs do also have an atomic swap (e.g. 32-bit ARM's deprecated swp instruction) and/or CAS, but the majority of RISC ISAs only support Load-Linked / Store-Conditional so lock add translates to a retry loop. (Only compare_exchange_weak doesn't require a loop, because it's allowed to fail on contention, not just data differences).
LL/SC has advantages for interrupt latency, because it basically makes an atomic transaction abortable. You can implement any atomic operation (on one location), but you can do the same thing with a CAS retry loop. LL/SC does make it easier to avoid the ABA problem... Anyway, getting off topic here. LL/SC is still 2 separate instructions, each of which only loads or only stores. But they are linked together to commit as a RMW transaction, so yes they solve the same problem as 8086 xchg or later x86 lock cmpxchg / lock add / lock xadd / etc.
Segments: the string instructions write es:di, making it easy to work with 2 segments at once without bloating the code with segment-override prefixes. Code-size was critical on 8086 for performance.
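
The CAS retry loop mentioned in the comments above can be sketched with C11 atomics. This is only an illustration, and fetch_add_with_cas is a made-up helper name: it builds an atomic add out of atomic_compare_exchange_weak, which is allowed to fail spuriously and therefore sits inside a loop.

```c
#include <stdatomic.h>

/* Hypothetical helper: atomic add built from a compare-and-swap
   retry loop. Returns the value that was stored before the add. */
int fetch_add_with_cas(atomic_int *target, int delta)
{
    int expected = atomic_load(target);

    /* On failure, atomic_compare_exchange_weak writes the current
       value of *target into expected, so each retry recomputes
       expected + delta from fresh data. */
    while (!atomic_compare_exchange_weak(target, &expected,
                                         expected + delta)) {
        /* retry */
    }
    return expected;
}
```

In real code atomic_fetch_add does this directly. On x86 a compiler will typically turn that into a single lock-prefixed read-modify-write instruction, while on an LL/SC machine it typically becomes a retry loop much like the one written out here.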
