zorobo 13 hours ago [-]
I could hold the whole 6502 instruction set (and their cycles) in my mind while programming, it was that simple.
I acquired a Z-80 SoftCard for my Apple ][ (for trying out CP/M) and was flabbergasted by the expanded register set, the complexity of some instructions (e.g. DJNZ) and the fact that it ran at 4 MHz vs 1 MHz for the 6502 (I got a Speed Demon 65C02 card later). However, I couldn't keep all the instructions and timings in my head. Speed-wise, the 1 MHz 6502 and 4 MHz Z80 were on par.
I preferred, however, the fact that I/O was memory mapped on the 6502.
zabzonk 12 hours ago [-]
A few thoughts:
> the complexity of some instructions (e.g. DJNZ)
Well, of course, the idea of DJNZ was to implement a very common pattern: decrement a register and jump (normally backwards) if the result was not zero. This tended to simplify code rather than make it more complex.
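To make that pattern concrete for readers who never wrote Z80: below is a rough Python model (not real Z80 code) of DJNZ's semantics. The one opcode replaces a separate decrement, flag test, and branch, and because B is an 8-bit register, starting with B=0 gives 256 iterations.

```python
def djnz_loop(b, body):
    """Model of the Z80 DJNZ idiom: run `body`, decrement B,
    branch back while the result is non-zero.

    B is 8-bit, so b=0 wraps to 255 on the first decrement
    and the loop runs 256 times.
    """
    count = 0
    while True:
        body()                # the loop body preceding the DJNZ
        b = (b - 1) & 0xFF    # 8-bit decrement, wraps like register B
        count += 1
        if b == 0:            # DJNZ falls through when B hits zero
            break
    return count
```

For example, `djnz_loop(3, lambda: None)` runs the body 3 times, and `djnz_loop(0, lambda: None)` runs it 256 times.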
> However I couldn't keep all instructions and timings in my head.
I was never really interested in the timings, but I did get to the stage (not by conscious memorisation) of being able to assemble and disassemble Z80 code in my head, with some accuracy.
> I preferred, however, the fact that I/O was memory mapped on the 6502.
Many (most?) Z80 systems used memory mapped I/O. It's down to the hardware designer.
paddybyers 12 hours ago [-]
> I did get to the stage (not by conscious memorisation) of being able to assemble and disassemble Z80 code in my head, with some accuracy.
Same here.
I never got any fluency using EXX and the shadow registers - there were so few situations it was worth the effort. I always felt like I must be missing something.
spc476 12 hours ago [-]
I suspect the shadow registers were there to improve interrupt handling: reserve them exclusively for the interrupt handler and you save time by not having to push registers to memory on every interrupt.
exidy 7 hours ago [-]
> Speedwise the 1MHz 6502 and 4MHz Z80 were on par.
This is a bit of an exaggeration; the 6502 was efficient, but not that efficient. It's generally understood that the Z80 took 2x-4x as many clock ticks as the 6502 to execute comparable instructions, but in the real world its larger register set meant that properly-written Z80 code could avoid expensive, slow round trips to memory.
Outside of artificial benchmarks, real-world performance shows that the 6502 is roughly 2x as efficient per clock cycle as the Z80 [0], i.e. a 1 MHz 6502 is approximately equivalent to a 2 MHz Z80.
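One concrete instance of the per-clock gap (cycle counts are from the standard datasheets; pairing these two particular instructions is my own simplified comparison):

```python
# Cycles to load the accumulator from a fixed 16-bit address.
CYCLES_6502_LDA_ABS = 4   # LDA $nnnn: opcode, two address bytes, data fetch
CYCLES_Z80_LD_A_NN = 13   # LD A,(nn): T-states, incl. 4-T-state opcode fetch

# At the same clock, the 6502 finishes this particular load ~3.25x
# sooner; keeping values in the Z80's extra registers instead of
# memory is how real Z80 code narrows that gap.
ratio = CYCLES_Z80_LD_A_NN / CYCLES_6502_LDA_ABS
```

The 2x figure cited above is a whole-program average; individual memory-touching instructions like this one can be even further apart.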
This is reflected in the computers of the day, i.e. TRS-80s were not being blown out of the water by Commodore PETs.
[0] https://github.com/soegaard/minipascal/blob/master/minipasca...
> I preferred, however, the fact that I/O was memory mapped on the 6502.
The Z80 could do memory-mapped I/O as well, of course (it was used in at least some arcade machines), but why waste valuable address space when there's an entire 64 KB of separate address space reserved for I/O? ;)
nickcw 15 hours ago [-]
The Z80 spawned the 64180, which was a Z80 with loads of stuff built in (from Wikipedia):
Execution and bus access clock rates up to 10 MHz
Memory Management Unit supporting 512K bytes of memory (one megabyte for the HD64180 packaged in a PLCC)
I/O space of 64K addresses
12 new instructions including 8 bit by 8 bit integer multiply, non-destructive AND and illegal instruction trap vector
Two channel Direct Memory Access Controller (DMAC)
Programmable wait state generator
Programmable DRAM refresh
Two channel Asynchronous Serial Communication Interface (ASCI)
Two channel 16-bit Programmable Reload Timer (PRT)
1-channel Clocked Serial I/O Port (CSI/O)
Programmable Vectored Interrupt Controller
As a consequence it was really popular in the 90s as an embedded processor, just when I was starting my career. This led to me writing thousands of lines of Z80 assembly. You could program it in C, but the compiler was useless at making stuff go fast.
One of those things I wrote was an LZ77 decompressor used in a satellite broadcast system. It took me about a week to write it, test it and optimise it. Quite a challenge! I remember optimising it around the LDIR instruction to copy memory.
The compressor was written in C and ran on the PCs of the day.
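Not the poster's code, of course, but the core of an LZ77 decompressor is small. Here is a Python sketch with a hypothetical token format: the back-reference copy loop is the part an LDIR-style block move accelerates, and note that copying byte-by-byte in ascending order (as LDIR does) naturally handles overlapping references, where a match reads bytes it has just written.

```python
def lz77_decompress(tokens):
    """Decode a list of LZ77 tokens into bytes.

    Token format (illustrative only, not the broadcast system's):
      ('lit', byte)              -- emit one literal byte
      ('copy', distance, length) -- copy `length` bytes starting
                                    `distance` back in the output
    """
    out = bytearray()
    for tok in tokens:
        if tok[0] == 'lit':
            out.append(tok[1])
        else:
            _, dist, length = tok
            src = len(out) - dist
            # Ascending byte-at-a-time copy, like LDIR: when
            # dist < length, the copy consumes its own output,
            # repeating recent bytes (run-length behaviour).
            for i in range(length):
                out.append(out[src + i])
    return bytes(out)
```

For example, a literal 'a' followed by a copy of distance 1, length 3 expands to `b'aaaa'`.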
toolslive 15 hours ago [-]
I come from the same lineage as the author. I did 6502 (doing C64 demos) long before I encountered the Z80. From what I remember, the Z80 offers a vastly superior programming experience. It has more registers. It has 16-bit registers. It has a shadow register set (you can switch between sets, which is handy for interrupt routines, for example). Programming assembly on the Z80 is just less of a fight.
vardump 14 hours ago [-]
But 6502 has 256 registers! Full ZP of them.
DonHopkins 14 hours ago [-]
And the 6502 had one mouth (A) to taste and chew with, and two hands (X and Y) to move stuff in and out of the mouth with.
uticus 13 hours ago [-]
> Less immediately visible to someone working at the assembly language level instead of the machine code one is that relative addressing is much more common on the 6809, meaning that it’s significantly more viable to write position-independent code on it than any of the other chips we’ve looked at here. Only the 8086 comes close, and it achieves it by using its segment registers as a de facto relocation base.
I would love to learn more about this. Does more "position-independent code" mean the linker has much less to do [0], or is there an actual difference in the code base for similar tasks?
[0] https://sourceware.org/binutils/docs/ld/Overview.html
In theory, the motivation for position independent code was to support the development and use of software libraries that could be "plugged in" to an application.
In practice, RAM was often limited to 16 KB; software reuse that I'm familiar with on a 6809 platform was at the source-code level and optimized by the programmer.
I remember editing and assembling, but not compiling or linking.
That said, I believe Motorola wrote some floating-point libraries.
I was a kid on a Tandy Color Computer, and the $49.95 EDTASM cartridge was a huge investment for our family. So my point of view could be way off... but the simplicity of the Color Computer with the design of the 6809 made programming delightful. (20 years later, my enjoyment in programming the Palm Pilot felt like that... although by then I could use C as a fancy macro assembler.)
Larger and later systems could use OS-9, which reasonably resembled UNIX and maybe supported a C compiler.
analog31 10 hours ago [-]
I believe that linking became important when programs got too big to compile within the memory limits of the computer. Then you had to compile portions of the code into separate object files, and then "link" those object files by reconciling the identifiers with their addresses, without having to haul everything back into the compiler at once. It also meant that portions of a program didn't have to be recompiled if they hadn't changed.
This wasn't the only way to skin the cat. Multi-pass compilers were another way.
Relocatable code could make more efficient use of memory, for instance not having to worry that your object code would end up crossing a page boundary after linkage.
whartung 9 hours ago [-]
Technically it wasn’t the linker that’s simplified, it’s the loader.
Modern systems don’t worry about PIC; they have virtual memory, so every process sees memory the same way. The virtual memory system manages the relocation automatically.
OS-9 relies pretty much entirely on PIC, which made the loader and multi-tasking easy.
The original Mac OS also relied on PIC, for similar reasons, and it’s partly why code segments were limited to 32k.
Then you have things like the original 8086. As long as you stick with the “tiny”/“small” memory models, everything was relative to the segment registers, so code and data could be moved easily.
In contrast you had systems like the Apple IIGS. The 65816 does not support PIC well, so code segments carry a relocation table that allows the segment loader to relocate code during transfer from disk. The creation of segment and relocation table is the job of the linker.
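The loader-side fixup that a relocation table implies is simple in principle. A Python model of the idea (simplified to 16-bit little-endian addresses; the real 65816/IIGS segment format differs): the linker records the offset of every absolute address operand, and the loader rebases each one by the difference between where the segment was assembled and where it actually lands.

```python
def relocate(code, reloc_offsets, assembled_base, load_base):
    """Patch 16-bit little-endian absolute addresses in `code`.

    reloc_offsets: byte offsets of each absolute address operand,
    as recorded by the linker. The segment was linked as if loaded
    at `assembled_base` but is actually loaded at `load_base`.
    """
    delta = load_base - assembled_base
    code = bytearray(code)
    for off in reloc_offsets:
        addr = code[off] | (code[off + 1] << 8)   # read old address
        addr = (addr + delta) & 0xFFFF            # rebase, 16-bit wrap
        code[off] = addr & 0xFF                   # write back, little-endian
        code[off + 1] = (addr >> 8) & 0xFF
    return bytes(code)
```

E.g. a `JMP $1000` (bytes 4C 00 10) linked at base $1000 but loaded at $2000 gets patched to `JMP $2000`. PIC avoids carrying this table at all.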
spc476 12 hours ago [-]
In general, yes, the linker (if there is one) would have less to do.
Position-independent code (PIC) on the 6809 is pretty easy [1]; it does increase the code size a bit, but the resulting code can be placed anywhere in memory with no changes and still work. As mentioned, Motorola intended to sell a ROM with IEEE-754 floating-point routines for the 6809 (the MC6839) that was PIC. As far as I could tell, they never did sell the ROM, but they did provide it (with source) for anyone to use.
[1] Relative branches instead of absolute jumps, using the index registers to address memory, as well as addressing relative to the program counter. You can still do jump tables, but instead of a list of addresses, they're just a list of relative jump instructions. That type of thing.
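A toy model of why that works: a conventional jump table stores absolute addresses that break when the code moves, while a PIC table stores each target's offset from the table itself, so every entry stays valid at any load address. (On a real 6809 the entries would be short relative-branch instructions, as described above; this Python, with made-up numbers, just models the address arithmetic.)

```python
def resolve_pic_entry(load_base, table_offset, entries, index):
    """Resolve a position-independent jump-table entry.

    entries[index] is a target's offset relative to the table
    start, so the table contains no absolute addresses: move the
    whole block (change load_base) and every target moves with it.
    """
    table_addr = load_base + table_offset   # where the table landed
    return table_addr + entries[index]      # table-relative -> absolute
```

Loading the same image at $1000 or $8000 shifts every resolved target by exactly the same amount, with no patching.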
syncsynchalt 12 hours ago [-]
On older hardware such as this it could e.g. let you write a multitasking environment that supported shared libraries without use of an MMU (though you'd hit memory constraints pretty quickly on a Z80-era cpu!).
I'm not familiar with the instruction set of the 6809, but I could also see more compact opcodes; e.g. a JMP with a relative offset can be encoded smaller than a JMP with an absolute address.
In modern terms PIC is used for ASLR and is therefore a security requirement. Some arches (I'm most familiar with arm64) are entirely designed around PIC and you need extra hoops to do anything in absolute terms.
le-mark 12 hours ago [-]
In practice it just meant the “zero page” could be anywhere, not the 256 bytes starting at 0x00 (like the 6502 zero page). The opcodes that operate on zero page are shorter and thus faster.
nicole_express 13 hours ago [-]
I find programming for the 6502 to be a joy in a way that the Z80 isn't. I'm not quite sure why that is; I guess maybe because 6502 feels so stripped down, the amount of context you have to keep in your head is extremely low?
gblargg 13 hours ago [-]
I also found the 6502 far more enjoyable. It feels like a refined, minimal design, whereas the Z-80 bolts things onto the 8080 (which I've also coded a lot for, and also found less enjoyable). There tends to be one straightforward way to code things, which is also the most efficient, barring changing the algorithm or twisting the design. Instructions tend to use one cycle per memory access, and memory accesses are mostly as expected; e.g. loading A from a 16-bit address is four cycles: the LDA opcode, the two-byte address, and the byte loaded. The Z-80 suffers in this regard because, like the x86, it uses 1-2 bytes for the bolted-on instructions and modes, so some seemingly similar instructions can take different numbers of cycles (e.g. ld hl,nn and ld ix,nn take different times, even though they both load a 16-bit value into a register).
analog31 10 hours ago [-]
I admit that I did everything with a Z80, except for using one. Because the TRS-80 used a Z80, Radio Shack carried a book on it, published by Howard Sams. It was next on the shelf after the TTL book, so of course I bought it and devoured it without ever touching a Z80.
My impression of the Z80 being clean and simple probably resulted from that book being so clearly written. It gave me a good enough understanding of how micros work, one that lasted until the more modern chips came out with things like pipelining. But I think that learning one of those old 8-bit chips would still be a great place to start for understanding things at a hardware level.
Dynamic memory refresh on chip was clever.
fredoralive 14 hours ago [-]
There's also the weird more-than-an-8080-but-not-quite-a-Z80 CPU in the Game Boy (which Sharp seems to call the SM83 when they use it in their microcontrollers).
Andrex 8 hours ago [-]
Was hoping this would be one of the chips covered.
drzaiusx11 7 hours ago [-]
As someone familiar with the Z80, 6502, and 68HC11 instruction sets, this breaks down the key differences between the "6s" and "8s" archs well. I personally learned Z80 first, then found the 6502 a bit limiting after writing Z80 for a while. That said, I still love writing bare-metal 6502 for my Atari 2600's handicapped 6507 (missing an address line, among a few other oddities). Sometimes limits are a good thing. I certainly would never want to remember all the more complex CISC instruction sets, for example.
daltont 12 hours ago [-]
"The 6809 saw some success, especially in arcade machines, but it did not steamroll the world the way the 6502 and Z80 did."
Could have mentioned the use of the 6809 in the Radio Shack TRS-80 Color Computer and the Dragon in the UK. Using the TRS-80 tag on something not using a Z-80 never made sense.
MarkusQ 15 hours ago [-]
TIL that my mind has dynamic memory and enjoys an occasional refresh cycle.
In 2025 I started programming 6502 assembly just for fun, as an intellectual exercise (I did a TINY bit of x86 asm in the past), and MY GOD: this is so easy and so valuable to learn!
Programming the 6502 seems simpler than learning, let's say, a JS framework, or just about anything modern.
It's super fun, super easy and very rewarding.
I ended up designing my own ultra-RISC, stack-based machine with a uniform 32-bit fixed length (all instructions and data have exactly the same size), MMIO and other cool features. A 6502 on steroids.
I felt competent for the first time in a long time, as a jobless programmer, doing that :)