The Commodore 64 and 1541 Floppy Drive: The Micromanaging Bosses

April 21, 2021
Bamboo waving a 5.25 inch floppy disk in the air.

Ever have a micromanaging boss? Someone who just didn’t think you could do the job right and insisted on doing it all for you? Like, even if you were screwing up in the past, but you worked real hard and got your act together, they would still watch you like a hawk ‘cause they couldn’t trust you? Well, that’s kinda sorta what happened to the data connection between the Commodore 64 computer and the Commodore 1541 floppy drive.

Commodore built a couple of different floppy drives for their early computers, and it’s in the 5.25" 1540 floppy disk drive, made for the VIC-20, the C64’s predecessor, where all the trouble started. The 1540 has two processors – a 6502 that handles talking to the computer and understanding the files on the disk, and a 6504 that handles spinning the disk and moving the read/write head. They send data to and from the computer using a pair of MOS 6522 Versatile Interface Adapter (VIA) chips, one in the computer and one in the drive. These chips work together to transfer data to and from the CPUs in each device.

One of the big reasons for using a 6522 VIA in the VIC-20 was a skill it has called the shift register. A register is a small piece of storage that a chip can operate on directly, and it’s the fastest thing it can manipulate. When you read a byte of data (the word size of the 6502) off the floppy drive, the CPU can put it into the shift register of the drive’s 6522 and then shift it out a bit at a time to another destination, like the 6522 VIA in your VIC-20. All the VIC-20 has to do is signal to the drive that it wants the next byte once it has received 8 bits. This is very fast since this action is hard-wired into the 6522s, and you can hook these registers up to each other. The CPUs have timers, like drums that they’re beating, so that they and their respective 6522s stay in sync with each other, putting bytes on and taking bytes off their shift registers for future computing use at the right times. It’s a clever setup that just works.

Two cat managers drumming along keeping time while two muskrat workers are moving bits along on a conveyor belt.

Well…it would “just work”, if the MOS 6522 VIA didn’t have a bug in it. Occasionally, that signaling between the VIC-20’s and 1540’s shift registers would arrive a little too close to the drumbeat timer the respective CPU had with its 6522. If that happened, the 6522 would get confused and ignore one of the bits it was shifting.

A worker turns to the manager, who is drumming, and drops the bit.

This is, of course, not good. The VIC-20 doesn’t know they’re missing a bit, the 1540 doesn’t know that they dropped a bit, and crashes would occur.

Now, other computers of the era use 6522s for communications with peripherals. The first Macintosh has a 6522 to accept keyboard input, but since the Mac initiates communication with the keyboard, rather than letting the keyboard stream everything over unchecked, it knows if it receives bad data and can re-request it. Not so with Commodore floppy drives.

Since this hardware-based fast transfer of data couldn’t be trusted, Commodore engineers had to write software that would require the CPUs to take over the jobs of the shift registers, extracting data one bit at a time from the drive and verifying each one. Writing software the CPU executes to handle these sorts of transfers manually instead of letting the hardware handle it is called bit-banging, which–

Bamboo telling us to keep it safe for work

–get your mind out of the gutter, Bamboo. Anyway, the VIC-20 CPU now has to take time out of its busy schedule to coordinate with the floppy drive CPU to gather every bit, one at a time, and assemble them into a byte themselves, acting as a very micromanage-y manager over each 6522:

The cat managers are telling the muskrats to stay focused as the muskrats manually push one bit at a time across a decommissioned conveyor belt.
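Here’s a minimal sketch of what that micromanaging looks like in code. It’s Python rather than 6502 assembly, and it leaves out the clock and data line handshaking entirely. The function names and the lowest-bit-first order are assumptions made for the example, not a copy of Commodore’s routines.

```python
def send_byte_bit_banged(byte):
    """Sender side: hand over the byte one bit at a time, lowest bit first (assumed order)."""
    for position in range(8):
        yield (byte >> position) & 1


def receive_byte_bit_banged(bits):
    """Receiver side: the CPU itself glues eight single bits back into a byte."""
    byte = 0
    for position, bit in enumerate(bits):
        byte |= bit << position
    return byte


# Every byte costs eight trips through these loops on both CPUs,
# work the shift registers were supposed to do in hardware.
received = receive_byte_bit_banged(send_byte_bit_banged(0xA5))
print(hex(received))
```

The point of the sketch is the loop count: both CPUs are busy for every single bit, instead of being interrupted once per byte.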

When the Commodore 64 and 1541 drives came out, the buggy 6522 was replaced with the 6526 CIA, which did not have the shift register issue. Theoretically, these two chips could have been hooked together as intended in the VIC-20 days. But, to ensure that the new 1541s could be used on existing VIC-20s, the C64 and 1541 drive kept using the inefficient CPU-driven bit-banging to send data back and forth. It was made even worse when the video hardware on the C64, the VIC-II chip (the VIC-20 was named after the first generation of the chip), needed to completely halt the CPU many times a second to handle screen drawing. For this reason, the drive was purposely slowed down even more so it would work in the C64 reliably. So a bad situation got even worse!

Two muskrats complaining about being overqualified for the job now and about Vic.

The C64 and 1541 combo was slow. Like, really really slow. The inefficient management of the bit-banging protocol and fighting VIC-II for CPU time made floppy disk usage awful, especially compared to Commodore’s competitors. But, just like at work, a change of management can make a world of difference for the workers. And that’s where fast loader software and cartridges come in.

Except for things written in BASIC, all the code in the C64 is 6502 machine language code and lives in unprotected memory. It’s easy to read and modify any code at any location in the computer, including the code that controls floppy disk communication. Since the 1541 also has a 6502 CPU and RAM, all the code there can be read and modified, too. The simplest way to make the connection between computer and floppy drive faster is to replace the code that runs the bit-banging protocol between the devices. There were a bunch of different cartridges sold, usually by game manufacturers like Epyx and Cinemaware, that would modify the code in the C64 and 1541 to replace the inefficient bit-banging managers with much more efficient, but still bit-banging, ones. It wasn’t as fast as hooking up the shift registers, but it was, even with the most simplistic, naive code, at least 5 times faster than the original Commodore code.

Two coyotes are cheering on two muskrats who are passing entire bytes at a time along the conveyor belt. Everyone is in winter sports gear.

Eventually, the shift registers were hooked up between the Commodore 128 and the 1570/1571 drives (which also had 6526s), and fast loaders weren’t as necessary for performant disk access. The CPU could spend its time doing other things, and no more micromanaging managers were needed to ensure the safety of the data coming off the disk.

Changelog

  • 2021-04-21: Initial post

The NES: The Game State That Lasted Seven Days

March 28, 2021

Bamboo renting Final Fantasy, declaring he will beat it in seven days.

Before the first NES game with battery backup, The Legend of Zelda, came out, you had two options:

  • Either you beat the whole game in one sitting
  • Or you got these weird passcodes you had to type in to restore where you were.

Old-school Metroid players knew of JUSTIN BAILEY, the amazingly random password that gave you all sorts of crazy gear. Mega Man had you put dots in spots to get your gear back. These both worked by recreating the state of the game.

When a game (or any computer program) runs, it interacts with the outside world. You move a character, an AI shoots at you, you drink a healing potion. These decisions change the program state. Every program that does anything useful has state. This state is stored in the NES’s powered-on RAM while you play, and is continuously updated. On the NES, this data would be in the 2K of RAM available to games. But when you power down the NES, the state goes away with it. The trick is finding a way to preserve the game state so you can keep playing from where you left off.

With a game like Mega Man 3, the state is very simple, and it’s easy to store and recreate from a few dots on the screen. All you need to know is:

  • What bosses were defeated
  • How many energy tanks you had

A password for Mega Man 3 for beating Magnet Man with 0 E-Tanks

Need a password for having defeated Magnet Man with 0 Energy Tanks? Well here it is. I got it especially for you. Top Man is next if you go by that one copy of Nintendo Power I still own.
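To see why so little state fits in a handful of dots, here’s a rough sketch that packs "which bosses are beaten" and "how many energy tanks" into one small number. This is NOT the real Mega Man 3 password scheme. The bit layout, the boss ordering, and the function names are all made up for illustration.

```python
# Hypothetical layout: one bit per Robot Master, then the E-Tank count above that.
BOSSES = ["Magnet", "Top", "Needle", "Gemini", "Hard", "Snake", "Spark", "Shadow"]


def encode_state(defeated, energy_tanks):
    """Pack the set of defeated bosses and the tank count into a single integer."""
    value = 0
    for index, boss in enumerate(BOSSES):
        if boss in defeated:
            value |= 1 << index
    return value | (energy_tanks << len(BOSSES))


def decode_state(value):
    """Recreate the game state from the packed integer."""
    defeated = {boss for index, boss in enumerate(BOSSES) if value & (1 << index)}
    energy_tanks = value >> len(BOSSES)
    return defeated, energy_tanks


password_value = encode_state({"Magnet"}, 0)
print(decode_state(password_value))
```

Eight boss bits plus a few bits for the tank count is only about a dozen bits in total, which is why a small grid of dots is plenty.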

With a game like Final Fantasy, though, there’s no way all of the game state can be stored in a simple password:

  • Every warrior has a name and individual stats, gear, and spells
  • You have different items and counts of items
  • You’ve unlocked different parts of the game

Bamboo, dressed as a Fighter, not looking forward to entering in a grid password to restore his saved game.

This 56x56 grid of 2 bit (3 different dots + empty) squares will give you 784 bytes of data, just over the 768 bytes needed to represent the important parts of the Final Fantasy God Mode save file.
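If you want to check the size of a grid password yourself, the arithmetic is just squares times bits per square, divided by 8. A small sketch, with helper names invented for the example:

```python
import math


def grid_bytes(side, bits_per_cell=2):
    """Whole bytes of data a side x side grid can hold."""
    return side * side * bits_per_cell // 8


def smallest_side(target_bytes, bits_per_cell=2):
    """Smallest square grid that holds at least target_bytes."""
    cells_needed = math.ceil(target_bytes * 8 / bits_per_cell)
    return math.isqrt(cells_needed - 1) + 1  # ceiling of the square root


side = smallest_side(768)
print(side, grid_bytes(side))  # prints "56 784"
```

Either way, nobody wants to copy a few thousand dots off a TV screen with a pencil.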

The memory magician, as a red mage, casting a spell on a RAM chip

Final Fantasy used a common memory magician – I mean, memory management chip – named the MMC1. The MMC1 allowed for storing game state on a RAM chip on the cartridge board that was powered by a button battery.

The code of the game could talk to the memory magician on the board and let it know it wanted to read and write to the cartridge’s RAM, treating it as part of one of the PRG banks loaded and saved in the upper 32K of the NES’s RAM. Normal RAM chips typically need constant electricity to remember what was stored there (unlike EEPROMs – Flash memory is a type of EEPROM used to make SSDs). When the game was in the NES and powered on, these on-board RAM chips were powered by the console. When the power cut out because you held in the RESET button while powering off the NES (or you had to shut it off quick ‘cause it was time for dinner and Mom had already called you, like, ten times), the battery on the cartridge board took over and kept the onboard RAM powered up just enough to preserve the game state.

There were problems with this, of course. Batteries could die, and couldn’t be user-replaced easily. The hardware on the cartridge might not handle the transition from NES-power to battery-power well, corrupting the fragile state. You might forget to hold down RESET while powering off the system (or be forced to not do so because of dinner) and the CPU continued to execute, corrupting your game. But for most games, and most of the time, it worked. The battery backed-up RAM allowed players to play much larger games, paving the way for truly large RPGs like Final Fantasy and Dragon Warrior, adventure games and ARPGs like The Legend of Zelda and Crystalis, and in-depth simulation games like Bandit Kings of Ancient China.

Bamboo returning Final Fantasy, not having beaten it in the 7 day rental period.

And yes, I did try to beat Final Fantasy this way twice, in two separate 7 day rental periods, and could never pull it off fast enough.

Changelog

  • 2021-03-28: Initial post

The Amiga: Meet Agnus, the Fastest, Mathiest Painter

March 16, 2021
Agnus, in a smock, painting a combined picture, wondering how many times the painting will need to be done every second.

Painting a reproduction from life is a challenge. At least, it is for me. It gets even harder when you’re trying to paint from life and combine additional elements into the piece. It gets even harder when you need to do this at least 30 times a second. It gets even harder when you’re a silicon chip and you can’t even hold a paintbrush!

One of the great things about the custom chipset in the original Amigas (the Original Chip Set, or OCS, and the Enhanced Chip Set, or ECS) is the set of tools it has to do 2D graphics very, very well. Agnus, the controller of Chip RAM, comes equipped with a tool called a blitter, whose purpose is to copy big blocks of memory around in Chip RAM very quickly.

Denise is the chip responsible for outputting video to the monitor, and uses two sources of data to make the final video. One of those sources is sprites. Denise can render up to 8 sprites on a single line of the screen. The first sprite you ever interact with on the Amiga is the mouse pointer. Sprites can have 3 or 15 colors, they are 16 pixels wide, and can be as tall as you’d like. Denise can draw them and move them around very quickly, and you can do a lot with just a handful of 16 pixel wide sprites. The NES certainly was able to.

Denise, surrounded by sprites, going 'whee!'. Agnus, with canvases, going 'whee indeed'.

The other source is bitplanes made up of Chip RAM. The Amiga came out at a time when the number of colors on the typical computer screen could be counted on no more than a few hands. Denise came out in 1985, and it wasn’t until around 1987 that IBM PCs got 256 color graphics. The Amiga strived for flexibility and efficiency with 2D graphics, so Denise combines together multiple 1-bit planes (rectangles) of pixels into the full-color Amiga screens. If you want a 16 color screen, Denise requires 4 1-bit planes of pixels, which are combined together with Boolean math (2^4 = 16). If you only need two colors, Denise uses a single 1-bit plane (2^1 = 2).

Bamboo saying it's getting mathy
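Here’s the mathy part as a tiny sketch. Each bitplane contributes one bit to every pixel, and stacking those bits gives the color number for that pixel. The function name and the choice to treat plane 0 as the lowest bit are assumptions for the example. The real hardware does this on the fly as it draws the screen.

```python
def combine_bitplanes(planes):
    """Turn a stack of 1-bit rows into one row of color numbers."""
    width = len(planes[0])
    colors = []
    for x in range(width):
        color = 0
        for plane_number, plane in enumerate(planes):
            color |= plane[x] << plane_number  # plane 0 supplies the lowest bit
        colors.append(color)
    return colors


# Four planes, four pixels wide: enough for 2^4 = 16 colors.
row = combine_bitplanes([
    [1, 0, 1, 0],  # plane 0
    [0, 1, 1, 0],  # plane 1
    [0, 0, 0, 1],  # plane 2
    [1, 0, 0, 1],  # plane 3
])
print(row)  # prints [9, 2, 3, 12]
```

Add a fifth plane and the same loop gives you color numbers up to 31, at the cost of one more rectangle of Chip RAM.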

The blitter copies rectangles of data from up to three different sources, and writes the result to the destination. Those four locations are all in Chip RAM, which Agnus and Denise have direct access to. You give Agnus the size of the rectangle, and, for each source/destination, the upper-left corner of the rectangle.

Once you’ve got these pieces of data ready to copy, you combine them together. We want Agnus to paint a spaceship onto the background of our game, but we want to knock out the pixels around the ship so the background shows through. To do this, we will need three things:

  • A mask of the spaceship
  • The spaceship graphic
  • The background

Then, we need to instruct Agnus how to put all this together, and we’re gonna do it with BOOLEAN ALGEBRA.

The three different sources – the upper-left corners and the total rectangle size – are labeled A, B, and C.

We have an outline of the ship, drawn on a 1-bit field, where color 1, what we want to keep, is the area inside the spaceship, and color 0, what we want to discard, is the area outside the spaceship. We’re putting this in source A.

The spaceship is a graphic image. We want to draw it as-is. We’re putting this in source B.

Finally, we have our background. It’s also a graphic image that we want to draw as-is. We’re putting this in source C.

The three sources for the blitter operation

In order to draw our spaceship onto the background without destroying the background around the spaceship, we need to:

  • copy the parts of the spaceship (B) that are also color 1 in the mask (A)
  • OR
  • copy the parts of the background (C) that are NOT color 1 in the mask (A')

We can represent this logic in a Venn diagram, and turn that Venn diagram into a thing called a minterm, D = AB+A'C, which is what all the Amiga documentation (and all of Boolean algebra) calls the shorthand form of this.

A Venn diagram showing the cookie-cutter operation

Running this logic on our images, we get:

The combined sources to make the final image
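The cookie-cutter logic D = AB+A'C maps directly onto bitwise operators, so here’s a sketch of it working on one 16 pixel wide row of a single bitplane. The function names are invented for the example. The second function builds the truth table as a byte, one bit per combination of A, B, and C, using the bit ordering I believe the blitter’s logic function byte uses, so treat that ordering as an assumption.

```python
def cookie_cut(a, b, c, width=16):
    """D = AB + A'C on one row: ship (b) where the mask (a) is 1, background (c) elsewhere."""
    all_ones = (1 << width) - 1
    return ((a & b) | (~a & c)) & all_ones


def minterm_byte():
    """Truth table of D = AB + A'C packed into 8 bits, bit number = A*4 + B*2 + C (assumed order)."""
    value = 0
    for a in (0, 1):
        for b in (0, 1):
            for c in (0, 1):
                if (a & b) | ((1 - a) & c):
                    value |= 1 << (a * 4 + b * 2 + c)
    return value


mask = 0b0011110000111100        # A: 1 inside the ship
ship = 0b1010101010101010        # B: the ship graphic
background = 0b1111000011110000  # C: the background
print(format(cookie_cut(mask, ship, background), "016b"))  # prints 1110100011101000
print(hex(minterm_byte()))                                  # prints 0xca
```

Where the mask is all 1s you get the ship back untouched, and where it’s all 0s you get the background, which is exactly the knockout effect we were after.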

So now we’ve done our Boolean algebra and know how to combine these together. There’s one last part, and it has to do with how Agnus’s buddy Denise renders screens.

Since Denise wants bitplanes, Agnus’s blitter also needs to work with bitplanes. When you’re assembling your 16 color spaceship, you need to do four copies, one for each bitplane you want to copy from-and-to. You can reuse the same mask for each of these operations, of course. This means that, once you start making 32 or even 64 color Extra Half-Bright games on the Amiga, copying lots of these big Blitter Objects, or BOBs, can really slow things down.

Agnus handing Denise a folder of 6 painted bitplanes, with a messed up painting arm.

Agnus’s speed and coordination with Denise on these high-speed painting projects means the Amiga can make some truly amazing games and applications that don’t have to resort to as many of the tricks that systems like the NES have to use to paint large moving objects.

If you want to see a reproduction of the blitter in action, I’ve built a TIC-80 demo that lets you try combining together a few pictures with a few of the more common blitter modes so you can see the results. Just like on the Amiga, it works purely in bitplanes under the hood. Some of the images are designed to be 1-bit masks, and the others are 4-bit, 16 color images (TIC-80’s maximum number of colors). Try it out! You can see the code by hitting Esc, then F1. It’s all in Lua, so if you know Ruby or Python, it should be pretty readable.

(Note that this might not work well if you’re on a phone.)

Changelog

  • 2021-03-16: Initial post

Self-Portrait March 2021

March 11, 2021

I painted a new Bamboo self-portrait for March 2021!

This one was a 1 hour speedpaint once I had the drawing in place. For the next one I want to focus on improving the contrast between foreground and background. I also need to avoid messing up the lighting and having to redo it, wasting a bunch of minutes in the process. Progress! New one in April.

The NES: Cartridge Constraints

March 9, 2021

Constraints can be annoying. Being limited in what you are able to do is frustrating. But the great thing about constraints is that they force you to think of new, creative solutions to work around them, ones you wouldn’t ordinarily try out.

A common thing I’ll do in creative projects is use random generators to determine names, species, clothing, scenarios, whatever I need to make the project work. Once I’m given these immutable choices, it’s up to me to craft a story, design, or piece around them. It can be freeing to know what the limits are, so you can focus on the parts you can change.

An African Wild Dog, in overalls, a cummerbund, and a kilt, lamenting about their issues
The result of a few random thing/idea generators. Yeah, my stories can get weird sometimes.

The venerable Nintendo Entertainment System had such constraints. It had essentially the same processor, a 6502, as a Commodore 64/128, so it could only address 64K of RAM. The NES used 32K of that RAM for storing program variables and for things like controller input data, and the remaining 32K was used to load in code from the cartridge.

The NES also had a video card, a Picture Processing Unit (PPU), that could address 64K of RAM, but only had 16K for cost reasons. 8K of that was for graphics data – “patterns” that could be used to make UIs, maps, sprites, whatever – and 4K for “name tables”, used to store the map data. When you hear maps, think the backgrounds and (most) environment objects that Mario is jumping onto or in front of. Sprites are Mario himself and all the enemies, powerups, and so on. They’re special because they need to move anywhere on the screen, and quickly.

When the NES (or rather, the Family Computer or Famicom) was first designed, the cartridges it used were divided into two pieces. One part, PRG-ROM, stored the program code, and one part, CHR-ROM, stored the graphics data. PRG-ROM came in 16K increments, and CHR-ROM in 8K increments. Typically, the first and last 16K PRG-ROM chunks are made available to the CPU when you start up the NES, and the first 8K CHR-ROM chunk, containing patterns for maps and sprites, is made available to the PPU.

Very simple games, like Excitebike, came with only 16K of code and 8K of graphics. This code and graphics data fit perfectly into the NES’s limited code-and-graphics RAM.

But what about the first NES game I ever rented, Mega Man 3? Well, that came with 256KB of code and 128KB of graphics data. Dr. Wily’s creations were Big Bosses, after all. How did that work when the NES could only manage 40K?

Once game manufacturers realized that 40K was not going to cut it, they did what Commodore did when you have processors that can only address a set amount of RAM and you need to expand:

The memory magician, looking at a Help Wanted ad from Nintendo.

They hired memory magicians!

These magicians, known as Memory Management Controllers or mappers in the Nintendo and emulation world, use magic to trick CPUs into thinking there’s only a certain amount of ROM – the 32K of code and 8K of graphics – when in reality they’re hiding the rest of it. It’s how the Commodore 128 was able to double the RAM of its predecessor (64K -> 128K), despite the CPU only being able to see 64K of data at a time.

Excitebike had no magician. The NES hardware had full access to the contents of the cartridge with nothing standing in its way.

Mega Man 3 had a powerful magician known as MMC3 that could swap in lots and lots of CHR-ROM and PRG-ROM.
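If you’re curious what the magic trick looks like, here’s a stripped-down sketch of PRG-ROM bank switching. It’s loosely modeled on the simple "one swappable 16K window plus one fixed 16K window" style of mapper, not on MMC3, which is a lot fancier. The class and method names are made up for the example, and real mappers are wired up in hardware rather than written in Python.

```python
class BankSwitchedRom:
    """A big ROM hidden behind a 32K window at CPU addresses 0x8000-0xFFFF."""

    BANK_SIZE = 16 * 1024

    def __init__(self, rom):
        self.rom = rom
        self.bank_count = len(rom) // self.BANK_SIZE
        self.selected = 0  # which bank shows up in the lower 16K window

    def select_bank(self, bank):
        """What the game does when it asks the magician for a different bank."""
        self.selected = bank % self.bank_count

    def read(self, address):
        """What the CPU sees: it has no idea the rest of the ROM exists."""
        offset = address - 0x8000
        if offset < self.BANK_SIZE:
            bank = self.selected          # swappable window
        else:
            bank = self.bank_count - 1    # fixed window, always the last bank
            offset -= self.BANK_SIZE
        return self.rom[bank * self.BANK_SIZE + offset]


# A fake 128K ROM where every byte holds the number of the bank it lives in.
rom = bytes(bank for bank in range(8) for _ in range(BankSwitchedRom.BANK_SIZE))
cart = BankSwitchedRom(rom)
print(cart.read(0x8000), cart.read(0xC000))  # prints "0 7"
cart.select_bank(3)
print(cart.read(0x8000), cart.read(0xC000))  # prints "3 7"
```

The CPU keeps reading the same 32K of addresses the whole time. Only the magician knows which 16K chunk is actually standing behind the curtain.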

Many games hired magicians that all operated in very different ways, depending on the game involved. There were dozens of different ones, each solving particular problems and offering certain capabilities. There were 4 big ones, along with “no mapper”, or NROM:

Excitebike

Memory magician with balloon motocross bike

The first time I played this was on a PlayChoice-10 cabinet at a roller skating rink.

  • NROM (no mapper)
  • 16K PRG-ROM
  • 8K CHR-ROM

Arkista's Ring

Memory magician with balloon bow

I got this super cheap at a flea market.

  • CNROM (CHR-ROM mapping only)
  • 32K PRG-ROM
  • 32K CHR-ROM

Contra

Memory magician with balloon machine gun

I recently rewatched Predator.

  • UNROM (PRG-ROM mapping only)
  • 128K PRG-ROM
  • No CHR-ROM! All the graphics are stored in code and dumped into a CHR-RAM chip as needed. The NES treats the contents of this RAM chip exactly as it would treat a ROM chip.

Dragon Warrior

Memory magician with balloon sword and shield

The first RPG I ever played to completion.

  • MMC1
  • 64K PRG-ROM
  • 16K CHR-ROM
  • Battery backup, the "hold down RESET and power off" variety

Mega Man 3

Memory magician with balloon helmet

The first game I ever rented, and the first Mega Man game I ever played.

  • MMC3
  • 256K PRG-ROM
  • 128K CHR-ROM

This use of memory management controllers, a clever approach to work around some very limiting constraints, allowed the NES to grow and expand far beyond its capabilities, and allowed for making some truly amazing games for a console with such humble specifications.

Changelog

  • 2021-03-09: Initial post

The IBM PC: More Memory, Courtesy Your Keyboard

February 27, 2021

One of the best things about networked computers is being able to download more RAM when you need it. It’s pretty handy, I’ve been doing that since the BBS and modem days. I still download RAM on my smartphone, and it–

I tell Bamboo that RAM doublers don't actually work and he smashes his phone.

Well…what if I told you that, on older IBM PCs, there was a key you could hit on your keyboard that would give you…

Bamboo says 'All the RAM you could possibly ever need', I say no, he drops an IBM M Series keyboard.

Ok, fine…sigh…it’s not actually a key on your keyboard, but it is in the thing your keyboard plugs into to talk to the rest of the computer. And…yes, it can give you more RAM. Kinda sorta.

The IBM PC was designed from the start to have max 1MB of RAM, because that’s the most the 8086 could handle and RAM was expensive back then. Due to how the chip was designed, accessing all that 1MB was…a little odd.

Processors use memory addresses, a number starting from 0, to figure out the place in RAM where data can be written to and read from. The 8086 was a 16 bit processor, as the size of pieces of data it was designed to work with was 16 bits - 2 bytes, or the numbers 0-65535 - wide. Programmers call the size of data a processor is designed to work with a word. This means that, if an 8086 only used one word as a memory address, you’d have at most 64 kilobytes to use. Even the Commodore 128 could do better than that.

Processors will have a different sized value, separate from the word size, as the address width, which they use to find stuff in RAM. Usually the address width is double the word size, or some multiple of 8 bits so you can stick with bytes for everything. Keeping it in bytes makes it easier for processors, and easier on programmers when writing software. The 8086 has an address width of 20 bits, which is 2 and a half bytes, or one and one quarter words.

Of course it has to be different.

So instead of taking two words and sticking them together, like on a Commodore 64 whose word size is 8 bits:

C64 memory addresses

Or, doing what Apple did on the 68000 and taking two 16 bit words, sticking them together, and using one of the bytes in there as data for the garbage collector:

68000 Mac memory addresses, made from two words, one byte in the word being for flags

The 8086 stuck two 16 bit words, a segment and an offset, together with MATH. Hexadecimal math.

Segment:offset 1234:ABCD wraps to 1CF0D
Segment:offset FFFF:FFFF wraps to 10FFEF

It’s a neat approach. But it has a problem. The 8086 would lop off everything that went over those last five characters, like how letterboxing a show shot at 16:9 and broadcast on 4:3 TVs would cut out the stuff no one expected to see:

A sitcom shot with two characters in a scene, letterboxed.

Take away that assumption by releasing the show on DVD in 16:9 format, and you start running into some unexpected results:

The same scene, but with the letterboxing removed, and you can see additional things on the sides.

The addresses FFFF:FFFF and 0FFE:000F go to the same memory location, 0FFEF, on an 8086, but not on a 286, where they would go to 10FFEF and 0FFEF, respectively. A program could get quite confused if it writes to 10FFEF and then hopes to find the data again at 0FFEF.
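Here’s that hexadecimal math as a short sketch. The segment gets shifted left by one hex digit (multiplied by 16) and the offset is added on top. The `a20_enabled` flag is a name invented for the example: when it’s off, anything past 20 bits gets lopped off like on an 8086, and when it’s on, the 21st bit survives like on a 286.

```python
def physical_address(segment, offset, a20_enabled):
    """Combine a 16 bit segment and a 16 bit offset into a physical address."""
    address = (segment << 4) + offset
    if not a20_enabled:
        address &= 0xFFFFF  # keep only the low 20 bits, so high addresses wrap around
    return address


# The same two segment:offset pairs, with and without the wraparound.
print(hex(physical_address(0xFFFF, 0xFFFF, a20_enabled=False)))  # prints 0xffef
print(hex(physical_address(0x0FFE, 0x000F, a20_enabled=False)))  # prints 0xffef
print(hex(physical_address(0xFFFF, 0xFFFF, a20_enabled=True)))   # prints 0x10ffef
print(hex(physical_address(0x0FFE, 0x000F, a20_enabled=True)))   # prints 0xffef
```

With the flag off, both pairs land on the same byte. With it on, FFFF:FFFF ends up almost 64K past the 1MB mark.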

So what’s the solution? You hard wire the part of the 286 that allows addresses to go over 20 bits (address line 20, or the A20 line) to a switch on the motherboard. If the switch is off, addresses wrap around, and if it’s on, they don’t. And the best place for that switch? Why, an unused part of the keyboard controller, of course!

Now, you can’t expect humans to know to push a switch on their keyboard to activate this. The switch stays off, unless a program like Rebel Assault or the operating system specifically requests it. But when that switch is turned on?

It’s 10000% totally exactly like downloading more RAM.

Changelog

  • 2021-02-27: Initial post