Saturday, May 28, 2016

Dopefish goes NTSC: Commander Keen 4 Composite CGA Patch Notes

If you're just landing here at random and wondering about the title: this is a 16-color 'remaster' of the original CGA version of Commander Keen IV: Secret of the Oracle, with code patched and graphics redrawn and reworked to take optimal advantage of CGA's composite output capabilities.

For more info (plus the download link), see the VOGONS thread - all sorts of cool stuff in there, like videos recorded from real hardware, and a DOSBox build patched with some useful additions for running this.  However, I've had a request or two for the technical nitty-gritty, so here's where I'm gonna dump it (careful what you wish for? ;-))

This patch started out as direct modification of the .EXE, following some disassembly and analysis.  I "ported" it to the CKPATCH format (for in-memory patching) only when I was done, by generating a binary diff.  This is why my notes will follow the disassembly, which is easier to comment on; all offsets are relative to the beginning of the load image (=file offset minus 2C00h) in v1.4-Apogee of the CGA executable.
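To make the offset convention concrete, here is a minimal Python sketch of the conversion. The function names are made up for illustration; the 2C00h figure is the one quoted above for v1.4-Apogee of the CGA executable and would presumably differ for other versions.

```python
# Offsets in these notes are relative to the load image; in the .EXE file
# the load image starts 2C00h bytes in.
HEADER_SIZE = 0x2C00  # as quoted above for the v1.4-Apogee CGA executable


def image_to_file(offset):
    """Load-image offset (as used in these notes) -> offset in the .EXE file."""
    return offset + HEADER_SIZE


def file_to_image(offset):
    """Offset in the .EXE file -> load-image offset."""
    if offset < HEADER_SIZE:
        raise ValueError("offset lies before the start of the load image")
    return offset - HEADER_SIZE


# Example with one of the load-image offsets used later in these notes
print(hex(image_to_file(0x31548)))  # 0x34148
```

A hex editor or a tool like xxd works with file offsets, so this is the direction you would normally need.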

Patch space

For new data, I found some 119 bytes that could be reused at offset 31548h.  In KEEN4E.EXE this has the color tables for the EGA fade routines - they're still present in the CGA executable for no discernible reason.  I ended up using only 36 bytes here, for extending the color tables used by various CGA drawing routines.

Finding space for new code was a bit trickier.  I ended up settling on 11FDDh, which contained a seemingly-unused function (a pointer to it is set during initialization, but then quickly set to something else before it ever gets called... at least as far as I could determine).  No idea what this code was supposed to do -- might be a leftover from the Keen Dreams code, since it looks similar to some routine used to cache KDR level data; if you know what's going on there, I'm all ears.  Whatever it is, it gives us a very generous 367 bytes to stomp over, plenty more than the 110 I eventually needed.

Mode setting

First, let's change the 80x25 text screens from mode 3 to mode 2 - this disables NTSC color burst on the CGA, and the B&W picture results in more readable 80-column text on composite (of course, RGB monitors are not affected).  There are two occurrences of that:

1789A  B002            mov   al, 2          ; was: 3        
1789C  B400            mov   ah, 0
1789E  CD10            int   10h            ; SET VIDEO MODE
1789E                                       ; AL = mode     

1AED2  B80200          mov   ax, 2          ; was: 3        
1AED5  CD10            int   10h            ; SET VIDEO MODE
1AED5                                       ; AL = mode

Then there's graphics mode, which is set at 1AED9h.  Keen uses mode 4 (320x200 @ 2bpp) with that infamous eyesore palette of cyan/magenta/white.  We could still get 16 colors out of that on composite -- in fact that's what happens with the original game, although this is obviously not by design.  I chose mode 6 however (640x200 @ 1bpp), mostly for cosmetic reasons: its palette of artifact colors is much more useful for Keen, plus it's more consistent between the 'old-style' and 'new-style' IBM CGA variants.

Composite CGA artifact color palette (mode 6, foreground F)

This requires an additional step, since in mode 6 the NTSC color burst has to be enabled manually with a register write, whereas mode 4 has it on by default.  Otherwise we'd just get a B&W picture here too.  This is where the new code space comes in handy:

1AED9  9A8D13C510      call  newColorMode6  ; 10C5:138D - needs relocation!
1AED9                                       ; was: mov ax,4 ; int 10

; newColorMode6:
11FDD  B80600          mov   ax, 6
11FE0  CD10            int   10h            ; SET VIDEO MODE
11FE0                                       ; AL = mode
11FE2  BAD803          mov   dx, 3D8h       ; CGA mode control register
11FE5  B01A            mov   al, 1Ah        ; burst on
11FE7  EE              out   dx, al
11FE8  CB              retf
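As a rough illustration of what the value 1Ah means, the sketch below decodes a mode control register value into its individual bits. The bit names follow the commonly documented layout of port 3D8h; the dictionary and function are hypothetical helpers, not anything from the game or the patch.

```python
# Bit meanings of the CGA mode control register (port 3D8h),
# per the commonly documented layout.
MODE_CONTROL_BITS = {
    0x01: "80-column text",
    0x02: "graphics mode",
    0x04: "B&W (color burst off)",
    0x08: "video enable",
    0x10: "640x200 high-res graphics",
    0x20: "blink attribute (text modes)",
}


def decode_mode_control(value):
    """Return the names of the bits set in a 3D8h value, lowest bit first."""
    return [name for bit, name in sorted(MODE_CONTROL_BITS.items()) if value & bit]


print(decode_mode_control(0x1A))
# ['graphics mode', 'video enable', '640x200 high-res graphics']
```

The point is simply that 1Ah leaves the B&W bit (04h) clear, which is what keeps the color burst on in 640x200 graphics.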

New fade routines for CGA

KEEN4E messes with EGA palette registers for its fade-in/out effects, but the CGA version just drops the fade schtick altogether.  I suppose they could've used the darker version of the cyan/magenta/white palette, but that gives you just one intermediate step between full brightness and a black screen.  What's nice about composite mode 6 is that it lets us go one better -- and without those bloated color tables that the EGA version needs, either.

See, CGA lets us modify the foreground color in mode 6 (the default is F = intense white), unlike mode 4 where the same bits control the background color.  In color composite mode this causes all 16 artifact colors to change accordingly (well, except black).  For an EGA-like 4-step effect we can simply go "F, 7, 8, 0" to fade out, and the reverse to fade in.

Ah, but how can we splice this code into the CGA version?  In yet another stroke of luck, the fade functions are still present in the CGA .EXE, and called from all the right places -- they're just do-nothing stubs.  Even more luckily, these stubs don't just return immediately: thanks to Borland C calling conventions, they monkey around with BP and SP first.  That brings them up to 5 bytes, just enough for a far jump to new code (not a far call, since that would push a return pointer onto the stack, and we have no room for an extra retf to deal with that).

The fade-out and fade-in stub routines are located at 1AF67h and 1AF6Ch respectively, so let's replace those with the jumps:

; old_VW_FadeOut:
1AF67  EAAF13C510      jmp   newFadeOut     ; 10C5:13AF - needs relocation!
; old_VW_FadeIn:
1AF6C  EA9913C510      jmp   newFadeIn      ; 10C5:1399 - needs relocation!
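The segment:offset pairs in the comments can be checked against the load-image offsets with a little arithmetic. This is a sketch with made-up helper names; it assumes the segment words in the file are unrelocated, i.e. relative to the start of the load image, which is what "needs relocation" implies.

```python
def linear(segment, offset):
    """Real-mode segment:offset -> linear offset (relative to the load
    image here, since the segment values are stored unrelocated)."""
    return (segment << 4) + offset


def far_bytes(opcode, segment, offset):
    """Encode a far jmp (EAh) or far call (9Ah): opcode, then the offset
    word and the segment word, both little-endian."""
    return bytes([opcode]) + offset.to_bytes(2, "little") + segment.to_bytes(2, "little")


print(hex(linear(0x10C5, 0x13AF)))            # 0x11fff
print(far_bytes(0xEA, 0x10C5, 0x13AF).hex())  # eaaf13c510
```

So 10C5:13AF and 10C5:1399 land at 11FFFh and 11FE9h, both inside the code patch space that starts at 11FDDh.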

In turn, the new fade code writes the appropriate values to the color register, calls the game's existing wait-for-vblank routine (VW_WaitVBL) for timing, and sets some required variables before returning to the original caller (not the stub).  Waiting for 6 VBLs between each step gets us a nice EGA-like, CPU-independent effect:

; newFadeIn:
11FE9  B80800          mov   ax, 8          ; 8 = dark grey
11FEC  E82C00          call  outAndWait     ; near $+2Ch
11FEF  40              inc   ax             ; 7 = light grey (outAndWait returns AX=6)
11FF0  E82800          call  outAndWait     ; near $+28h
11FF3  B00F            mov   al, 0Fh        ; F = white
11FF5  E82300          call  outAndWait     ; near $+23h
11FF8  C70681C40000    mov   screenfaded, 0
11FFE  CB              retf
; newFadeOut:
11FFF  B80700          mov   ax, 7          ; 7 = light grey
12002  E81600          call  outAndWait     ; near $+16h
12005  B008            mov   al, 8          ; 8 = dark grey
12007  E81100          call  outAndWait     ; near $+11h
1200A  33C0            xor   ax, ax         ; 0 = black
1200C  E80C00          call  outAndWait     ; near $+0Ch
1200F  C70681C40100    mov   screenfaded, 1
12015  C606975E01      mov   fontcolor, 1
1201A  CB              retf
; outAndWait:
1201B  BAD903          mov   dx, 3D9h       ; CGA color control register
1201E  EE              out   dx, al
1201F  B006            mov   al, 6          ; wait 6 VBLs
12021  50              push  ax
12022  9A290BC51B      call  VW_WaitVBL     ; 1BC5:0B29 - needs relocation!
12027  58              pop   ax
12028  C3              retn
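One detail in the listing is easy to miss: the "inc ax" in the fade-in yields 7 rather than 9, because outAndWait hands AX back with AL = 6 (the VBL count it pushed and then popped). The toy model below traces only AX and the values written to port 3D9h; it is a sketch of the listing's register flow, not an emulator, and the step names are invented for the purpose.

```python
def simulate(start_ax, steps):
    """Tiny model of the fade listing. Each step is a tuple:
    ('mov_ax', n), ('mov_al', n), ('inc_ax',), ('xor_ax',) or ('call',).
    Returns the values written to port 3D9h, in order."""
    ax = start_ax
    written = []
    for step in steps:
        op = step[0]
        if op == "mov_ax":
            ax = step[1] & 0xFFFF
        elif op == "mov_al":
            ax = (ax & 0xFF00) | (step[1] & 0xFF)
        elif op == "inc_ax":
            ax = (ax + 1) & 0xFFFF
        elif op == "xor_ax":
            ax = 0
        elif op == "call":             # outAndWait
            written.append(ax & 0xFF)  # out dx, al
            ax = (ax & 0xFF00) | 6     # mov al, 6 / push ax / ... / pop ax
    return written


FADE_IN = [("mov_ax", 8), ("call",), ("inc_ax",), ("call",),
           ("mov_al", 0x0F), ("call",)]
FADE_OUT = [("mov_ax", 7), ("call",), ("mov_al", 8), ("call",),
            ("xor_ax",), ("call",)]

print([hex(v) for v in simulate(0, FADE_IN)])   # ['0x8', '0x7', '0xf']
print([hex(v) for v in simulate(0, FADE_OUT)])  # ['0x7', '0x8', '0x0']
```

Since the pop restores whatever was pushed, the result does not depend on what VW_WaitVBL does to AX.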

The important variable here is "screenfaded" (address 38CA1h), which is checked by the keystroke handling routine at 5CECh.  The game shouldn't process keypresses while the screen is faded, or you may find yourself playing a level or navigating a menu on a completely black screen -- not terribly entertaining, I assure you.  This proved to be a pain in the derriere until I had a look at what the EGA version was doing.

You'll notice that I also set "fontcolor" (326B7h) to 1 after fading out, which is a bit of a nasty kludge to fix some unseemly text discolorations.  There's probably a slicker solution to that, but we'll get to the text stuff in a bit.
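The variable addresses quoted here can be tied back to the operands in the listings. The sketch below infers the data segment base from the two (operand, address) pairs that appear in this post; the base value is derived from those numbers, not stated anywhere in the notes, so treat it as an assumption.

```python
# (operand as encoded in the instruction, load-image address quoted in the text)
KNOWN = {
    "screenfaded": (0xC481, 0x38CA1),  # C7 06 81 C4 ...
    "fontcolor":   (0x5E97, 0x326B7),  # C6 06 97 5E ...
}

# If both variables live in the same data segment, the difference is constant.
bases = {name: addr - operand for name, (operand, addr) in KNOWN.items()}
print({name: hex(base) for name, base in bases.items()})
# {'screenfaded': '0x2c820', 'fontcolor': '0x2c820'}

DS_BASE = 0x2C820  # inferred from the two pairs above


def ds_to_image(operand):
    """DS-relative operand -> load-image offset."""
    return DS_BASE + operand
```

With that base, any other DS-relative operand in the listings can be mapped to its load-image offset the same way.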

Expanded color lookup tables

Besides sprites, tiles and pictures, the engine also draws single color elements onto the virtual screen segment (lines, rectangles and text), and does it often.  In the CGA version, these functions all use 4-color lookup tables to do their business; to exploit our new-found 16-color composite palette, we're gonna need more than that.

These are the original tables - their roles are explained by the Keen Dreams sources (evidently that part of the code didn't change between KDR and CK4); the C functions use an array of bytes, and the ASM functions use an identical byte array and a word array:

; old_c_colorbyte
315CB  0055AAFF        db  0, 55h, 0AAh, 0FFh
; old_asm_colorbyte
32674  0055AAFF        db  0, 55h, 0AAh, 0FFh
; old_asm_colorword
32678  00005555[...]   dw  0, 5555h, 0AAAAh, 0FFFFh

A byte is 4 pixels in low-res CGA-land, so on screen we get (in base-4) 0000, 1111, 2222, 3333 - solid black, solid cyan, solid magenta, solid white.  In composite color modes however, each nybble represents a color, at half the resolution (this is not entirely accurate, but it's a useful visualization device).  That would translate to 00 (black), 55 (grey), AA (grey) and FF (white).  The two greys, by the way, have the exact same luminance.
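The two readings of the same byte can be spelled out in a few lines of Python. This is only a visualization aid under the same simplification as above (one nybble = one composite color); the helper names are invented.

```python
def mode4_pixels(byte):
    """Split a byte into four 2-bit pixels (mode 4), leftmost first."""
    return [(byte >> shift) & 3 for shift in (6, 4, 2, 0)]


def nybbles(byte):
    """Split a byte into its two nybbles, leftmost first. On composite,
    each nybble can be thought of as one artifact color (a simplification)."""
    return [byte >> 4, byte & 0x0F]


for value in (0x00, 0x55, 0xAA, 0xFF):
    print(f"{value:02X}h  mode 4: {mode4_pixels(value)}  nybbles: {nybbles(value)}")
```

For 55h this gives four pixels of color 1 in mode 4, and two nybbles of value 5 in the composite view.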

To get a bigger range of colors that don't look like crap, we add two new tables in the data patch space:

31548  0055AAFF[...]   db  0,55h,0AAh,0FFh,44h,22h,0CCh,77h,0BBh,99h,66h,0DDh
31554  00005555[...]   dw  0,5555h,0AAAAh,0FFFFh,4444h,2222h,0CCCCh,7777h,

I only used 12 out of the new 16 colors there.  There's enough room for the whole bunch, but I didn't need them all; and since I initially tried (and failed) to cram all the new *code* along with the data, space was at a premium.  They're not in order either: the first 4 values duplicate the old tables, so whenever the color stays the same the index does too.
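A quick sanity check on the sizes: the sketch below rebuilds the two tables and confirms they add up to the 36 bytes mentioned earlier. The byte values are taken from the listing; the word table is assumed to be each byte doubled, which matches the eight word entries visible above but is an assumption for the remaining four.

```python
import struct

# Byte table as listed at 31548h.
NEW_COLOR_BYTE = bytes([0x00, 0x55, 0xAA, 0xFF, 0x44, 0x22,
                        0xCC, 0x77, 0xBB, 0x99, 0x66, 0xDD])

# Assumption: every word entry is its byte entry doubled (55h -> 5555h),
# as the visible part of the word table suggests.
NEW_COLOR_WORD = [b * 0x0101 for b in NEW_COLOR_BYTE]

blob = NEW_COLOR_BYTE + struct.pack("<12H", *NEW_COLOR_WORD)
print(len(blob))                            # 36
print(hex(0x31548 + len(NEW_COLOR_BYTE)))   # 0x31554
```

The second line shows why the word table starts at 31554h: it sits directly after the 12 bytes of the byte table.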

Of course, the functions that use these tables need to be told about the new locations:

; VW_Hlin (draw horizontal line) +0Eh:
1B22C  8A87284D        mov   al, newNTSCcolorByte[bx]
; VW_Plot (plot pixel) +2Bh - may be unused, but whatever:
1BC83  8A8F284D        mov   cl, newNTSCcolorByte[bx]
; VW_Vlin (draw vertical line) +2Bh:
1BCC2  8A9F284D        mov   bl, newNTSCcolorByte[bx]
; VWL_XORBuffer (xor buffer to virtual screen) +09h:
1C3B4  8B87344D        mov   ax, newNTSCcolorWord[bx]

Recoloring various elements

This is where the boring part comes in: hunting down every piece of code that calls the above-mentioned graphics routines, figuring out what it does, and changing the color argument accordingly.  Each value is an index to one of our new color tables.  I'll only list the ones that I actually modified - about half of them were left untouched (mostly where black or white were appropriate):

04B89  BF0500          mov   di, 5          ; death prompt hilite 1
04B8E  BF0600          mov   di, 6          ; death prompt hilite 2
0610F  B80700          mov   ax, 7          ; in-game status panel BG
188BE  B80800          mov   ax, 8          ; menu - bottom border
18911  B80800          mov   ax, 8          ; menu - top border
18A71  B80800          mov   ax, 8          ; menu popup - top border
18A8C  B80800          mov   ax, 8          ; menu popup - bottom border
18AAA  B80800          mov   ax, 8          ; menu popup - left border
18AC5  B80800          mov   ax, 8          ; menu popup - right border
18C9F  B80800          mov   ax, 8          ; menu popup - separator
1930C  B008            mov   al, 8          ; key config - focus on
19310  B009            mov   al, 9          ; key config - focus off
19A36  B008            mov   al, 8          ; save/load - focus on
19A3A  B009            mov   al, 9          ; save/load - focus off
1A4A8  B80800          mov   ax, 8          ; paddlewar - top edge
1A4C0  B80800          mov   ax, 8          ; paddlewar - bottom edge

Nope, we're not done yet... there's also this bunch of text displays, where the "fontcolor" variable gets the same treatment:

03F17  C606975E01      mov   fontcolor, 1   ; high scores - active input only
07A84  C606975E01      mov   fontcolor, 1   ; text screens - page counter
18800  C606975E08      mov   fontcolor, 8   ; menu option - focus on
18807  C606975E09      mov   fontcolor, 9   ; menu option - focus off
1883A  C606975E09      mov   fontcolor, 9   ; menu legend
18C65  C606975E08      mov   fontcolor, 8   ; menu popup text
18CC1  C606975E09      mov   fontcolor, 9   ; menu popup legend
190BE  C606975E09      mov   fontcolor, 9   ; controls menu - config option
19103  C606975E08      mov   fontcolor, 8   ; key config - current (border)
19431  C606975E08      mov   fontcolor, 8   ; joystick calibration - legend
1995C  C606975E08      mov   fontcolor, 8   ; gravis gamepad text
19D6B  C606975E08      mov   fontcolor, 8   ; save game - active input
19F6C  C606975E08      mov   fontcolor, 8   ; paddlewar - scores
1AB3C  C606975E01      mov   fontcolor, 1   ; default message color
1AB7C  C606975E08      mov   fontcolor, 8   ; 'quitting' message box

I thought it'd be a nice touch if the in-game status panel (the big one you bring up with the Enter key) looked more like its EGA counterpart, which meant changing some field backgrounds from black to white.  Problem was, the code sets AX to 0 (black) with "xor ax,ax" which is a two-byte instruction.  White (3) -- or any other value -- would require "mov ax,color" which is a three-byter.

Fortunately, the next argument pushed is always a byte value (and AH remains zero), so we can shave off a byte by using "mov al" instead of "mov ax", and rewrite these bits thusly:

; status panel - 'Location' background
06149  B80300          mov   ax, 3          ; 3b ;   was: xor  ax, ax   (2b)
0614C  50              push  ax             ; 1b ;        push ax       (1b)
0614D  B014            mov   al, 14h        ; 2b ;        mov  ax, 14h  (3b)

; status panel - 'Level' background
062E4  B80300          mov   ax, 3          ; 3b ;   was: xor  ax, ax   (2b)
062E7  50              push  ax             ; 1b ;        push ax       (1b)
062E8  B00A            mov   al, 0Ah        ; 2b ;        mov  ax, 0Ah  (3b)

; status panel - inventory background
0649D  B80300          mov   ax, 3          ; 3b ;   was: xor  ax, ax   (2b)
064A0  50              push  ax             ; 1b ;        push ax       (1b)
064A1  B00A            mov   al, 0Ah        ; 2b ;        mov  ax, 0Ah  (3b)
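The byte accounting can be verified by assembling both versions by hand. This sketch uses the standard 8086 encodings (B8 iw for "mov ax", B0 ib for "mov al", 50 for "push ax"); the 33 C0 encoding for the original "xor ax,ax" is assumed from how the same instruction appears elsewhere in the listings.

```python
def mov_ax_imm16(value):
    return bytes([0xB8]) + value.to_bytes(2, "little")  # B8 iw


def mov_al_imm8(value):
    return bytes([0xB0, value])                          # B0 ib


XOR_AX_AX = bytes([0x33, 0xC0])  # assumed encoding of the original instruction
PUSH_AX = bytes([0x50])

original = XOR_AX_AX + PUSH_AX + mov_ax_imm16(0x14)
patched = mov_ax_imm16(3) + PUSH_AX + mov_al_imm8(0x14)

print(original.hex(), len(original))  # 33c050b81400 6
print(patched.hex(), len(patched))    # b8030050b014 6
```

Both sequences are 6 bytes, so the patch fits in place without shifting any of the code that follows.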

I still had a minor but annoying issue: the font color never gets reset during the demo loop, and once the high scores come up, it just blissfully XORs the text using whatever color the previous change had set.  This didn't just make the colors wrong; on composite, certain color transitions create hideous edge artifacts that make text completely unreadable.  To wit:

To get this one sorted, I had to find more patch space by moving things around (previously I was shoving my new code into the data space as well).  This allowed me to inject some code where I needed to explicitly set the font color.  As a bonus, I could now do the same with the status panel and message boxes -- as-is they both use the same color for text, but since the background colors are different, the color value for XOR has to differ as well.

; playDemoStuff +14h
03D6A  9AF013C510      call  setDemoFontFG  ; 5b ;  was: mov  ax, si    (2b)
                                                 ;       add  ax, 128Ah (3b)
                                            ; 10C5:13F0 - needs relocation!
12040  C606975E01      mov   fontcolor, 1
12045  8BC6            mov   ax, si
12047  058A12          add   ax, 128Ah
1204A  CB              retf

; drawStatusPanel +9Bh
060D3  9AE513C510      call  setPanelFG     ; 5b ;  was: mov  ax, di    (2b)
                                                 ;       add  ax, 8     (3b)
                                            ; 10C5:13E5 - needs relocation!
12035  C606975E07      mov   fontcolor, 7
1203A  8BC7            mov   ax, di
1203C  050800          add   ax, 8
1203F  CB              retf

; drawMsgBoxBG +24h
18018  9AD913C510      call  setMsgBoxFG    ; 5b ;  was: mov ax,[36D1C] (3b)
1801D  90              nop                  ; 1b ;       mov [36CDF],ax (3b)
                                            ; 10C5:13D9 - needs relocation!
12029  C606975E03      mov   fontcolor, 3
1202E  A1FCA4          mov   ax, [36D1C]
12031  A3BFA4          mov   [36CDF], ax
12034  CB              retf

As I recall, there were *still* occasional issues with the high score colors after I did that.  I forget exactly what the major malfunction was, but at this point I decided to just bring a gun to the knife fight -- set fontcolor to 1 on every fade-out, so whatever text gets printed next will have that color, unless another routine explicitly sets it to something else.  Dirty, but whatever works.

Story/help text screens

For those long text passages, the Keen Galaxy games use their own resource format with rudimentary markup features -- among them are hexadecimal color codes (^C[0-F]).  These are mapped to our familiar color array (the word-sized one) using yet another lookup table, this time at 2CE26h.  We have 16 entries here, although the CK4 texts only ever use three of them (B, E and F).  The CGA version maps all three to the same color, but that's trivial enough to rectify:

2CE3C  0B00            dw  0Bh              ; txtColorCodeLUT[11]
2CE42  0300            dw   3               ; txtColorCodeLUT[14]
2CE44  0A00            dw  0Ah              ; txtColorCodeLUT[15]
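The three patched offsets follow directly from the table base and the entry size. A small sketch, using the 2CE26h base and word-sized entries described above (the function and dictionary names are illustrative only):

```python
LUT_BASE = 0x2CE26   # txtColorCodeLUT, 16 word-sized entries
ENTRY_SIZE = 2


def lut_entry_offset(code):
    """Load-image offset of the LUT entry for color code ^C<code> (0..15)."""
    if not 0 <= code <= 0xF:
        raise ValueError("color codes are a single hex digit")
    return LUT_BASE + ENTRY_SIZE * code


# New table indices from the listing above, keyed by color code.
PATCHED = {0xB: 0x0B, 0xE: 0x03, 0xF: 0x0A}

for code, index in PATCHED.items():
    print(f"^C{code:X} -> {lut_entry_offset(code):05X}h = {index:04X}h")
```

Codes B, E and F come out at 2CE3Ch, 2CE42h and 2CE44h, matching the patched locations.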

The routine that displays all this XORs the font color with the background before drawing it, but this isn't strictly necessary; you can bake that into the values chosen above, and I ended up doing that and NOP-ing this instruction out... probably because that let me trim an entry or two off my color tables, or something.  Kind of heavy-handed, but there you have it.

07627  90              nop                  ; helpTxtSub +D3h
07628  90              nop                  ; was:  xor fontcolor,3  (5b)
07629  90              nop
0762A  90              nop
0762B  90              nop

And at long last, thus ends our wrangling with the .EXE.
Well, there are a few other modifications like the initialization text and filename changes, but not much to say about those. Enough hacking for a couple weeks (and, I suspect, enough reading for a year).

Wait, there's still one small trivial matter left...

Reworking the graphics

As you might expect, this is what took most of the effort.  I won't launch into a technical tutorial about drawing for this mode, but there are a number of reasons why you can't just remap the palette of the EGA graphics and call it a day.

  • For a start, you don't even have a constant horizontal resolution.  Given that artifact colors are generated by manipulating pixel patterns, your effective resolution depends on the specific colors you're transitioning between at any particular point.
  • The color of a pixel depends on its horizontal position (modulo 4 in 640-pixel terms, since each of the 160 NTSC color cycles per active CGA scanline spans four hi-res pixels).
  • As shown, some colors will artifact badly when placed next to each other, depending (again) on the horizontal offset of the transition... in other words, you can't just have whatever color you want wherever you want, a luxury most pixel artists probably take for granted.  (Oh, you'd like a light blue pixel right next to this brown one, wouldn't you? No problem sir; enjoy your bright radioactive green. *Trollface goes here*)
  • The palette itself has a rather different relationship between the colors, compared to those RGBI colors on EGA.  It's lacking in nice pure reds, but gives you three perfectly good green hues; you get two identical-looking 50% greys, instead of that convenient 0-8-7-15 ramp; and so on and so forth.
  • The 'checkerboard' dithering style (which Keen uses extensively) would transform into solid horizontal lines of alternating colors... unless you make it twice as coarse.
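The position dependence can be illustrated with a toy calculation. Assuming four 640-wide pixels per color cycle (640 pixels over 160 cycles), the same repeating 4-pixel pattern lines up differently with the cycle depending on where it starts, and a differently aligned pattern means a different artifact color. The sketch below only computes the aligned bit pattern; it makes no attempt to say which color results.

```python
def aligned_nybble(pattern, x):
    """`pattern` is a 4-bit pixel pattern (leftmost pixel = bit 3) repeated
    along a mode 6 scanline starting at column `x`. Return the 4 bits as
    they fall within one color cycle, i.e. starting at a column that is a
    multiple of 4."""
    shift = x % 4
    # Starting the pattern `shift` pixels later rotates it right within the cycle.
    return ((pattern >> shift) | (pattern << (4 - shift))) & 0xF


for x in range(4):
    print(x, format(aligned_nybble(0b0011, x), "04b"))
# 0 0011
# 1 1001
# 2 1100
# 3 0110
```

So moving a patterned object one pixel sideways is enough to change its color, which is one reason the artwork cannot simply be shifted or remapped.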

I could probably go on, but what this means is that practically all the graphics had to be reworked and redrawn.  Most early games that supported composite color on CGA featured coarse, blocky, low-res graphics precisely for these reasons.  One of my goals here was to avoid that, and preserve as much detail as I could; I hope I've succeeded, at least somewhat.

Importing those reworked graphics into the game may be a little easier to explain.  As mentioned in the readme, I used ModId for the job, which handles "regular" 4-color CGA graphics just fine; the reworked assets can be easily converted into a format that ModId can deal with.  Let's have a look at one:

On the left is what you see on composite; on the right is the RGBI pixel data under the hood - black and white patterns at 640x200.  I typically work in Photoshop using RGB mode, with a palette that matches the 16 target colors, and a series of fill layers that go on top of the image and selectively blend each pattern over its matching color.  This is similar to a trick mentioned in a previous post, with the bonus that I can always save a B&W RGBI picture like the one on the right, and feed it to reenigne's cga2ntsc for quick proofing.

ModId needs 320x200 4-color images, though, which isn't what we're working with.  Luckily, as far as the CGA is concerned, the memory layout is the same - except that each nybble represents two four-color dots instead of four two-color dots.  We simply have to replace each pattern with the corresponding 4-color one; once the game displays it *in mode 6*, we get just the result we need.

For instance, nybble pattern #9 (1001 in binary = white-black-black-white in mode 6) equals 21 in base 4, that is, magenta-cyan in mode 4.  By replacing the aforementioned group of fill layers to generate these patterns instead, we get a fake mode-4 image that can be saved as an indexed .BMP and imported with ModId.
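The same pairing can be done in code rather than with fill layers. This is a minimal sketch of the idea, not the Photoshop workflow described above: it takes one row of 1-bit pixels and pairs adjacent bits into 2-bit mode 4 color indices. The function name and the palette list are illustrative.

```python
def mode6_row_to_mode4(bits):
    """Convert a row of 1-bit pixels (0/1, even length) into 2-bit
    mode 4 color indices by pairing adjacent bits."""
    if len(bits) % 2:
        raise ValueError("row length must be even")
    return [(bits[i] << 1) | bits[i + 1] for i in range(0, len(bits), 2)]


MODE4_NAMES = ["black", "cyan", "magenta", "white"]  # the palette Keen uses

row = [1, 0, 0, 1]  # nybble pattern 9
indices = mode6_row_to_mode4(row)
print(indices, [MODE4_NAMES[i] for i in indices])  # [2, 1] ['magenta', 'cyan']
```

A full 640-pixel row turns into 320 indices this way, which is the width ModId expects.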

That's it.  Still here?  Go convert your own Keen mod. ;-)


monkeyfungus said...

Challenge accepted.

I really think there should be a tandy mode version of Commander Keen. In fact, I was able to modify your patches to tandy mode 8, remap the colors, rework the fonts a bit and end up with a playable, albeit chunky 160x200 Keen 4 on my Tandy 1000. Tandy mode 8 and cga mode 4 have the same memory map, so the conversion is pretty straightforward.

Kind of ugly though.

What would be really cool, is getting this to work in mode 9 (320x200). Digging through your patches, however, these don't seem to sync-up with the binary locations. Disassembling KEEN4C.EXE (using objdump) or running xxd, gives locations that don't agree with the locations in the patches. I'm assuming these are in-memory addresses, or something??

So I guess the question is, how did you end up with these locations?

VileR said...

The idea of a 160x200 Tandy patch did occur to me, but the predictable chunkiness made me reject it... nice to know you've got it working though - wouldn't mind having a look myself :)

Yes, as stated, "all offsets are relative to the beginning of the load image (=file offset minus 2C00h)", so you can compute actual file offsets by adding 2C00h to the ones quoted here (addresses 0-2BFFh in the .EXE contain the header and relocation table). That's how I had them in my notes, since IDA by default also calculates offsets relative to the load image rather than the beginning of the file.

A mode 9 patch would be awesome - certainly more than *I* was willing to try, since it'd still be programmed CGA-like, but basically use the EGA data (or some version of it shuffled around for speed). If that happens, I'd love to see the results!

monkeyfungus said...

Ahh and you mention that offset right in the article. Reading comprehension fail. That points me in the right direction.

Captured from Dosbox but looks reasonably similar on my tandy 1000 sl with IBM 5153. Dark grey is *much* darker on real hardware, so certain instances were replaced with light grey. Mostly I spent time reworking the smaller text (menus and such, starting with the original CGA RGB fonts); they were a completely unreadable mess otherwise. I may need to remap some colors here and there, but I have a simple python script that makes this quick and easy.

As for mode 9, I could see going about this a couple ways.

Modifying the CGA keen to display mode 9 displays a weird sort of interlaced screen, which makes perfect sense since sets of lines are in the wrong memory location, with many ending up blank. I suspect that hacking modid to correctly pack the graphics into the correct format *may* be a big part of the trick, but I'm just starting to look into it.

I considered for a moment attempting to modify the EGA version to do tandy mode 9. I was able to disable the hardware checking and such, but of course this didn't work. EGA (from my cursory knowledge) is different enough at the BIOS level that there are certainly things being done here that have no way to map.

Licca said...

I start to wonder about the so-called "CKSRCMOD" code - is it able to drive CK4? If so, perhaps that code could be used to create a Tandy 320x200x16 version.