Sometimes Smalltalk is truly awesome #1

Occasionally I find myself able to do something in Smalltalk that I think is truly fabulous, not because I’m a fabulous Smalltalk programmer, but because a live reflective pure object system provides such fabulous ease. Today was one such example. I’m working on a new code generator for the Cog VM and it (cough) has a few bugs. In debugging it I needed to see what the correct execution of the following method is for a particular input:

    Bitmap methods for filing
    decompress: bm fromByteArray: ba at: index
        "Decompress the body of a byteArray encoded by compressToByteArray (qv)...
        The format is simply a sequence of run-coded pairs, {N D}*.
            N is a run-length * 4 + data code.
            D, the data, depends on the data code...
                0    skip N words, D is absent
                    (could be used to skip from one raster line to the next)
                1    N words with all 4 bytes = D (1 byte)
                2    N words all = D (4 bytes)
                3    N words follow in D (4N bytes)
            S and N are encoded as follows (see decodeIntFrom:)...
                0-223    0-223
                224-254    (0-30)*256 + next byte (0-7935)
                255        next 4 bytes"
        "NOTE:  If fed with garbage, this routine could read past the end of ba, but it should fail before writing past the end of bm."
        | i code n anInt data end k pastEnd |
        <primitive: 'primitiveDecompressFromByteArray' module: 'MiscPrimitivePlugin'>
        <var: #bm declareC: 'int *bm'>
        <var: #ba declareC: 'unsigned char *ba'>
        i := index.  "byteArray read index"
        end := ba size.
        k := 1.  "bitmap write index"
        pastEnd := bm size + 1.
        [i <= end] whileTrue:
            ["Decode next run start N"
            anInt := ba at: i.  i := i+1.
            anInt <= 223 ifFalse:
                [anInt <= 254
                    ifTrue: [anInt := (anInt-224)*256 + (ba at: i).  i := i+1]
                    ifFalse: [anInt := 0.
                            1 to: 4 do: [:j | anInt := (anInt bitShift: 8) + (ba at: i).  i := i+1]]].
            n := anInt >> 2.
            (k + n) > pastEnd ifTrue: [^ self primitiveFail].
            code := anInt bitAnd: 3.
            code = 0 ifTrue: ["skip"].
            code = 1 ifTrue: ["n consecutive words of 4 bytes = the following byte"
                            data := ba at: i.  i := i+1.
                            data := data bitOr: (data bitShift: 8).
                            data := data bitOr: (data bitShift: 16).
                            1 to: n do: [:j | bm at: k put: data.  k := k+1]].
            code = 2 ifTrue: ["n consecutive words = 4 following bytes"
                            data := 0.
                            1 to: 4 do: [:j | data := (data bitShift: 8) bitOr: (ba at: i).  i := i+1].
                            1 to: n do: [:j | bm at: k put: data.  k := k+1]].
            code = 3 ifTrue: ["n consecutive words from the data..."
                            1 to: n do:
                                [:m | data := 0.
                                1 to: 4 do: [:j | data := (data bitShift: 8) bitOr: (ba at: i).  i := i+1].
                                bm at: k put: data.  k := k+1]]]
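
The integer encoding and the run header split described in the method comment can be tried in a workspace. This is only a sketch under stated assumptions: `decodeInt` is a hypothetical workspace block written from the comment's table, not the image's own `decodeIntFrom:`, and the sample bytes are made up for illustration.

```smalltalk
"decodeInt is a hypothetical helper that follows the table in the method comment."
| decodeInt header n code data |
decodeInt := [:s | | int |
    int := s next.
    int <= 223
        ifTrue: [int]                                   "0-223 encode themselves"
        ifFalse: [int <= 254
            ifTrue: [(int - 224) * 256 + s next]        "two-byte form"
            ifFalse: [int := 0.                         "255: value is the next 4 bytes"
                4 timesRepeat: [int := (int bitShift: 8) + s next].
                int]]].

decodeInt value: (ReadStream on: #[16]).            "16"
decodeInt value: (ReadStream on: #[225 5]).         "(225 - 224) * 256 + 5 = 261"
decodeInt value: (ReadStream on: #[255 0 0 1 0]).   "256"

"A decoded header N packs the run length and the data code."
header := 16r27.
n := header >> 2.            "run length: 9"
code := header bitAnd: 3.    "data code: 3, so 9 literal words follow"

"Data code 1 replicates one byte into all four bytes of a word."
data := 16rAB.
data := data bitOr: (data bitShift: 8).     "16rABAB"
data := data bitOr: (data bitShift: 16).    "16rABABABAB"
```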

From my VM simulator I printed out the ByteArray that the jitted code from my new code generator was decompressing incorrectly (the new code was invoking primitiveFail in the method above). I wanted to trace through a correct execution of the method in the debugger to see what the values of the local variables should be, and so identify the location of the error as opposed to the location of its symptom. So I naïvely debug-evaluated the following:

    Bitmap decompressFromByteArray: #[ 16r10 16r27 16rC0 16r0 16r0 16r0 16rE0 16r0 16r0 16r0 16rF0 16r0 16r0 16r0 16rF8 16r0
	        16r0 16r0 16rFC 16r0 16r0 16r0 16rFE 16r0 16r0 16r0 16rFF 16r0 16r0 16r0 16rFF 16r80
	        16r0 16r0 16rFF 16r0 16r0 16r0 16rA 16rFE 16r0 16r0 16r0 16rB 16rCF 16r0 16r0 16r0
	        16rF 16r0 16r0 16r0 16rA 16r7 16r80 16r0 16r0 16r7 16r3 16r0 16r0 16r0]

and stepped into the activation of the following method:

    Bitmap class methods for instance creation
    decompressFromByteArray: byteArray
        | s bitmap size |
        s := ReadStream on: byteArray.
        size := self decodeIntFrom: s.
        bitmap := self new: size.
        bitmap decompress: bitmap fromByteArray: byteArray at: s position+1.
        ^ bitmap
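
As a cross-check on what a correct run should do with the byte array above, the size prefix and the run headers can be walked by hand without writing any bitmap words. This is a sketch, not part of the original debugging session: it relies on every header in this particular array being below 224 (one byte each), and the temporaries `runs` and `words` are names introduced here for illustration.

```smalltalk
| ba s size i runs words |
ba := #[ 16r10 16r27 16rC0 16r0 16r0 16r0 16rE0 16r0 16r0 16r0 16rF0 16r0 16r0 16r0 16rF8 16r0
         16r0 16r0 16rFC 16r0 16r0 16r0 16rFE 16r0 16r0 16r0 16rFF 16r0 16r0 16r0 16rFF 16r80
         16r0 16r0 16rFF 16r0 16r0 16r0 16rA 16rFE 16r0 16r0 16r0 16rB 16rCF 16r0 16r0 16r0
         16rF 16r0 16r0 16r0 16rA 16r7 16r80 16r0 16r0 16r7 16r3 16r0 16r0 16r0].
s := ReadStream on: ba.
size := Bitmap decodeIntFrom: s.    "16r10 is below 224, so the bitmap has 16 words"
i := s position + 1.                "first run header, as in decompressFromByteArray:"
runs := OrderedCollection new.
words := 0.
[i <= ba size] whileTrue:
    [| header n code |
    header := ba at: i.  i := i + 1.    "assumes a one-byte header (value below 224)"
    n := header >> 2.
    code := header bitAnd: 3.
    runs add: n -> code.
    words := words + n.
    "Skip this run's data bytes; code 0 carries no data."
    code = 1 ifTrue: [i := i + 1].
    code = 2 ifTrue: [i := i + 4].
    code = 3 ifTrue: [i := i + (4 * n)]].
{size. words. runs asArray}
```

By hand, the runs come out as 9->3, 2->2, 2->3, 2->2 and 1->3, which sum to the 16 words announced by the size prefix, so a correct execution should never reach primitiveFail for this input.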

So far so good. But when I tried to step into the Bitmap>>#decompress:fromByteArray:at: method the debugger of course evaluated the primitive that is in the VM, primitiveDecompressFromByteArray (which just happens to get generated from the above by the VMMaker, but that’s a different story). What the debugger didn’t do, quite correctly, is evaluate the non-primitive method, starting at i := index. "byteArray read index", the actual Smalltalk code whose execution I wanted to observe.

This is where things turn awesome. Since Smalltalk has first-class activation records I can actually create an activation of the above method poised to start execution at the first bytecode, hence after the evaluation of the primitive. Here’s how. In the debugger window I evaluated

    thisContext swapSender: (MethodContext
                                            sender: ThisContext
                                            receiver: bitmap
                                            method: (Bitmap>>#decompress:fromByteArray:at:)
                                            arguments: {bitmap. byteArray. s position + 1 }).
    self halt

What’s going on here?  First, the debugger allows me to access the local variables of the activation I’m debugging, from within a script evaluated in the debugger (neat!). If you look at the script in the debugger you see it translated into:

    DoItIn: ThisContext
        thisContext
            swapSender: (MethodContext
                    sender: ThisContext
                    receiver: (ThisContext namedTempAt: 3)
                    method: Bitmap >> #decompress:fromByteArray:at:
                    arguments: {ThisContext namedTempAt: 3. ThisContext namedTempAt: 1. (ThisContext namedTempAt: 2) position + 1}).
        ^ self halt

But the clever bit is what the script does. ThisContext with capitals is the debugger’s name for the activation of decompressFromByteArray: I had stepped into; it’s simply the argument to the DoItIn: method created to run my script. thisContext without capitals is the name for the current activation (any activation) and so refers to the activation of DoItIn:.

The (MethodContext sender:…) expression creates an activation record on the method (Bitmap>>#decompress:fromByteArray:at:), and initializes the activation at the start of the bytecodes for the method, effectively after the primitive it contains. The thisContext swapSender: … expression substitutes this activation as the one to return to from thisContext. That is, my script "thisContext swapSender: (MethodContext…" will now return into the activation of Bitmap>>#decompress:fromByteArray:at: I just created, instead of back to the compiler that compiled and evaluated the script. The "self halt" brings up the debugger. So now I simply step out of the halt, into my script, stepping until it returns into the method I wanted to evaluate. Awesome!
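
The same idea can be tried on a smaller scale. The following is a minimal sketch, assuming an image where MethodContext is the context class as in the script above (some later images name it Context); the choice of SmallInteger>>#+ as the primitive method and the temporary `ctx` are illustrative only.

```smalltalk
"Create, but do not run, an activation of a method that has a primitive."
| ctx |
ctx := MethodContext
    sender: nil
    receiver: 3
    method: SmallInteger >> #+
    arguments: {4}.
"The fresh activation should be poised at the method's first bytecode,
 i.e. at the Smalltalk fallback code rather than at the primitive."
ctx pc = ctx method initialPC
```

Inspecting `ctx` rather than resuming it is a safe way to look at the receiver, arguments and pc of such a hand-made activation before splicing it into a sender chain with swapSender:.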
