June 2008

BlueBook CompiledMethods - Having Our Cake and Eating It Too

Apologies for this appearing out of order. I should be finishing the Closures posts but I can’t for the moment. I crave your indulgence.

As I’ve said in the first post, both TSTTCPW and the Principles imply I’m keeping the existing CompiledMethod format. In Squeak, as in the Blue Book, CompiledMethod is an anomalous hybrid, its first part being object references used mainly for literals, and its second part being raw bytes used mainly for bytecodes. The first word of a CompiledMethod is a SmallInteger that encodes, amongst other things, the size of the first part, the literal frame. This format is a good choice for an interpreter since both literals and bytecodes can be fetched from the same object, although it causes minor complications in the garbage collector since the GC must not misinterpret the bytecodes as pointers.

In VisualWorks, which is a JIT on all platforms, CompiledMethod is a normal object and the bytecodes are held in a ByteArray referred to by a “bytes” instance variable. To reduce the footprint overhead of adding a separate object for bytecodes, short methods of 6 bytes of bytecode or less get their bytecodes encoded in a pair of SmallIntegers, one in the bytes inst var and one in an additional literal. Because VisualWorks CompiledMethods are ordinary objects, adding instance variables to CompiledMethod and its subclasses is directly supported by the system.

These are fine design decisions for a JIT but hopeless for an interpreter. Both interpreting the 6 bytes of bytecode in two SmallIntegers and skipping over the named instance variables on each literal access would slow an interpreter down significantly.

So the hybrid CompiledMethod format stays, but the pressure to add instance variables to CompiledMethod is high. In fact Squeak has had a per-CompiledMethod holder for extra instance variables for a while. It’s called MethodProperties and while it’s useful it is IMO a tragic hack. I’ve always taken tragedy to mean drama driven by a character flaw - see Hamartia or Tragic Flaw. Hamlet and Othello are tragedies because events are driven by both characters’ insecurities. Titus Andronicus, on the other hand, is merely a bloodbath.

I’m willing to bet that the reason for CompiledMethod having a hybrid format was to save space in the original 16-bit Xerox implementations. This hybrid format is exactly what complicates adding named instance variables and subclasses to CompiledMethod. So it is tragic to see every CompiledMethod in the system get another object (at least another 20 bytes per CompiledMethod, 4 bytes for the literal slot, 16 bytes for the MethodProperties instance) that in most cases merely adds a selector instance variable (4 bytes). So how do we square the circle?

On starting at Qwaq I immediately wanted to get to work on restructuring the compiler to allow for easy migration of bytecode sets. This would give me free rein when I wanted to redefine the bytecode set later on and permit implementing closures in a cleaned-up compiler. But I soon found out there were two compilers, one for base Smalltalk and one for Andreas Raab’s Tweak. The Tweak compiler differs in allowing a class to define certain “instance variables” to be implemented as dynamically added properties. This is a similar idea to accessing instance variables through messages, something Gilad Bracha is quite rightly pushing in Newspeak.

In the Tweak compiler a class communicates what properties it uses by supplying a set of Field nodes, one for each property. The compiler compiles Field nodes as message sends. In the base compiler instance variables are always accessed directly using bytecodes containing the offset of a given instance variable. Other than that the differences between the compilers are minor. So the first order of business was to merge the Tweak compiler into the base compiler and restructure one compiler instead of two. In doing so I was given a working implementation of compiling accessors within the Tweak compiler. These field definitions can easily be adapted to implementing instance variables in CompiledMethod and subclasses.

A CompiledMethod’s literals occupy the first N slots following the header word. The last literal is used by the super send bytecodes to fetch the class in which the current method is defined, whose superclass is the class to begin a super send lookup. The last literal is located by the VM extracting the literal count from the header word. All other literals used by the bytecode are accessed by encoding their literal index in each bytecode as appropriate, the slot immediately following the header word being literal 0. All bytecodes are position-independent, jumps being relative. So one can add literals immediately before the last literal without invalidating the method. The system needs to update the literal count in the header to reflect the extra literal slot but otherwise a method is unaffected. So the scheme is to store instance variables of CompiledMethod and subclasses at the end of the literal frame and access them by messages.
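As a rough workspace illustration of why inserting a slot just before the last literal is harmless, here is a sketch that models the literal frame as a plain Array. The symbols in the array and the name newSlot are made up for the example; a real CompiledMethod would be manipulated through objectAt: and its header, not through an Array.

```smalltalk
| literals extended |
"Hypothetical literal frame; the last slot stands for the one super sends use."
literals := Array with: #foo with: #bar with: #methodClassAssociation.

"Insert one extra slot immediately before the last literal."
extended := (literals copyFrom: 1 to: literals size - 1)
	, (Array with: #newSlot)
	, (Array with: literals last).

(extended at: 1) == (literals at: 1).    "true: literal 0 keeps its index"
(extended at: 2) == (literals at: 2).    "true: literal 1 keeps its index"
extended last == literals last.          "true: still reachable as the last literal"
extended size = (literals size + 1).     "true: the header count must grow by one"
```

Literals addressed by index from the bytecode are untouched, and the last literal is still found from the count, which is the property the scheme relies on.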

Whitewash alert. There are some details.

Either CompiledMethod class>>newMethod:header: or its callers need to be redefined to add in the relevant number of literals for the named instance variables. Methods such as CompiledMethod>>numLiterals need to be redefined to subtract the number of named instance variables from the literal count. For example

CompiledMethod methods for accessing
numNonHeaderPointerFields
        "Answer the number of pointer objects in the receiver."
        ^(self header bitShift: -9) bitAnd: 16rFF

numLiterals
        "Answer the number of literals used by the receiver."
        ^self numNonHeaderPointerFields - self class instSize
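To make the arithmetic concrete, here is a small workspace sketch. The shift of 9 and the mask 16rFF come from numNonHeaderPointerFields above; the header value, the low-bit filler and the instSize of 2 are invented for illustration only.

```smalltalk
| header pointerFields instSize grown |
"Hypothetical header: 5 pointer fields in bits 9..16, arbitrary low bits."
header := (5 bitShift: 9) bitOr: 3.

pointerFields := (header bitShift: -9) bitAnd: 16rFF.    "5"

"Assume the class declares two named instance variables."
instSize := 2.
pointerFields - instSize.                                 "3 literals remain visible"

"Adding one slot means bumping the count field in the header
 (assuming the 8-bit field does not overflow)."
grown := header + (1 bitShift: 9).
(grown bitShift: -9) bitAnd: 16rFF.                       "6"
```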

The compiler needs to keep these accessors hidden and prevent their accidental redefinition. Something we did in Newspeak was to keep accessors out of the class’s organization. With a little polish one can easily ensure that the system does not file-out unorganized methods. Additionally the accessors should be name-mangled so their message names are not legal Smalltalk message selectors or variable names so they can’t be accidentally redefined, for example prepend an underscore. So each instance variable needs a pair of messages that look like the following. Let’s say that methodClassAssociation is the first instance variable, selector is the second, and pragmas is the third (and in a subclass of CompiledMethod). Then the accessors would be equivalent to

_methodClassAssociation
        ^self objectAt: self numNonHeaderPointerFields + 1

_methodClassAssociation: anObject
        ^self objectAt: self numNonHeaderPointerFields + 1 put: anObject

_selector
        ^self objectAt: self numNonHeaderPointerFields

_selector: anObject
        ^self objectAt: self numNonHeaderPointerFields put: anObject

_pragmas
        ^self objectAt: self numNonHeaderPointerFields - 1

_pragmas: anObject
        ^self objectAt: self numNonHeaderPointerFields - 1 put: anObject

etc…
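The pattern in these accessors can be summarised as: the k-th named instance variable lives at objectAt: index numNonHeaderPointerFields + 2 - k. A quick workspace check of that reading, with a made-up pointer field count of 5 and a hypothetical slotFor block:

```smalltalk
| pointerFields slotFor |
"Assumed value, standing in for 'self numNonHeaderPointerFields'."
pointerFields := 5.

"Hypothetical helper: objectAt: index of the k-th named instance variable."
slotFor := [:k | pointerFields + 2 - k].

slotFor value: 1.    "6, i.e. numNonHeaderPointerFields + 1 (methodClassAssociation)"
slotFor value: 2.    "5, i.e. numNonHeaderPointerFields     (selector)"
slotFor value: 3.    "4, i.e. numNonHeaderPointerFields - 1 (pragmas)"
```

The instance variables therefore grow downwards from the end of the literal frame, which is why subclasses can add more without disturbing inherited slots.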

The setters must answer the value assigned rather than self to simplify compiling
        instVar1 := instVar2 := expr

The accessors need to be created as a side-effect of the ClassBuilder redefining CompiledMethod or a subclass. setInstVarNames: seems to be the right hook here.

The ClassBuilder needs to allow the creation of subclasses of CompiledMethod, giving them an instSpec of 12. The instSpec of 12 is the magic number that the garbage collector uses to identify objects that are part pointers, part bytes. See Behavior>>instSpec and Interpreter>>formatOf:. The ClassBuilder also needs special instance mutation code for CompiledMethod and subclasses that would use objectAt: and objectAt:put: to copy state between mutated instances and keep the header word accurately reflecting the number of literals. IMO, the bulk of this code belongs on the class side of CompiledMethod.
Once we have this done we can say bye bye to MethodProperties, gaining about a megabyte in a 20 megabyte image and making extensions such as pragmas cheap and localised.

So who wants the brush? Don’t everyone step forward at once…. Anyone? Why is everyone backing away mumbling to themselves? I have always depended on the comfort of strangers…

P.S. Anyone who does want to work on this can either contact me in email or wait a few days until I can start publishing code to a Croquet repository near you.

2008/06/23
P.P.S. As Joshua Gargus, one of my colleagues at Qwaq, astutely pointed out today one also needs to deal with the changes to the pc in existing contexts when one shape-changes compiled methods. So one would need to enumerate all context objects to fix up their pcs and this gets tricky since those contexts could be in-use by the ClassBuilder. So some care is required to pull this off. Thanks Josh!

Closures Part I

BlueBook BlockContexts are nearly closures.  They close-over their enclosing environment, providing access to an enclosing method’s arguments and temporary variables.  But they lack their own local environment, hijacking their method activation’s (or home context’s) temps to store their own arguments and temporaries (let’s call these locals).  Worse still, they’re not reentrant.  BTW, “the home context” is the terminology for the activation record of a method.  All blocks get created within some method activation (or within a block activation nested within a method activation).  The method activation in which a block is created is called the block’s home context.

Try the following in Squeak and in VisualWorks:

        | factorial |
        factorial := [:n| n = 1 ifTrue: [1] ifFalse: [(factorial value: n - 1) * n]].
        (1 to: 10) collect: factorial


In VisualWorks you get #(1 2 6 24 120 720 5040 40320 362880 3628800).  In Squeak you get a Notifier proclaiming “Error: Attempt to evaluate a block that is already being evaluated.”  The Squeak code fails because in a Blue Book VM an unevaluated block is an activation record, a BlockContext, that has a sender field used to refer to the block’s caller.  You can inspect or explore the block to see it.  Here’s an explorer on it:

You can see the block is referred to from the first temporary of the home context “| factorial |”.  You can see its sender field which is used to refer to the calling context, the context that sends value: to the block to evaluate it. The block can’t be reentered without overwriting the sender field, preventing return to the first caller.

OK, let’s deal with the reentrancy problem.  Simply evaluating a copy of the block instead of the original produces a fresh activation record for each activation of the block.  This is what Allen Wirfs-Brock’s team did in the Tektronix implementation on the 4404 back in the ’80s.  They modified the BlockContext>>value[:value:...] primitives to activate a copy of the receiver.  Let’s try modifying the example with explicit copies:

        | factorial |
        factorial := [:n| n = 1 ifTrue: [1] ifFalse: [(factorial copy value: n - 1) * n]].
        (1 to: 10) collect: factorial copy


which answers… #(1 1 1 1 1 1 1 1 1 1) ?!?!

That’s because the above is actually compiled to code equivalent to

        | factorial n |
        factorial := [:factorialsArgument|
                n := factorialsArgument.
                n = 1 ifTrue: [1] ifFalse: [(factorial copy value: n - 1) * n]].
        (1 to: 10) collect: factorial copy


because with BlueBook BlockContexts all temporaries are stored in the temporary frame of the home context.  In the explorer above the “2: nil” field in the home context is the slot used to store the block’s argument “:n|”.

By the time we’ve recursed to the base case with n = 1 we’ve overwritten n with 1 in all the “* n”s, and so end up evaluating 1 * 1 * 1 ….

This way round works as intended:

        | factorial n |
        factorial := [:factorialsArgument|
                n := factorialsArgument.
                n = 1 ifTrue: [1] ifFalse: [n * (factorial copy value: n - 1)]].
        (1 to: 10) collect: factorial copy


answers #(1 2 6 24 120 720 5040 40320 362880 3628800), because in “n * (factorial copy value: n - 1)” n is pushed on the stack before the recursion due to Smalltalk’s strict left-to-right evaluation rule.

Whether you think this is a big deal or not, it’s not ANSI-compliant, so we’ve got to fix it, right?  I’m all for standards, and for me not having closures is an issue.  It gets in the way of self-expression.  It is useful to have recursive blocks.  And with Smalltalk’s system building machinery it is easy to fix.  But it’s not a huge issue; Squeak has done without closures for thirty years.  However, there are also pragmatic reasons for fixing this.  Closures are an enabler for efficient implementation of context-to-stack mapping, something to be explained in an upcoming post, which will speed up the interpreter as well as the fast VM.  In fact, the form of this optimization that I plan to implement is so effective that when I reimplemented VisualWorks’ closures using it, execution of block-intensive code such as exception handling sped up by a factor of two.  These specifics are why I chose to do my own closure implementation instead of reusing Anthony Hannan’s one.

Implementing Closures

Implementing closures is straightforward.  We need three things.  One is to defer creating the activation of the block until we send value:.  So creating the block results in creating something, a BlockClosure, from which we can create an activation later on.  Another is the ability for a block activation to hold local arguments and temporaries.  We already have something that can do this, it’s called MethodContext :).  Finally we need a way of accessing locals in enclosing block or method activations.  This has already been done for Squeak.  Anthony Hannan did a closure implementation which exists in a rump form in 3.8.  But I didn’t use it because for efficient context-to-stack mapping I want one key ingredient, which is to implement access to locals in enclosing activations without access through those activations.

To explain the scheme, which is used in some Lisp compilers, let’s look at the following:

counterBlock
        | count |
        count := 0.
        ^[count := count + 1]


If count is stored on the stack of the method activation of counterBlock then there’s a problem if activations are mapped to stack frames for the duration of their execution.  Up until the block is returned count can live happily on some native stack frame created when counterBlock was sent.  But since the block [count := count + 1] outlives the execution of counterBlock we must preserve the value of count for subsequent evaluations of the block.  Since we have contexts the thing to do is to create a context object and associate it with the frame when creating a block that accesses the frame, and to write back the contents of the stack frame into its context when a frame that has a context is returned from.  The sad thing is that this return-time processing is very slow.
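For reference, this is the behaviour that has to be preserved, written as a workspace sketch. It assumes an image that already has real closures and block-local temporaries; the names makeCounter, counterA and counterB are made up for the example, and a Blue Book BlockContext image would not behave this way.

```smalltalk
| makeCounter counterA counterB |
"Each evaluation of makeCounter creates a fresh 'count' and answers
 a block that keeps using it after makeCounter has returned."
makeCounter := [| count | count := 0. [count := count + 1]].

counterA := makeCounter value.
counterB := makeCounter value.

counterA value.    "1"
counterA value.    "2: count survived the return of its creator"
counterB value.    "1: a separate activation, so a separate count"
```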

What we need to do is to break the dependency between the block activation and its enclosing contexts for accessing locals.  I’m going to use Collection>>inject:into: as an example.

Collection methods for enumerating
inject: thisValue into: binaryBlock
        "Accumulate a running value associated with evaluating the argument,
        binaryBlock, with the current value of the argument, thisValue, and the
        receiver as block arguments. For instance, to sum the numeric elements of
        a collection, aCollection inject: 0 into: [:subTotal :next | subTotal + next]."

        | nextValue |
        nextValue := thisValue.
        self do: [:each | nextValue := binaryBlock value: nextValue value: each].
        ^nextValue


The block  [:each | nextValue := binaryBlock value: nextValue value: each] reads the method argument binaryBlock and reads and writes nextValue.  If we allocate an explicit array to hold nextValue indirectly (something I’m going to call an “indirect temp vector”) we can avoid writing the local in the method activation:

inject: thisValue into: binaryBlock
        | indirectTemps |
        indirectTemps := Array new: 1.
        indirectTemps at: 1 put: thisValue. " was nextValue := thisValue."
        self do: [:each |
                 indirectTemps
                          at: 1
                          put: (binaryBlock
                                            value: (indirectTemps at: 1)
                                            value: each)].
        ^indirectTemps at: 1
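The same rewrite can be tried by hand on the earlier counter example. This is only a workspace sketch of the idea, not compiler output; the variable name increment is invented, and the snippet uses nothing beyond Array and ordinary blocks.

```smalltalk
| indirectTemps increment |
"The written-to local 'count' is moved into a one-element vector."
indirectTemps := Array new: 1.
indirectTemps at: 1 put: 0.                 "was: count := 0"

"The block now only reads 'indirectTemps'; all writes go through the vector."
increment := [indirectTemps at: 1 put: (indirectTemps at: 1) + 1].

increment value.                             "1"
increment value.                             "2"
indirectTemps at: 1.                         "2: the enclosing scope sees the update"
```

Because at:put: answers the stored value, the block still answers the new count, just as count := count + 1 did.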


Now the block only reads the locals of the method activation, and none of the locals it reads changes value after the block is created.  So the block can keep a private copy of those values without affecting semantics.  The compiler does this behind the scenes but the code is somewhat equivalent to

inject: thisValue into: binaryBlock
        | indirectTemps |
        indirectTemps := Array new: 1.
        indirectTemps at: 1 put: thisValue.
        self do: (thisContext
                           closureCopy:
                                    [:each | | binaryBlockCopy indirectTempsCopy |
                                    indirectTempsCopy
                                             at: 1
                                             put: (binaryBlockCopy
                                                               value: (indirectTempsCopy at: 1)
                                                               value: each)]
                           copiedValues: (Array with: binaryBlock with: indirectTemps)).
        ^indirectTemps at: 1


closureCopy:copiedValues: answers a BlockClosure that holds the pc in the method to start executing the block’s code (just like BlockContext’s startpc) and the array of copiedValues.  When the block is activated the copiedValues are pushed onto the activation’s stack after any block arguments, becoming locals of the activation.  Now there is no dependency on the enclosing activation for local access.  Nothing needs to happen when returning from an activation that encloses some closure so returns are simple and hence fast.

The compiler analysis to do this is extremely simple.  Any local that is accessed by an inner scope must either be copied or put in an indirect temp vector.  If a local is assigned to after it is closed over (after a block is created that accesses the local) then the local must be made indirect.  We can do slightly better than this, but it’s not worth the effort.  This simple analysis works fine.  I’ll detail the analysis in a post on the closure compiler and its resultant bytecodes.  Note that we only need one indirect temp vector per scope, with one element for each indirect local.
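The rule reads naturally as a two-question decision. The sketch below is a hypothetical restatement of it as a workspace block, not code from the closure compiler; the classify block, its two boolean arguments and the result symbols are all invented for illustration.

```smalltalk
| classify |
"Hypothetical helper: decide how one local must be handled, given
 whether an inner scope accesses it and whether it is assigned to
 after some block that accesses it has been created."
classify := [:accessedByInnerScope :assignedAfterCapture |
	accessedByInnerScope
		ifFalse: [#plainLocal]
		ifTrue: [assignedAfterCapture
			ifTrue: [#indirect]
			ifFalse: [#copied]]].

classify value: false value: false.    "#plainLocal"
classify value: true value: false.     "#copied, like binaryBlock in inject:into:"
classify value: true value: true.      "#indirect, like nextValue in inject:into:"
```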

Prototyping The Implementation

The above facilities, creating a closure, creating an indirect temp vector and reading and writing from it, and the evaluation primitives all make sense when implemented in the VM with specific bytecodes and primitives.  But one thing that’s very nice about Smalltalk is that one can prototype the above scheme _without_ any VM modifications.  We can’t prototype non-local return without trickery but blocks like factorial above work fine.  I did just this before I implemented the closure bytecodes.  For example here are the implementations of closureCopy:copiedValues:, the message one can use to create Closures, and value:, one of the evaluation primitives that can be written in pure Smalltalk due to it having first-class activation records.  Neat!

Object subclass: #BlockClosure
        instanceVariableNames: 'outerContext startpc numArgs copiedValues'
        classVariableNames: ''
        poolDictionaries: ''
        category: 'Kernel-Methods'

ContextPart methods for controlling
closureCopy: numArgs copiedValues: anArray
        "Distinguish a block of code from its enclosing method by
         creating a BlockClosure for that block. The compiler inserts into all
         methods that contain blocks the bytecodes to send the message
         closureCopy:copiedValues:. Do not use closureCopy:copiedValues: in code that you write! Only the
         compiler can decide to send the message closureCopy:copiedValues:."

        ^BlockClosure new outerContext: self startpc: pc + 2 numArgs: numArgs copiedValues: anArray

BlockClosure methods for evaluating
value: anArg
        "Activate the receiver, creating a closure activation (MethodContext)
         whose closure is the receiver and whose caller is the sender of this message.
         Supply the argument and copied values to the activation as its arguments and copied temps."

        | newContext sz |
        numArgs ~= 1 ifTrue:
                 [self numArgsError: 1].
        newContext := self asContextWithSender: thisContext sender.
        sz := copiedValues basicSize.
        newContext stackp: sz + 1.
        newContext at: 1 put: anArg.
        sz > 0 ifTrue: "nil basicSize = 0"
                 [1 to: copiedValues basicSize do:
                          [:i| newContext at: i + 1 put: (copiedValues at: i)]].
        thisContext privSender: newContext

BlockClosure methods for private
asContextWithSender: aContext
        "Inner private support method for evaluation. Do not use unless you know what you’re doing."

        ^(MethodContext newForMethod: outerContext method)
                 setSender: aContext
                 receiver: outerContext receiver
                 method: outerContext method
                 closure: self
                 startpc: startpc


 
and here’s the decompilation of factorial compiled using the prototype compiler:

        | t1 |
        t1 := Array new: 1.
        t1
                 at: 1
                 put: [:t2 | t2 = 1
                                   ifTrue: [1]
                                   ifFalse: [t2
                                                     * ((t1 at: 1)
                                                                       value: t2 - 1)]].
        (1 to: 10)
                 collect: (t1 at: 1)


and its opcodes:

                 pushLit: Array
                 pushConstant: 1
                 send: #new:
                 popIntoTemp: 0
                 pushTemp: 0
                 pushConstant: 1
                 pushThisContext
                 pushConstant: 1
                 pushLit: Array
                 pushTemp: 0
                 send: #braceWith:
                 send: #closureCopy:copiedValues:
                 jumpTo: L3
                          pushTemp: 0
                          pushConstant: 1
                          send: #=
                          jumpFalseTo: L1
                          pushConstant: 1
                          jumpTo: L2
        L1:
                          pushTemp: 0
                          pushTemp: 1
                          pushConstant: 1
                          send: #at:
                          pushTemp: 0
                          pushConstant: 1
                          send: #-
                          send: #value:
                          send: #*
        L2:
                          blockReturn
        L3:
                 send: #at:put:
                 pop
                 pushConstant: 1
                 pushConstant: 10
                 send: #to:
                 pushTemp: 0
                 pushConstant: 1
                 send: #at:
                 send: #collect:
                 pop
                 returnSelf

 

and it works :)

Next post, the closure bytecodes, costs, etc.

Cog

So Hi!

I’m delighted to say that Qwaq has taken me on to write a fast Croquet VM and that the VM is to be released under the Qwaq open source licence (an MIT license).  I’m going to blog about the VM here as I implement it.  The blog is a chance for me to record design decisions as I go along, receive better ideas from you, dear reader, and to hand out the whitewash.

This VM will have a dynamic translator, or JIT for short.  It will dynamically compile Smalltalk bytecodes to machine code transparently to the programmer, and execute this machine code instead of interpreting bytecode. Initially I’m aiming at performance equivalent to the VisualWorks VM, hps.  This VM should execute pure Smalltalk code some 10 to 20 times faster than the current Squeak VM.  Subsequently I hope to do adaptive optimization and exceed VisualWorks’ performance significantly.  I expect to be able to release the first fast version within a year.

The VM is called Cog after the Honda advert.  Fast VMs tend to have something of the Heath Robinson or Rube Goldberg about them.  But one of my rules in Cog (rules are meant to be broken) is to do the simplest thing that could possibly work (TSTTCPW).  My principles for Cog, in the sense of fundamental propositions governing my behaviour, are that

- any changes in the image to accommodate the fast VM will not slow down the standard Croquet VM (the interpreter)

- any changes in the image to accommodate the fast VM will not increase overall footprint

- the fast VM will be compatible with the Hydra VM

- there will be some form of source code compatibility for Slang plugins so they can function both in the interpreter and in the fast VM

I want Cog to be used by as many people as possible, but I realise there are those interested in small machines or in VM experimentation for whom the interpreter will still be a better choice.  I hope the principles above will avoid schism and mean the Squeak community can still use a common image format and preserve image portability.

That said I do intend to change the image format somewhat.  I’ve already implemented closures.  These, when implemented appropriately, have a significant impact on the performance of a JIT VM.  This has meant a slight change in the bytecode set; gone are the 6 experimental bytecodes 139 through 143.  I’ll blog on closures in a subsequent post quite soon.

Hydra is an interesting approach to concurrency, and something Qwaq is interested in.  BTW, I think Hydra would better be called String, strings being composed of threads or fibers.  String, like a cog, is typically a small part of a larger whole.

 

For the detail-oriented let me dive down a level and say a little more about the project, especially preparatory changes that pave the way for a fast VM.  I apologize for the density.  I’m interested in knowing what level of detail I should go into.  Obviously people who want to collaborate on Cog will want a high level of detail.  But if the architecture of a fast VM is of more general interest then I need plenty of feedback to help me explain things at the right level of detail.  Please feel free to comment on this!

OK then, some details.  Both TSTTCPW and the Principles imply keeping the existing CompiledMethod format.  In Squeak, as in Blue Book Smalltalk-80, CompiledMethod is an anomalous hybrid, its first part being object references used mainly for literals, and its second part being raw bytes used mainly for bytecodes. This is a good choice for an interpreter since both literals and bytecodes can be fetched from the same object, but it causes minor complications in the garbage collector and gets in the way of adding instance variables to CompiledMethod and hence subclassing.  I’ll blog on why I’m keeping the format and how to get round its limitations soon.

The existing bytecode set is all very well but inflexible.  It has pressing limits on the size of blocks (1023 bytes of bytecode) and the number of literals (255).  Being designed for a 16-bit system it misses some compact encoding opportunities, such as a bytecode that pushes a given SmallInteger other than -1, 0, 1 & 2.  Pushing 3 requires a minimum of 1 byte for the push literal bytecode plus 4 bytes for the literal slot holding the object 3.  A two byte bytecode could push -128 to 127 but take only 2 bytes and be faster.  Luckily migrating to a new bytecode encoding is quite easy.  Most opcodes - a name I’ll use for an abstract operation such as pushInstVar: - occur in multiple bytecodes - a name I’ll use for a specific encoding such as 0-15, 0000iiii, Push Receiver Variable #iiii.  For all short-form encodings there is an equivalent long-form, and one can modify the compiler to generate only the long forms, allowing the ranges allocated to short-form encodings to be reused.  I’ve already recompiled a 3.8 image to long-form.
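To put numbers on the push-3 example, here is a workspace sketch. The byte costs are the ones quoted above; the two-byte encoding itself (an opcode byte followed by one operand byte read as a signed 8-bit value) is purely an assumption for illustration and not a description of any bytecode set that has been designed.

```smalltalk
| literalPushCost twoBytePushCost decodeOperand |
"Costs from the text: 1 byte of bytecode plus a 4-byte literal slot,
 versus a 2-byte bytecode with the value inline."
literalPushCost := 1 + 4.                   "5 bytes"
twoBytePushCost := 2.                       "2 bytes"
literalPushCost - twoBytePushCost.          "3 bytes saved per distinct small constant"

"Hypothetical decoder: interpret the operand byte (0..255) as -128..127."
decodeOperand := [:byte | byte >= 128 ifTrue: [byte - 256] ifFalse: [byte]].

decodeOperand value: 3.                     "3"
decodeOperand value: 127.                   "127"
decodeOperand value: 128.                   "-128"
decodeOperand value: 255.                   "-1"
```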

One of the most important optimizations in a fast VM is that of context-to-stack mapping.  Conceptually every method or block activation in Smalltalk is represented by a context object, each of which has its own small stack holding its temporaries and intermediate results.  Each context points to its caller via a sender slot.  This has lots of advantages, such as ease of implementing the debugger, implementing an exception system, being able to implement “exotic” control structures such as coroutines and backtracking, and persisting processes, all with no VM support beyond unwind-protect.  The downside is that a naive implementation incurs considerable overhead.  Each send involves allocating a new context and the moving of receiver and arguments from the caller to the callee context.  Each return involves (eventually) reclaiming the returned-from context, unless of course something has kept a reference to it.  Compared to stack organization in conventional languages, contexts are sloooow.
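The sender chain is directly visible from Smalltalk, which is what makes debuggers and exception systems easy to write in the image. A minimal workspace sketch that walks it, assuming only thisContext and the sender accessor; the variable names are made up, and the depth it answers depends on how the expression is evaluated.

```smalltalk
| context depth |
"Start at the current activation and follow sender links to the bottom of the stack."
context := thisContext.
depth := 0.
[context notNil] whileTrue:
	[depth := depth + 1.
	 context := context sender].
depth    "number of activations between this expression and the base of its process"
```

Every one of those activations is conceptually a heap object, which is the per-send cost a fast VM has to avoid paying in the common case.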

The idea in context-to-stack-mapping (you’ve guessed it, details in a subsequent post) is to house method and block activations on a conventional stack, hidden in the VM and invisible to the Smalltalk programmer.  Sends look more like conventional calls, passing arguments in the stack in the conventional way, the callee directly accessing the arguments pushed by the caller on to the stack.  The VM then creates contexts only when needed.  To the Smalltalk programmer contexts still exist, and have the same useful semantics as ever, but the overheads for the common case (send and return) are much reduced.  While context-to-stack mapping is essential to a fast VM that still provides contexts, it is also useful in an interpreter.  Andreas Raab suggested I implement context-to-stack mapping in the interpreter as a milestone on the way to the JIT VM.  I think this is a great idea.  It should provide a decent speed-up to the Squeak interpreter and help motivate things like the closure implementation.

I think that’s enough for a first post.  Let me wrap up with a road map of the project “going forward”.  None of this is set in stone, but it’s a plan.

Main line:

- restructure the bytecode compiler to decouple parse nodes from specific bytecode encodings (complete)

- replace non-reentrant BlueBook blocks with closures using the restructured compiler (~ 75% complete, currently working on temp names in the debugger)

- implement context-to-stack mapping in the interpreter (to be released Septemberish)

- implement a JIT to replace the interpreter on x86 (to be released Aprilish)

 

In parallel:

- design a bytecode set that uses prefix bytecodes to eliminate limits and provides encodings that provide benefits for 32 and 64-bit images

- provide the ability to subclass and add/remove instance variables to CompiledMethod and subclasses while retaining the compact interpreter-friendly hybrid format

- target the JIT to other ISAs such as ARM, THUMB, PowerPC and x86-64

- general VM improvements such as

    - add per-object immutability support, work I’ve already done for Squeak at Cadence, and previously for the VisualWorks VM

    - move garbage collection out of allocation and into the primitive failure code for new, new: basicNew basicNew: et al so that the VM doesn’t have to deal with moving pointers all over the place

    - add primitive error codes so the reason for a primitive’s failure is communicated
