Sep 29 2014
CogVM binaries as per VMMaker.oscog-eem.887/r3085

Fix regression in V3 become youngReferrers management in VMMaker.oscog-eem.882. We /must/ prune young referrers if mapObjectReferencesInMachineCodeForBecome removes a cog method from youngReferrers because it may get added back and youngReferrers cannot contain duplicates.
Fix potential bug in Spur become argument validation. Check for immediates needs to come /after/ following forwarders since an object can become-forward to an immediate.
Fix cPICHasForwardedClass:; this did /not/ enumerate the classIndices in a Spur Closed PIC.
Install the callPrimitiveBytecode in the Interpreter's bytecodeDispatchTable on Spur.
Modify callPrimitiveBytecode to not invoke unknownBytecode processing if at the first bytecode of a primitive method.
Correct sign comparison of instructionPointer in justActivateNewMethod.
Relax the validInstructionPointer:inMethod:framePointer: assert to accept any pc in initialPC to self size range now that callPrimitiveBytecode is more forgiving.
Speed up primitiveMarkUnwindMethod & primitiveMarkHandlerMethod in the StackInterpreter by setting them to 0 in the primitive table.
-------------------------------
Sep 26 2014
CogVM source as per VMMaker.oscog-eem.886/r3083

Fix regression in V3 become youngReferrers management in VMMaker.oscog-eem.882. We /must/ prune young referrers if mapObjectReferencesInMachineCodeForBecome removes a cog method from youngReferrers because it may get added back and youngReferrers cannot contain duplicates.
Spur: Fix bug in sweepToFollowForwardersForPigCompact which failed to update and answer lowest forwarder. Fixes numForwarders == 0 assert failures.
Fix bug with become and the class table not removing classes that become causes to be unhashed.
Fix bogus assert fail in synchronousSignal:. Refactor following code into ensureSemaphoreForwardedThroughContext: and fix the assert there-in.
Use rawOverflowSlotsOf: in bytesInObject:.
Make a little more progress on Spur image segment support. Provide a classTableEntriesDo: and use it to compute an arrayOfUnmarkedClasses.
Cogit: Make sure voidImplicitReceiverCacheAt: sets codeModified if IRCs are inline.
Make sure freeMethod: clears cmRefersToYoung.
Fix assert in cogitPostGCAction: that would fire erroneously in Spur become.
Move evaluation of Cogit primitive descriptor enabled function from initialization to just-in-time, and add enablers on SmallInteger primitives to ensure they are applied only to SmallInteger receivers (falling back to interpreter prims if not). Hence fix Cog for 4.1 (e.g. MuO) images.
Back out of saneMethodClassAssociation assert in initialPCForHeader:method:. Newspeak falls foul of this.
Make sure that the method is set on returning from a callback. Fixes MNUs of atAddress: in exampleCqsort in the stack VMs.
Fix bugs in scripts and Windows manifests.
-------------------------------
Sep 8 2014
CogVM binaries as per VMMaker.oscog-eem.876/r3072

Fix bad regression in FilePlugin>>primitiveFileSetPosition introduced around r2797 that breaks > 1Gb file access.
Change manifest files on Windows to make app HIGHDPIAWARE. Create a manifest for the consolevm and include it in the archive.
Fix bug with become where duplicate entries in the input array would crash the system (thanks Igor).
Make nameOfClass: more robust, in both the real and simulated VMs.
Regenerate sources after refactoring Slang code to ask classes what methods they want to include based on their initializationOptions inst var instead of the centralized VMMaker.
Generate Spur Newspeak VMs with the new CompiledMethod header format. This will break the Newspeak Spur bootstrap until NOF generation and/or loading is fixed.
-------------------------------
Aug 14 2014
CogVM binaries as per VMMaker.oscog-eem.860/r3063

Spur: Set VM version to 5.
Convert Spur to use only the alternate CompiledMethod header format (65536 literals, primitive in a bytecode).
3+4 evaluated in the simulator.
Update build scripts to set VM version number to 5.0 (V3 remain at 4.0).
Fix system crash on using basicNew: on CompiledMethod. Have instantiateClass:indexableSize: refuse to instantiate a CompiledMethod, and add instantiateCompiledMethodClass:indexableSize:. Get the error code right for primitiveNew et al.
Alter the SqueakV3 Cogit bytecode table initializers to add the callPrimitiveBytecode: at 139 if in a Spur VM.
Move literal count methods from StackInterpreter hierarchy to ObjectMemories. Rename headerOf: to methodHeaderOf: & literalCountOfHeader: to literalCountOfMethodHeader:. Abstract out lastPointerOfMethodHeader:.
Cogits: Eliminate classFloatCompactIndex and just use ClassFloatCompactIndex directly.
Slang: Eliminate double negatives translating ifFalse: (for primErrorCode).

*No new Newspeak VMs until the bootstrap is fixed for Spur.
-------------------------------
Aug 6 2014
CogVM binaries as per VMMaker.oscog-eem.844/r3060

Fix bad bug in interpreter to machine code frame conversion on backward branch. Old code would decrement branch even if backward branch checked for events and did a process switch, potentially converting frames at arbitrary unmappable pcs, not just backward branches. Neaten and comment the code.
Neaten and update the pc mapping tests for multiple bytecode sets and for sets with extensions.
Change default count to 40 from 10 to reduce number of startup methods jitted.
Allow warnings to be treated as errors, adding -blockonwarn flag on Mac & Unix.
-------------------------------
Aug 4 2014
CogVM binaries as per VMMaker.oscog-eem.842/r3058

Spur: Provide unpinObject: for InterpreterProxy.
Fix initialization of the heap-resident remembered set added by VMMaker.oscog-eem.827. It must be created /after/ old space is initialized.
Fix two become bugs surfaced when adding/removing inst vars to/from Association, Binding et al.
First, the class table must be scanned to ensure there are no forwarders to classes (much cheaper than the full hierarchy walk to follow method dictionaries etc that was done). Second, machine code methods that gain a new reference through become must get added to the youngReferrers. Add a new become effect flag, OldBecameNew that captures this and respond to it in CoInterpreter>>postBecomeAction: by adding all methods to youngReferrers so that on the next scavenge all will be made right.
Fix GC of machine code, which must follow forwarders when doing markAndTraceLiteral: and again add to youngReferrers if following gains a new ref. Refactoring of markAndTraceLiteral: into markAndTraceLiteral:in:at: et al required.
Fix assert in addFreeSubTree:.
General: Support the alternate bytecode set header in all VMs to ease testing of multiple bytecode sets. This means methods with the sign bit set have no primitive field and a larger num literals field, but no more.
Fix longPrintOop: (actually printOopShortInner:) for global variable printing in face of new Environments.
Cogit: Clean up adding to the youngReferrers by providing ensureInYoungReferrers:. Since there always is room on youngReferrers, nuke roomOnYoungReferrersList, canLinkToYoungClasses and caller code, simplifying ceSend:super:to:numArgs: etc.
Fix assert in followForwardedLiteralsIn:.
General: Change the scanning for initial nils scheme in the StackToRegisterMappingCogit to answer the number of push nils in a bytecode, instead of whether the bytecode is a push nil. Refactor genReturnTopFromBlock into genBlockReturn. These changes accommodate Sista.
Rationalize the length functions, deleting byteLengthOf:, fetchLong32LengthOf: & fetchWordLengthOf: and providing numBytesOf:, num16BitUnitsOf:, num32BitUnitsOf:, num64BitUnitsOf: and numBytesOf:.
Provide fetch/storeShort16:ofObject:[withValue:] and fetch/storeLong64:ofObject:[withValue:].
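The youngReferrers invariant behind ensureInYoungReferrers: (a method may be added at any time, but the list must never hold duplicates) can be modelled with a small sketch. This is an illustrative Python model, not the VM's Slang code: the class names, the prune callback and the use of a per-method flag as the membership test are assumptions made for the example.

```python
class CogMethod:
    """Hypothetical stand-in for a machine-code method header."""

    def __init__(self, name):
        self.name = name
        self.refers_to_young = False  # models the cmRefersToYoung flag


class YoungReferrers:
    """Illustrative duplicate-free list of methods that refer to young objects."""

    def __init__(self):
        self.methods = []

    def ensure_in_young_referrers(self, method):
        # Assumption: the per-method flag doubles as the membership test,
        # so adding is idempotent and the list never holds duplicates.
        if not method.refers_to_young:
            method.refers_to_young = True
            self.methods.append(method)

    def prune(self, still_refers_to_young):
        # Drop methods that no longer refer to young objects and clear their
        # flag, so that a later ensure_in_young_referrers can add them back.
        kept = []
        for method in self.methods:
            if still_refers_to_young(method):
                kept.append(method)
            else:
                method.refers_to_young = False
        self.methods = kept
```

In this model, pruning after a removal is what keeps a later re-add from producing a second entry, which is the situation the Sep 29 regression fix describes.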
-------------------------------
Jul 24 2014
CogVM binaries as per VMMaker.oscog-eem.832/r3056

Add the time zone to the version info on Mac OS X and Win32.
Fix bug in assigning parameter 55 (growth ratio at which to do a global GC).
Add vmParameter 52 to answer the capacity of the root table (in Spur a.k.a. the rememberedSet).
In the wake of the inlining change below (see Slang:), split lookupInMethodCacheSel:classTag: into it and inlineLookupInMethodCacheSel:classTag:, and use the inline version in internalFindNewMethod.
Fix printStringOf: (used in e.g. frame print) to not print crs that would cause previous info to be overwritten.
Spur: Fix bug with class table management and two-way become. Because two-way become may do an in-place swap, obj1 & obj2 in SpurMemoryManager>>doBecome:and:copyHash: may not be forwarded after the inner become. Hence they should not be followed if not forwarded. The bug manifested as Object's identityHash changing: superclass is the first slot in a class. Following an unforwarded subclass of Object yields Object. Setting the hash bits of the followed object smashes Object's identityHash. Thanks to Stephane Rollandin for finding the bug.
More rationalization of the class table management post become. Now no post become scan of the class table is necessary at all.
Spur: Add vm parameter 53 to answer the number of segments.
Move the rememberedSet into a pinned object in oldSpace. Allow it to grow on demand, starting at 1k entries, doubling on each grow. Make sure to abort if the attempt to grow the remembered set fails. Try to grow by another 1k slots if doubling fails, then abort.
Revise markAndTrace: given that markAndShouldScan: is inlined within it (see Slang changes below).
Move the ephemeron processing into markAndShouldScan: out of the now unused numStringSlotsOf:ephemeronInactiveIf: circumlocution.
Add activeAndDeferredScan: and numStrongSlotsOfInephemeral: in place of the double negative inactiveOrFailedToDeferScan: and hence inline numStrongSlotsOfInephemeral:.
Increase the traceImmediatelySlotLimit. These changes plus the 2 repeats for compaction speed up global GC by at least x2.
Change the defaultEdenBytes to 4Mb.
Allow the number of compaction passes to vary, 2 on GC, 3 on GC for snapshot.
Slang: Add support for inlining into the condition of ifTrue:/ifFalse: when it is marked as inline. Transform expr1 ifTrue:/ifFalse: [^expr2] by inlining ^expr2 into expr1. Transform expr ifTrue:/ifFalse: [statements] by replacing ^boolean occurrences in expr with gotos.
-------------------------------
Jul 18 2014
CogVM source as per VMMaker.oscog-eem.826/r3048

Fix the ZipPlugin (InflatePlugin&DeflatePlugin) to no longer depend on specific instance sizes for ReadStream and WriteStream which allows some leniency in redefining these classes. Fixes occasional Monticello commit bugs with large packages after loading Collections-eem.567.
Fix compilation warnings in stringForCString:.
Add a -warnpid flag that causes warnings to print the pid; useful in debugging multi-image tests (e.g. magma).
Reimplement the backward count in interpreted methods, storing the count in IFrameFlags, hence eliminating the requirement that the jump happens consecutively N times in just that method with no backward jumps in any other method. The Chameneos benchmark has this pattern and so two methods don't get jitted. This eliminates lastBackwardJumpMethod & backwardJumpCount.
Spur: Fix bad bug in remapObj: that confused the test for old vs new and hence tried to copyAndForward old objects.
Fix bug in processWeaklings that could remember a weak object twice.
Mark the methods used by SpurMemoryManager>>globalGarbageCollect as inline: #never for profiling.
Spur Newspeak: Initialize numIRCs before checking for quick prims.
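The difference between the old shared backward-jump counter and the per-frame count can be illustrated with a small simulation. This is a hedged Python sketch, not the interpreter's code: the function names, the trace representation (one entry per backward jump, naming the frame that took it) and the exact reset rule of the old scheme are assumptions; the threshold of 40 is borrowed from the default count mentioned in the Aug 6 entry.

```python
THRESHOLD = 40  # illustrative; the Aug 6 entry mentions a default count of 40


def jitted_with_shared_counter(trace, threshold=THRESHOLD):
    """Model of the old scheme: one shared counter plus the last method seen.

    Assumption: the counter restarts whenever a different method takes a
    backward jump, so only consecutive jumps in one method accumulate.
    """
    last_method = None  # plays the role of lastBackwardJumpMethod
    count = 0           # plays the role of backwardJumpCount
    jitted = set()
    for method in trace:
        if method == last_method:
            count += 1
        else:
            last_method = method
            count = 1
        if count >= threshold:
            jitted.add(method)
    return jitted


def jitted_with_per_frame_counter(trace, threshold=THRESHOLD):
    """Model of the new scheme: each frame keeps its own backward-jump count."""
    counts = {}
    jitted = set()
    for frame in trace:
        counts[frame] = counts.get(frame, 0) + 1
        if counts[frame] >= threshold:
            jitted.add(frame)
    return jitted
```

With a trace that alternates between two loops, such as `["a", "b"] * 100`, the shared counter never reaches the threshold while the per-frame counts do, which matches the interleaved pattern the entry attributes to the Chameneos benchmark.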
-------------------------------
Aug 14 2014

r3063 | eliot | 2014-08-14 12:43:38 -0700 (Thu, 14 Aug 2014) | 39 lines

CogVM source as per VMMaker.oscog-eem.860 (except for Newspeak VMs *).

Spur: Convert Spur to use only the alternate CompiledMethod header format (65536 literals, primitive in a bytecode). 3+4 evaluated in the simulator.
Update build scripts to set VM version number to 5.0 (V3 remain at 4.0).
Fix system crash on using basicNew: on CompiledMethod. Have instantiateClass:indexableSize: refuse to instantiate a CompiledMethod, and add instantiateCompiledMethodClass:indexableSize:. Get the error code right for primitiveNew et al.
Alter the SqueakV3 Cogit bytecode table initializers to add the callPrimitiveBytecode: at 139 if in a Spur VM.
Update the prim numbers for inlined prims in genCallPrimitiveBytecode to match the latest EncoderForSistaV1 class comment.
Moved literal count methods from StackInterpreter hierarchy to ObjectMemories. Rename headerOf: to methodHeaderOf: & literalCountOfHeader: to literalCountOfMethodHeader:. Abstract out lastPointerOfMethodHeader:.
Cogits: Fix offset of first special selector to not presume SqueakV3 and/or NewspeakV4.
Sista: Update build scripts to set VM version number to 6.0 (Sista V3 is set to 4.5).
Partially implement genCallPrimitiveBytecode.
Slang: Eliminate double negatives translating ifFalse: (for primErrorCode).
Scripts: Update getGoodVM scripts, and add a maker for getting Newspeak VMs.
Update envvars to use provided curl on Mac OS X instead of non-native wget.
Add rev and tag args to the mkarchive and uploadvm scripts.
Fix uploadvms script to upload Newspeak Spur Windows VM.

*Newspeak Spur VMs appear functional but are not being checked in because the bootstrap is not finished and is taking too long to fix.
------------------------------------------------------------------------
r3061 | eliot | 2014-08-07 08:22:32 -0700 (Thu, 07 Aug 2014) | 20 lines

CogVM source as per VMMaker.oscog-eem.848

Sista: Implement ceClassTrap.
Fix argument access in inlinePrimitiveBytecode:.
Refactor primitiveQuo to share code with inlinePrimitiveBytecode:.
Provide a dummy genCallPrimitiveBytecode for now.
Cogits: Implement genExtTrapIfNotInstanceOfBehaviorsBytecode for 32 bit Spur.
Fix argument branches type & alloca in genExtTrapIfNotInstanceOfBehaviorsBytecode.
Eliminate classFloatCompactIndex and just use ClassFloatCompactIndex directly.
Fix comment typos and miscategorizations.
Copy the readme to the paste buffer on Mac OS X in uploadvms.
------------------------------------------------------------------------
r3060 | eliot | 2014-08-06 17:37:34 -0700 (Wed, 06 Aug 2014) | 19 lines

CogVM source as per VMMaker.oscog-eem.844

Fix bad bug in interpreter to machine code frame conversion on backward branch. Old code would decrement branch even if backward branch checked for events and did a process switch, potentially converting frames at arbitrary unmappable pcs, not just backward branches. Neaten and comment the code.
Neaten and update the pc mapping tests for multiple bytecode sets and for sets with extensions.
Change default count to 40 from 10 to reduce number of startup methods jitted.
Nuke obsolete PRIM_TABLE code from the sqGnu.h's.
Spur: Eliminate expensive asserts in Spur allObjects/allInstances unnecessary in MarkObjectsForEnumerationPrimitives false regime.
Sista: Implement genExtTrapIfNotInstanceOfBehaviorsBytecode for SqueakV3. And fix it and the interpreter's version to pop the value from the stack.
------------------------------------------------------------------------
r3059 | eliot | 2014-08-05 10:16:14 -0700 (Tue, 05 Aug 2014) | 4 lines

CogVM source as per VMMaker.oscog-eem.843

Allow warnings to be treated as errors, adding -blockonwarn flag on Mac & Unix.
------------------------------------------------------------------------
r3058 | eliot | 2014-08-04 19:15:28 -0700 (Mon, 04 Aug 2014) | 4 lines

CogVM source as per VMMaker.oscog-eem.842

Fix typo in initializeCTranslationDictionary and hence rescue Newspeak cogits
------------------------------------------------------------------------
r3057 | eliot | 2014-08-04 18:55:33 -0700 (Mon, 04 Aug 2014) | 56 lines

CogVM source as per VMMaker.oscog-eem.841

Spur: Provide unpinObject: for InterpreterProxy.
Fix initialization of the heap-resident remembered set added by VMMaker.oscog-eem.827. It must be created /after/ old space is initialized.
Fix two become bugs surfaced when adding/removing inst vars to/from Association, Binding et al. First, the class table must be scanned to ensure there are no forwarders to classes (much cheaper than the full hierarchy walk to follow method dictionaries etc that was done). Second, machine code methods that gain a new reference through become must get added to the youngReferrers. Add a new become effect flag, OldBecameNew that captures this and respond to it in CoInterpreter>>postBecomeAction: by adding all methods to youngReferrers so that on the next scavenge all will be made right.
Fix GC of machine code, which must follow forwarders when doing markAndTraceLiteral: and again add to youngReferrers if following gains a new ref. Refactoring of markAndTraceLiteral: into markAndTraceLiteral:in:at: et al required.
Fix assert in addFreeSubTree:.
General: Support the alternate bytecode set header in all VMs to ease testing of multiple bytecode sets. This means methods with the sign bit set have no primitive field and a larger num literals field, but no more.
Fix longPrintOop: (actually printOopShortInner:) for global variable printing in face of new Environments.
Cogit: Clean up adding to the youngReferrers by providing ensureInYoungReferrers:.
Since there always is room on youngReferrers, nuke roomOnYoungReferrersList, canLinkToYoungClasses and caller code, simplifying ceSend:super:to:numArgs: etc.
Fix assert in followForwardedLiteralsIn:.
General: Change the scanning for initial nils scheme in the StackToRegisterMappingCogit to answer the number of push nils in a bytecode, instead of whether the bytecode is a push nil. Refactor genReturnTopFromBlock into genBlockReturn. These changes accommodate Sista.
Rationalize the length functions, deleting byteLengthOf:, fetchLong32LengthOf: & fetchWordLengthOf: and providing numBytesOf:, num16BitUnitsOf:, num32BitUnitsOf:, num64BitUnitsOf: and numBytesOf:.
Provide fetch/storeShort16:ofObject:[withValue:] and fetch/storeLong64:ofObject:[withValue:].
Sista: Fix slip in CoInterpreter>>ceCounterTripped: that would break Spur (classForClassTag: instead of classTagForClass:).
------------------------------------------------------------------------
r3056 | eliot | 2014-07-24 10:12:40 -0700 (Thu, 24 Jul 2014) | 2 lines

Fix linux's makeproduct script
------------------------------------------------------------------------
r3055 | eliot | 2014-07-24 10:11:23 -0700 (Thu, 24 Jul 2014) | 3 lines

Add the time zone to the version info on Mac OS X and Win32.
Nuke the win32 version.h wart and simplify & update win32's version.c.
------------------------------------------------------------------------
r3054 | eliot | 2014-07-24 01:23:09 -0700 (Thu, 24 Jul 2014) | 11 lines

CogVM source as per VMMaker.oscog-eem.832

Spur: More rationalization of the class table management post become. Now no post become scan of the class table is necessary at all.
Fix bug in assigning parameter 55 (growth ratio at which to do a global GC).
Fix printStringOf: (used in e.g. frame print) to not print crs that would cause previous info to be overwritten.
------------------------------------------------------------------------
r3052 | eliot | 2014-07-22 10:45:31 -0700 (Tue, 22 Jul 2014) | 13 lines

CogVM source as per VMMaker.oscog-eem.831

Spur: Fix bug with class table management and two-way become. Because two-way become may do an in-place swap, obj1 & obj2 in SpurMemoryManager>>doBecome:and:copyHash: may not be forwarded after the inner become. Hence they should not be followed if not forwarded. The bug manifested as Object's identityHash changing: superclass is the first slot in a class. Following an unforwarded subclass of Object yields Object. Setting the hash bits of the followed object smashes Object's identityHash. Thanks to Stephane Rollandin for finding the bug.
------------------------------------------------------------------------
r3051 | eliot | 2014-07-22 09:38:54 -0700 (Tue, 22 Jul 2014) | 41 lines

CogVM source as per VMMaker.oscog-eem.830

In the wake of the inlining change below, split lookupInMethodCacheSel:classTag: into it and inlineLookupInMethodCacheSel:classTag:, and use the inline version in internalFindNewMethod.
Add vmParameter 52 to answer the capacity of the root table (in Spur a.k.a. the rememberedSet).
Delete all the explicit initializing to nil in primitiveVMParameter since the array is instantiated normally (filled with nil).
Spur: Add vm parameter 53 to answer the number of segments.
Move the rememberedSet into a pinned object in oldSpace. Allow it to grow on demand, starting at 1k entries, doubling on each grow. Make sure to abort if the attempt to grow the remembered set fails. Try to grow by another 1k slots if doubling fails, then abort.
Revise markAndTrace: given that markAndShouldScan: is inlined within it (see Slang changes below).
Move the ephemeron processing into markAndShouldScan: out of the now unused numStringSlotsOf:ephemeronInactiveIf: circumlocution.
Add activeAndDeferredScan: and numStrongSlotsOfInephemeral: in place of the double negative inactiveOrFailedToDeferScan: and hence inline numStrongSlotsOfInephemeral:.
Increase the traceImmediatelySlotLimit. These changes plus the 2 repeats for compaction speed up global GC by at least x2.
Change the defaultEdenBytes to 4Mb.
Allow the number of compaction passes to vary, 2 on GC, 3 on GC for snapshot.
Slang: Add support for inlining into the condition of ifTrue:/ifFalse: when it is marked as inline. Transform expr1 ifTrue:/ifFalse: [^expr2] by inlining ^expr2 into expr1. Transform expr ifTrue:/ifFalse: [statements] by replacing ^boolean occurrences in expr with gotos.
------------------------------------------------------------------------
r3048 | eliot | 2014-07-18 17:42:28 -0700 (Fri, 18 Jul 2014) | 22 lines

CogVM source as per VMMaker.oscog-eem.826

Fix the ZipPlugin (InflatePlugin&DeflatePlugin) to no longer depend on specific instance sizes for ReadStream and WriteStream which allows some leniency in redefining these classes. Fixes occasional Monticello commit bugs with large packages after loading Collections-eem.567.
Fix compilation warnings in stringForCString:.
Add a -warnpid flag that causes warnings to print the pid; useful in debugging multi-image tests (e.g. magma).
Spur: Fix bad bug in remapObj: that confused the test for old vs new and hence tried to copyAndForward old objects.
Fix bug in processWeaklings that could remember a weak object twice.
Misc: Nuke the sistasrc/plugins directory that was a symbolic link to src/plugins. We deal with this in the makefiles these days.
------------------------------------------------------------------------
r3047 | eliot | 2014-07-16 17:21:34 -0700 (Wed, 16 Jul 2014) | 5 lines

CogVM source as per VMMaker.oscog-eem.824

Mark the methods used by SpurMemoryManager>>globalGarbageCollect as inline: #never for profiling.
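The remembered-set growth policy described under r3051 above (start at 1k entries, double on each grow, fall back to another 1k slots if doubling fails, then abort) can be sketched as follows. This is an assumed Python model only: `grow_capacity`, the `try_allocate` callback and the exception standing in for the VM abort are hypothetical names, and 1k is taken to mean 1024.

```python
INITIAL_ENTRIES = 1024  # "starting at 1k entries"; 1024 is an assumption
FALLBACK_STEP = 1024    # "another 1k slots"


class RememberedSetGrowthError(RuntimeError):
    """Stands in for the VM aborting when the set cannot grow."""


def grow_capacity(current, try_allocate):
    """Return the new capacity of the remembered set.

    try_allocate(n) is a hypothetical callback that answers True if an
    object with n slots could be allocated in old space.
    """
    doubled = current * 2
    if try_allocate(doubled):
        return doubled
    # Doubling failed: try a smaller, fixed-size increment before giving up.
    fallback = current + FALLBACK_STEP
    if try_allocate(fallback):
        return fallback
    raise RememberedSetGrowthError("cannot grow remembered set past %d" % current)
```

For example, `grow_capacity(INITIAL_ENTRIES, lambda n: True)` answers 2048, while an allocator that can only supply 6000 slots takes a 4096-entry set to 5120.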
------------------------------------------------------------------------
r3043 | eliot | 2014-07-16 13:07:09 -0700 (Wed, 16 Jul 2014) | 10 lines

CogVM source as per VMMaker.oscog-eem.823

Reimplement the backward count in interpreted methods, storing the count in IFrameFlags, hence eliminating the requirement that the jump happens consecutively N times in just that method with no backward jumps in any other method. The Chameneos benchmark has this pattern and so two methods don't get jitted. This eliminates lastBackwardJumpMethod & backwardJumpCount.
Add support for not inlining in gcc under optimization (for profiling).
------------------------------------------------------------------------
r3042 | eliot | 2014-07-16 11:49:18 -0700 (Wed, 16 Jul 2014) | 4 lines

CogVM source as per VMMaker.oscog-eem.822

Initialize numIRCs before checking for quick prims.
------------------------------------------------------------------------
r3041 | eliot | 2014-07-15 12:11:00 -0700 (Tue, 15 Jul 2014) | 46 lines

CogVM source as per VMMaker.oscog-eem.820

Put the handling of the cloning of cogged methods in the clone: implementations, removing it from the primitive. Add it to the pinning clone too.
Specialize the store check trampoline generation. Move it down to the relevant object representations.
Move setting of isRemembered flag to true into SpurGenerationScavenger>>remember:. Inline possibleRootStoreInto: (given that remember: is /not/ inlined). Call remember: directly from the ceStoreCheckTrampoline, and hence have remember: answer its argument.
Fix localization bug. Variables in initialize methods were not considered references (because they are excluded). This caused VMMaker.oscog-eem.816's extraction of zero/false vars to StackInterpreter>>#initialize to cause nextPollUsecs to be localized to checkForEventsMayContextSwitch:.
Spur: Fix bug in scanClassPostBecome:effects: with new lazy selector following policy by... throwing it all away.
The read barriers on method lookup (of the methodClass association in super sends, of the superclass link, of method dictionaries, method dictionary arrays, selectors and methods) are cheap. So replace scanning classes and method dictionaries in the class table post become with read barriers on methodClass, superclass and method dictionary etc on lookup. The read barrier on an object from which we are going to fetch state (such as a class or method dictionary) is essentially free on modern machines because the class index and the state very likely share a cache line, and the register code for testing is so cheap compared to memory access. Further the read barrier on selectors is cheap because the method lookup cache is effective in reducing the number of message lookups and because nil entries need no check. So nuke all the followNecessaryForwardingInMethod: machinery including the cmUsesMethodClass hack. Nuke scanClassPostBecome:effects:.
Rip out the forwardingCount: measurement code. It causes bad performance regressions (due to failing inlines?).
Clean up, e.g. replace followNonImmediateField:ofObject: uses with followObjField:ofObject:. canPinObjects can be inlined.
------------------------------------------------------------------------
r3040 | eliot | 2014-07-11 10:25:27 -0700 (Fri, 11 Jul 2014) | 27 lines

CogVM source as per VMMaker.oscog-eem.816

Include the Systemd socket support by Max Leske & Nik Lutz on Unix and Mac OS X.
Spur: Fix bugs in clone:; allocateSlots:... may fail and cloning compiled methods still needs a store check.
Fix become performance issue of following possibly becommed selectors by adding a read barrier to lookupMethodInDictionary: et al. This is much cheaper than following all dictionaries in the classTable post become. Control the policy with a class var in the hope that an efficient eager solution can be found.
Add stats that count the causes of followForwardedObjectFields:toDepth: (used to track down the above issue).
Make optional via a class var.
Move stringForCString: from StackInterpreter to the object memories.
Reposition ensureNoForwardedLiteralsIn: and replace cePositive32BitIntegerFor: with positive32BitIntegerFor:.
Remove the assert check in isForwarded: to make sure it is inlined.
Make sure possibleRootStoreInto: is not inlined.
StackInterpreter: Extract the vars initialized to zero or false to an initialize method from initializeInterpreter:.
------------------------------------------------------------------------
r3039 | eliot | 2014-07-09 12:37:09 -0700 (Wed, 09 Jul 2014) | 4 lines

CogVM source as per VMMaker.oscog-eem.812

Fix bogus assert fail due to signedness.
------------------------------------------------------------------------
r3038 | eliot | 2014-07-09 12:14:36 -0700 (Wed, 09 Jul 2014) | 25 lines

CogVM source as per VMMaker.oscog-eem.811

Spur: Fix bug in old space GC processing of weaklings. Old code didn't trace strong references in weaklings to weaklings in markWeaklingsAndMarkAndFireEphemerons. Make sure nilUnmarkedWeaklingSlotsIn: can be inlined. Bug shows up as crashes in Pharo Spur, Pharo making much more use of weakness than Squeak or Newspeak.
Move the scanning for young references in scavenger processing of weaklings into processWeakSurvivor:.
Fix minor slips in allObjects & allInstancesOf: which should only empty weaklingStack if marking.
Fix a couple of storePointer:ofObject:'s being applied to objStacks.
Rename isReallyForwarded: to isUnambiguouslyForwarder: and add an assert to isForwarded: to catch accidental applications to free objects.
Fix longPrintOop: for free referents.
Slang: Assign complex expressions to loop variables in value: expansions. Old code would replace variable with expansion of actual parameter everywhere.
------------------------------------------------------------------------
r3037 | eliot | 2014-07-08 18:16:16 -0700 (Tue, 08 Jul 2014) | 5 lines

CogVM source as per VMMaker.oscog-eem.810

Spur: Fix bug in nilUnmarkedWeaklingSlotsIn: which was not guarding the isForwarded: check with an isFreeObject: not check. Premature optimizations...
------------------------------------------------------------------------
r3034 | eliot | 2014-07-07 14:05:51 -0700 (Mon, 07 Jul 2014) | 18 lines

CogVM source as per VMMaker.oscog-eem.808

Change implementation of the implicit receiver trampoline to cache the class tag, not the class object (thanks Ryan). Has a significant impact on Newspeak Spur performance. Refactor getInlineCacheClassTagFrom:into: into genGetInlineCacheClassTagFrom:into:forEntry: and add inlineCacheTagForClass: to support this.
Change the V3 inline cache check to not shift the compact class index (thanks Tim). Saves a byte and an instruction from the entry sequence on x86.
Issue a prefetch for Sista counters after frame build.
Make genSmallIntegerComparison:orDoubleComparison: observe hasDoublePrecisionFloatingPointSupport (for ARM).
Finally rename ClassInteger to ClassSmallInteger.
------------------------------------------------------------------------
r3033 | eliot | 2014-07-06 10:42:23 -0700 (Sun, 06 Jul 2014) | 4 lines

CogVM source as per VMMaker.oscog-eem.805

Oops. resetCountersIn: must of course be optional.
------------------------------------------------------------------------
r3032 | eliot | 2014-07-05 21:01:13 -0700 (Sat, 05 Jul 2014) | 31 lines

CogVM source as per VMMaker.oscog-eem.804

Cogit: Change the management of counters in Sista methods to hold the counters well away from code. This restores the performance of the Sista VMs on x86, where when counters are close to code the writes to counters flush the instruction cache and destroy performance. In V3 use malloc to allocate and free them (this means Sista V3 is not currently simulable).
In Spur, store them in oldSpace in pinned objects (hence Sista Spur simulates just fine).
Split the enilopmarts into those that are used in the context of a call (entering code as if from a call) and those used in other contexts (converting interpreted method to machine code method in loops, returning from interpreter method to machine code one, etc). This fixes enilopmarts on RISCs (i.e. ARM) where the ret pc must be popped into the LinkReg when entering machine code as if from a call, but not when entering in other contexts (returning from interpreter to machine-code method, converting interpreter method to machine code in a loop, etc).
Nuke a couple of unused variables in Cogit hierarchy.
Spur: Fix bug in allocateSlotsForPinningInOldSpace:bytes:format:classIndex: and make sure answered object is actually pinned. Hence fix pinObject: & primitivePin.
Plugins: Fix access to the characterTable in the ThreadedFFIPlugins. Replace characterTable at: with characterObjectOf:, and in Spur support wide characters.
Fix divide-as-shift issue in BalloonEnginePlugin.
------------------------------------------------------------------------
r3031 | eliot | 2014-07-03 06:49:59 -0700 (Thu, 03 Jul 2014) | 7 lines

CogVM source as per VMMaker.oscog-eem.800

Sort VM methods by class first, selector second, to group by coarse functionality (e.g. scavenger methods) for the benefit of the VMProfiler.
Access missOffset via a macro (very minor speedup).
------------------------------------------------------------------------
r3029 | eliot | 2014-07-02 19:52:10 -0700 (Wed, 02 Jul 2014) | 4 lines

CogVM source as per VMMaker.oscog-eem.797

Nuke a halt from testing the simulator...
------------------------------------------------------------------------
r3028 | eliot | 2014-07-02 19:16:19 -0700 (Wed, 02 Jul 2014) | 13 lines

CogVM source as per VMMaker.oscog-eem.796

Spur: Implement forwarder following on primitive failure for sideways calls from machine code.
Fix LargeIntegersPlugin>>isNormalized: for forwarders, no longer assuming that if its arg isn't a SmallInteger it must be a large integer. Squash an assert fail in lengthOf:format: on forwarders by using numSlotsOfAny:. Make sure a forwarder has an accurate slot count, bumping it to 1 if it was zero. ------------------------------------------------------------------------ r3026 | eliot | 2014-07-02 15:44:11 -0700 (Wed, 02 Jul 2014) | 12 lines CogVM source as per VMMaker.oscog-eem.794 Fix the shift for divide issues in the LargeIntegersPlugin. Add code to generateDivide:on:indent: to spit out checking asserts if required. Change the SmartSyntaxPluginCodeGenerator to generate code that ifdefs out the remapOop:in: rigmarole on Spur. Fix inline cache for Characters in Spur. Update mksistaarchives for new build structure on Mac (only). ------------------------------------------------------------------------ r3025 | eliot | 2014-07-01 20:12:22 -0700 (Tue, 01 Jul 2014) | 6 lines CogVM source as per VMMaker.oscog-eem.792 Fix potential liveness of ReceiverResultReg across counting jumps by reloading ReceiverResultReg before returning from the trampoline through which ceCounterTripped: is called. ------------------------------------------------------------------------ r3024 | eliot | 2014-07-01 08:39:10 -0700 (Tue, 01 Jul 2014) | 7 lines CogVM source as per VMMaker.oscog-eem.791 Rescue non-Spur builds by making accessorDepthForPrimitiveIndex: a Spur option. Factor out the type machinery in generateShiftRight:on:indent: and use it to ensure generateSignedBitShift:on:indent: will not cast 64-bit vars to ints. ------------------------------------------------------------------------ r3023 | eliot | 2014-06-30 20:09:58 -0700 (Mon, 30 Jun 2014) | 46 lines CogVM source as per VMMaker.oscog-eem.790 Implement following forwarders on primitive failure in machine code interpreter primitives (still have to implement this in sideways calls of named primitives). 
Allow the JIT to not compile primitiveDoNamedPrimitiveWithArgs to avoid any potential complications. Rewrite all the semaphore-installing primitives to fail if the semaphore arg is neither a semaphore nor nil instead of assuming that if it's not a semaphore it must be nil, so as to fail and retry when semaphores are forwarded (as they are when Semaphore is redefined). Implement isSemaphoreOop:/Obj: in the object memories to abstract away the code. Base Spur's on the class index of splObj: ClassSemaphore, avoiding the table lookup to derive the class. Make checkForEventsMayContextSwitch: treat all its semaphores consistently. Have Spur's fetchClassOfNonImm: answer nilObj for forwarders to avoid assert fails. On Spur add read barriers to primitiveSuspend and synchronousSignal:'s myList access, because the process list manipulation routines do no checking. Add assert checks for forwarders in the process list manipulation routines. Fix slip in StackInterpreter>>actuallyFollowNecessaryForwardingInMethod:literalCount: that corrupts the methodClassAssociation. Abstract out the call machinery from compileTrampolineFor:numArgs:arg:arg:arg:arg:saveRegs:pushLinkReg:resultReg: so it can be used by maybeCompileRetry:onPrimitiveFail: in implementing following forwarders on primitive failure in machine code, and the Open PIC miss call. Have bytecodePCFor:cogMethod:startBcpc: map any pc before the stackCheckOffset to the initialPC, which applies to primitives in progress. Fix assert fails in updateStateOfSpouseContextForFrame:WithSP: and elsewhere with forwarders. LargeIntegers Plugin: Fix a latent signed shift bug in cDigitSub:len:with:len:into: caused by VMMaker.oscog-eem.785's eliminating the divide-via-shift optimization. These changes allow Cog Spur to redefine Process and/or Semaphore and not hang.
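The fail-and-retry scheme described in the notes above can be pictured with a small C sketch. It is a toy model, not VM code: the Obj struct and the functions follow, prim_signal and call_with_retry are hypothetical names invented for illustration, and the real VM represents forwarders and classes quite differently. The primitive fails for any argument that is neither a semaphore nor nil; the caller then follows the forwarder, if that is what it was handed, and retries once.

```c
#include <assert.h>
#include <stddef.h>

/* Toy object model; every name here is hypothetical, not the VM's. */
typedef enum { KIND_NIL, KIND_SEMAPHORE, KIND_OTHER, KIND_FORWARDER } Kind;

typedef struct Obj {
    Kind kind;
    struct Obj *target;   /* only meaningful when kind == KIND_FORWARDER */
    int signals;          /* only meaningful when kind == KIND_SEMAPHORE */
} Obj;

/* Follow a chain of forwarders to the object it finally designates. */
static Obj *follow(Obj *o)
{
    assert(o != NULL);
    while (o->kind == KIND_FORWARDER)
        o = o->target;
    return o;
}

/* The "primitive": succeed (1) only for a semaphore or nil, fail (0) for
   anything else, including a forwarder. It never assumes that a
   non-semaphore must be nil. */
static int prim_signal(Obj *arg)
{
    if (arg->kind == KIND_NIL)
        return 1;                 /* nothing to signal */
    if (arg->kind == KIND_SEMAPHORE) {
        arg->signals++;
        return 1;
    }
    return 0;                     /* fail; the caller may retry */
}

/* On failure, follow forwarders in the argument and retry once. */
static int call_with_retry(Obj *arg)
{
    if (prim_signal(arg))
        return 1;
    if (arg->kind != KIND_FORWARDER)
        return 0;                 /* genuine failure, nothing to follow */
    return prim_signal(follow(arg));
}
```

Calling call_with_retry on a forwarder whose target is a semaphore succeeds on the second attempt, while an argument of any other kind still fails, which is the behaviour the old "not a semaphore, so it must be nil" assumption could not provide.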
------------------------------------------------------------------------ r3021 | eliot | 2014-06-28 19:57:50 -0700 (Sat, 28 Jun 2014) | 38 lines CogVM source as per VMMaker.oscog-eem.787 Fix mixup of old & young spaces in primitiveVMParameter, and comment some new parameters. Fix return types for positive[64/32]BitValueOf:. positive32BitValueOf: must answer a usqInt, positive64BitValueOf: must answer a usqLong. Use positiveMachineIntegerValueOf: to decode arg in primitiveNewWithArg and ensure positiveMachineIntegerValueOf: is inlined there-in. Spur: Fix sign and overflow issues in instantiating larger objects and determining the size of large instances. Fix some freeChunk accesses that used fetchPointer:ofObject:. Cog ARM: Fix prim return for compileInterpreterPrimitive: on RISCs. On return from interpreter prim, ret pc is in instructionPointer and must return to whence it came, which is the stack on CISC and the LinkReg on RISC. Hence restoring the receiver reg requires different offsets in the two cases. Rework the rotatable quick constant logic a little and clean up users. Fix concretizeMoveRXbrR to do byte not word loads. Fix concretizeConditionalJumpLong: to actually be conditional. Oops. Correct mistaken caller-saved reg stuff for ARM. Fix concretizedRetN to not over-bump the SP. The method abort trampolines shouldn't pop anything, especially now we have the pushLinkReg: arg to manage the LinkReg more easily. Slang: Rip out the UseRightShiftForDivide optimization. It gets unsigned division wrong, and C compilers can and will optimize this correctly themselves. ------------------------------------------------------------------------ r3020 | eliot | 2014-06-26 10:29:55 -0700 (Thu, 26 Jun 2014) | 4 lines Set the IMAGE_FILE_LARGE_ADDRESS_AWARE flag in the image header of the Windows executables to allow e.g. Spur to allocate more than 2Gb. Set more ignore props on win build dirs.
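To illustrate why a shift is not a drop-in replacement for division (the Slang note in r3021 above), here is a minimal C sketch. It does not reproduce what the Slang generator actually emitted; asr32 and asr32_bits are hypothetical helpers that spell out an arithmetic (sign-propagating) right shift portably, so that the mismatch with C's / operator is visible both for a negative signed value and for an unsigned value with its top bit set.

```c
#include <assert.h>
#include <stdint.h>

/* Arithmetic right shift of a signed value, written without relying on
   the implementation-defined behaviour of >> on negative ints. */
static int32_t asr32(int32_t x, unsigned n)
{
    assert(n < 32);
    return x < 0 ? ~(~x >> n) : x >> n;
}

/* The same shift applied to a raw 32-bit pattern: the top bit is
   replicated, as happens if an unsigned value is shifted as if signed. */
static uint32_t asr32_bits(uint32_t bits, unsigned n)
{
    assert(n < 32);
    uint32_t fill = (bits & UINT32_C(0x80000000)) ? ~(UINT32_MAX >> n) : 0;
    return (bits >> n) | fill;
}
```

For -7 the shift by one yields -4 where C division by 2 yields -3, and for the bit pattern 0x80000000 the sign-propagating shift yields 0xC0000000 where unsigned division by 2 yields 0x40000000. A C compiler that knows the operand's type can pick the right instruction sequence itself, which is the point the note makes.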
------------------------------------------------------------------------ r3018 | eliot | 2014-06-25 14:06:58 -0700 (Wed, 25 Jun 2014) | 2 lines Fix snafu with the linux vm Makefile; got to add sqUnixSpurMemory.o to the mix. ------------------------------------------------------------------------ r3017 | eliot | 2014-06-25 13:21:12 -0700 (Wed, 25 Jun 2014) | 9 lines Rewrite Spur memory allocation on win32 similarly to unix. Can now allocate up to 1.8Gb on Windows XP (which has a 2Gb address space limit). Add a flag to indicate if the win32 exe is running as a console app and don't write to the in-window console if so. Fix a slip in sqWin32VMProfile.c. Fix access to revisionAsString in sqUnixHeartbeat.c (it should include sqSCCSVersion.h) and revert revisionAsString to static. ------------------------------------------------------------------------ r3016 | eliot | 2014-06-24 11:38:36 -0700 (Tue, 24 Jun 2014) | 3 lines Further simplification in platforms/unix/vm/sqUnixSpurMemory.c. Fix snafu in platforms/unix/vm/sqUnixMemory.c. ------------------------------------------------------------------------ r3015 | eliot | 2014-06-24 10:30:28 -0700 (Tue, 24 Jun 2014) | 7 lines Rewrite platforms/unix/vm/sqUnixSpurMemory.c to stand alone. Use it in place of sqMacMemory.c with Spur on Mac OS. Now Spur can grow the heap to 2.9Gb on both linux (CentOS 5.3) and Mac OS X (10.6.8). Move the nuking of version.o from the xcodeproj (where it was inoperative) to the makevm scripts. ------------------------------------------------------------------------ r3014 | eliot | 2014-06-23 19:18:33 -0700 (Mon, 23 Jun 2014) | 12 lines CogVM source as per VMMaker.oscog-eem.779 Rewrite memory allocation on linux for Spur. Arrange that the heap can grow above 2Gb without any large initial alloc. Fix some sign issues with free space tallying to allow Spur to shrink memory and answer via primitiveVMParameter heap sizes above 2Gb. Add longPrintInstancesOf:/longPrintInstancesWithClassIndex: for debugging. 
Add missing build scripts for itimer heartbeat Spur linux VMs. ------------------------------------------------------------------------ r3012 | eliot | 2014-06-21 21:00:44 -0700 (Sat, 21 Jun 2014) | 4 lines Make the spur image bootstrap work on linux. Suggest reading a suitable README when the linux VM fails to spawn the heartbeat thread. Make revisionAsString global to assist. ------------------------------------------------------------------------ r3007 | eliot | 2014-06-19 14:12:57 -0700 (Thu, 19 Jun 2014) | 9 lines CogVM source as per VMMaker.oscog-eem.776 Add parameter 54 on Spur to answer totalFreeOldSpace. Add checks to unix sqAllocateMemorySegmentOfSizeAboveAllocatedSizeInto to avoid MAP_FIXED wiping out existing mappings. Fix the odd syntax error in the image scripts. ------------------------------------------------------------------------ r3006 | eliot | 2014-06-16 13:42:34 -0700 (Mon, 16 Jun 2014) | 21 lines CogVM source as per VMMaker.oscog-eem.775 Rationalize the allocation check filler between V3 ObjMem and Spur. Make it applicable only to plugin prims and optional, via the checkAllocFiller flag. Add a prim failure code for this, PrimErrWritePastObject. Make the Cogit check and fail offending ext prims if the flag is set. Don't fill new space with the alloc check filler if the flag is not set. Provide a -checkpluginwrites command line flag to turn on the check. This is good for a -49% increase in the performance of e.g. [1 to: 1000000000 do: [:i| {nil}]] timeToRun on Spur. Refactor numStrongSlotsOf:ephemeronInactiveIf: and inline it in scavengeReferentsOf:, reusing the object format for the isWeakling test. Provide numStrongSlotsOfWeakling: for weakling nilling, and hence arrange that the numStrongSlotsOf:ephemeronInactiveIf: is always non-nil. Reorganize the StackInterpreter classes when building a VMMaker image. 
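The allocation check filler described in the r3006 notes above can be sketched in C as follows. This is a toy model under stated assumptions: the filler value, the Heap struct and the functions run_checked, good_prim and bad_prim are all hypothetical, and only the names checkAllocFiller and PrimErrWritePastObject come from the notes (the numeric error value here is invented). The idea is to fill the space beyond the last allocated object with a known pattern when the check is enabled, run the plugin primitive, and fail it if any of that space has changed. How much of the free space the real VM fills and inspects is not stated in the notes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ALLOC_FILLER UINT32_C(0xADD4E55)   /* hypothetical pattern */
#define PRIM_OK 0
#define PRIM_ERR_WRITE_PAST_OBJECT 1       /* invented numeric value */

enum { HEAP_WORDS = 16 };

typedef struct {
    uint32_t words[HEAP_WORDS];
    size_t free_start;        /* index of the first unallocated word */
    int check_alloc_filler;   /* models the checkAllocFiller flag */
} Heap;

typedef void (*Prim)(Heap *h);

/* Run a primitive; if checking is on, fill free space first and verify
   afterwards that the primitive did not scribble over it. */
static int run_checked(Heap *h, Prim prim)
{
    assert(h->free_start < HEAP_WORDS);
    if (!h->check_alloc_filler) {   /* no fill and no check when off */
        prim(h);
        return PRIM_OK;
    }
    for (size_t i = h->free_start; i < HEAP_WORDS; i++)
        h->words[i] = ALLOC_FILLER;
    prim(h);
    for (size_t i = h->free_start; i < HEAP_WORDS; i++)
        if (h->words[i] != ALLOC_FILLER)
            return PRIM_ERR_WRITE_PAST_OBJECT;
    return PRIM_OK;
}

/* Writes inside the allocated region only (needs free_start >= 1). */
static void good_prim(Heap *h) { h->words[h->free_start - 1] = 42; }

/* Writes one word past the last allocated object. */
static void bad_prim(Heap *h) { h->words[h->free_start] = 42; }
```

With the flag set, run_checked reports the write-past-object failure for bad_prim and success for good_prim; with the flag clear it skips the filling loop entirely, which is where the notes locate the performance gain.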
------------------------------------------------------------------------ r3003 | eliot | 2014-06-15 15:17:31 -0700 (Sun, 15 Jun 2014) | 14 lines CogVM source as per VMMaker.oscog-eem.774 Remember to regenerate the cogit.c's from the non-Spur configurations to fix the "relocating call to invalid address" bug. Add a hack to link the XDisplayControlPlugin against vm-display-X11 (a hack because it assumes paths etc). But at least the plugin works. Add an initial revision of platforms/unix/plugins/UUIDPlugin/acinclude.m4 to search for uuid.h in either /usr/include/uuid or /usr/include and use either uuidgenerate or uuidgen. Spur: Restore accuracy of followForwarded:'s comment back to Igor's original. ------------------------------------------------------------------------ r3000 | eliot | 2014-06-13 20:36:18 -0700 (Fri, 13 Jun 2014) | 2 lines Add libcrypto and libssl to Linux and Mac builds of the SqueakSSL plugin. ------------------------------- Jul 15 2014 CogVM source as per VMMaker.oscog-eem.820/r3041 Put the handling of the cloning of cogged methods in the clone: implementations, removing it from the primitive. Add it to the pinning clone too. Specialize the store check trampoline generation. Move it down to the relevant object representations. Move setting of isRemembered flag to true into SpurGenerationScavenger>>remember:. Inline possibleRootStoreInto: (given that remember: is /not/ inlined). Call remember: directly from the ceStoreCheckTrampoline, and hence have remember: answer its argument. Fix localization bug. Variables in initialize methods were not considered references (cuz they are excluded). This caused VMMaker.oscog-eem.816's extraction of zero/false vars to StackInterpreter>>#initialize to cause nextPollUsecs to be localized to checkForEventsMayContextSwitch:. Spur: Fix bug in scanClassPostBecome:effects: with new lazy selector following policy by... throwing it all away.
The read barriers on method lookup (of the methodClass association in super sends, of the superclass link, of method dictionaries, method dictionary arrays, selectors and methods) are cheap. So replace scanning classes and method dictionaries in the class table post become with read barriers on methodClass, superclass and method dictionary etc on lookup. The read barrier on an object from which we are going to fetch state (such as a class or method dictionary) is essentially free on modern machines because the class index and the state very likely share a cache line, and the register code for testing is so cheap compared to memory access. Further, the read barrier on selectors is cheap because the method lookup cache is effective in reducing the number of message lookups and because nil entries need no check. So nuke all the followNecessaryForwardingInMethod: machinery including the cmUsesMethodClass hack. Nuke scanClassPostBecome:effects:. Rip out the forwardingCount: measurement code. It causes bad performance regressions (due to failing inlines?). Clean up, e.g. replace followNonImmediateField:ofObject: uses with followObjField:ofObject:. canPinObjects can be inlined. ------------------------------- Jul 9 2014 CogVM binaries as per VMMaker.oscog-eem.812/r3039 Spur: Fix bug in old space GC processing of weaklings. Old code didn't trace strong references in weaklings to weaklings in markWeaklingsAndMarkAndFireEphemerons. Make sure nilUnmarkedWeaklingSlotsIn: can be inlined. Bug shows up as crashes in Pharo Spur, Pharo making much more use of weakness than Squeak or Newspeak. Fix bug in nilUnmarkedWeaklingSlotsIn: which was not guarding the isForwarded: check with an isFreeObject: not check. Move the scanning for young references in scavenger processing of weaklings into processWeakSurvivor:. Fix minor slips in allObjects & allInstancesOf: which should only empty weaklingStack if marking. Fix a couple of storePointer:ofObject:'s being applied to objStacks.
Rename isReallyForwarded: to isUnambiguouslyForwarder: and add an assert to isForwarded: to catch accidental applications to free objects. Fix longPrintOop: for free referents. V3 Cogit: Fix bogus assert fail due to signedness. Slang: Assign complex expressions to loop variables in value: expansions. Old code would replace variable with expansion of actual parameter everywhere. ------------------------------- Jul 7 2014 CogVM binaries as per VMMaker.oscog-eem.808/r3034 Change implementation of the implicit receiver trampoline to cache the class tag, not the class object (thanks Ryan). Has a significant impact on Newspeak Spur performance. Refactor getInlineCacheClassTagFrom:into: into genGetInlineCacheClassTagFrom:into:forEntry: and add inlineCacheTagForClass: to support this. Change the V3 inline cache check to not shift the compact class index (thanks Tim). Saves a byte and an instruction from the entry sequence on x86. Spur: Fix bug in allocateSlotsForPinningInOldSpace:bytes:format:classIndex: and make sure answered object is actually pinned. Hence fix pinObject: & primitivePin. Plugins: Fix access to the characterTable in the ThreadedFFIPlugins. Replace characterTable at: with characterObjectOf:, and in Spur support wide characters. Fix divide-as-shift issue in BalloonEnginePlugin. Internal/Sista changes: Issue a prefetch for Sista counters after frame build. Make genSmallIntegerComparison:orDoubleComparison: observe hasDoublePrecisionFloatingPointSupport (for ARM). Sort VM methods by class first, selector second, to group by coarse functionality (e.g. scavenger methods) for the benefit of the VMProfiler. Access missOffset via a macro (very minor speedup). Cogit: Change the management of counters in Sista methods to hold the counters well away from code. This restores the performance of the Sista VMs on x86, where when counters are close to code the writes to counters flush the instruction cache and destroy performance. 
In V3 use malloc to allocate and free them (this means Sista V3 is not currently simulable). In Spur, store them in oldSpace in pinned objects (hence Sista Spur simulates just fine). Split the enilopmarts into those that are used in the context of a call (entering code as if from a call) and those used in other contexts (converting interpreted method to machine code method in loops, returning from interpreter method to machine code one, etc). This fixes enilopmarts on RISCs (i.e. ARM) where the ret pc must be popped into the LinkReg when entering machine code as if from a call, but not when entering in other contexts (returning from interpreter to machine-code method, converting interpreter method to machine code in a loop, etc). ------------------------------- Jul 2 2014 CogVM binaries as per VMMaker.oscog-eem.797/r3029 Fix mixup of old & young spaces in primitiveVMParameter, and comment some new parameters. Fix return types for positive[64/32]BitValueOf:. positive32BitValueOf: must answer a usqInt, positive64BitValueOf: must answer a usqLong. Use positiveMachineIntegerValueOf: to decode arg in primitiveNewWithArg and ensure positiveMachineIntegerValueOf: is inlined there-in. Spur: Fix inline cache for Characters in Spur. Existing code failed all machine code sends to characters, creating PICs at those send sites which failed back to the interpreter. Implement forwarder following on primitive failure for all primitive calls from machine code. Allow the JIT to not compile primitiveDoNamedPrimitiveWithArgs to avoid any potential complications. Have Spur's fetchClassOfNonImm: answer nilObj for forwarders to avoid assert fails. Fix assert fails in updateStateOfSpouseContextForFrame:WithSP: and elsewhere with forwarders. Add read barriers to primitiveSuspend and synchronousSignal:'s myList access, because the process list manipulation routines do no checking. Add assert checks for forwarders in the process list manipulation routines.
Fix StackInterpreter>>actuallyFollowNecessaryForwardingInMethod:literalCount: so that it no longer corrupts the methodClassAssociation. Rewrite all the semaphore-installing primitives to fail if the semaphore arg is neither a semaphore nor nil instead of assuming that if it's not a semaphore it must be nil, so as to fail and retry when semaphores are forwarded (as they are when Semaphore is redefined). Implement isSemaphoreOop:/Obj: in the object memories to abstract away the code. Base Spur's on the class index of splObj: ClassSemaphore, avoiding the table lookup to derive the class. Make checkForEventsMayContextSwitch: treat all its semaphores consistently. These changes allow Cog Spur to redefine Process and/or Semaphore and not hang. Fix sign and overflow issues in instantiating larger objects and determining the size of large instances. Fix some freeChunk accesses that used fetchPointer:ofObject:. Make sure each forwarder has an accurate slot count, bumping it to 1 if it was zero. Plugins: Fix LargeIntegersPlugin>>isNormalized: for forwarders, no longer assuming that if its arg isn't a SmallInteger it must be a large integer. Squash an assert fail in lengthOf:format: on forwarders by using numSlotsOfAny:. Fix the shift for divide issues in the LargeIntegersPlugin. Change the SmartSyntaxPluginCodeGenerator to generate code that ifdefs out the remapOop:in: rigmarole on Spur. Windows: Set the IMAGE_FILE_LARGE_ADDRESS_AWARE flag in the image header of the Windows executables to allow e.g. Spur to allocate more than 2Gb. Slang: Rip out the UseRightShiftForDivide optimization. It gets unsigned division wrong, and C compilers can and will optimize this correctly themselves. ------------------------------- Jul 1 2014 CogVM source as per VMMaker.oscog-eem.791/r3024 Implement following forwarders on primitive failure in machine code interpreter primitives (still have to implement this in sideways calls of named primitives).
Allow the JIT to not compile primitiveDoNamedPrimitiveWithArgs to avoid any potential complications. Rewrite all the semaphore-installing primitives to fail if the semaphore arg is neither a semaphore nor nil instead of assuming that if it's not a semaphore it must be nil, so as to fail and retry when semaphores are forwarded (as they are when Semaphore is redefined). Implement isSemaphoreOop:/Obj: in the object memories to abstract away the code. Base Spur's on the class index of splObj: ClassSemaphore, avoiding the table lookup to derive the class. Make checkForEventsMayContextSwitch: treat all its semaphores consistently. Have Spur's fetchClassOfNonImm: answer nilObj for forwarders to avoid assert fails. On Spur add read barriers to primitiveSuspend and synchronousSignal:'s myList access, because the process list manipulation routines do no checking. Add assert checks for forwarders in the process list manipulation routines. Fix slip in StackInterpreter>>actuallyFollowNecessaryForwardingInMethod:literalCount: that corrupts the methodClassAssociation. Abstract out the call machinery from compileTrampolineFor:numArgs:arg:arg:arg:arg:saveRegs:pushLinkReg:resultReg: so it can be used by maybeCompileRetry:onPrimitiveFail: in implementing following forwarders on primitive failure in machine code, and the Open PIC miss call. Have bytecodePCFor:cogMethod:startBcpc: map any pc before the stackCheckOffset to the initialPC, which applies to primitives in progress. Fix assert fails in updateStateOfSpouseContextForFrame:WithSP: and elsewhere with forwarders. LargeIntegers Plugin: Fix a latent signed shift bug in cDigitSub:len:with:len:into: caused by VMMaker.oscog-eem.785's eliminating the divide-via-shift optimization. These changes allow Cog Spur to redefine Process and/or Semaphore and not hang. ------------------------------- Jun 28 2014 CogVM source as per VMMaker.oscog-eem.787/r3021 Fix mixup of old & young spaces in primitiveVMParameter, and comment some new parameters.
Fix return types for positive[64/32]BitValueOf:. positive32BitValueOf: must answer a usqInt, positive64BitValueOf: must answer a usqLong. Use positiveMachineIntegerValueOf: to decode arg in primitiveNewWithArg and ensure positiveMachineIntegerValueOf: is inlined there-in. win32: Set the IMAGE_FILE_LARGE_ADDRESS_AWARE flag in the image header of the Windows executables to allow e.g. Spur to allocate more than 2Gb. Spur: Fix sign and overflow issues in instantiating larger objects and determining the size of large instances. Fix some freeChunk accesses that used fetchPointer:ofObject:. Cog ARM: Fix prim return for compileInterpreterPrimitive: on RISCs. On return from interpreter prim, ret pc is in instructionPointer and must return to whence it came, which is the stack on CISC and the LinkReg on RISC. Hence restoring the receiver reg requires different offsets in the two cases. Rework the rotatable quick constant logic a little and clean up users. Fix concretizeMoveRXbrR to do byte not word loads. Fix concretizeConditionalJumpLong: to actually be conditional. Oops. Correct mistaken caller-saved reg stuff for ARM. Fix concretizedRetN to not over-bump the SP. The method abort trampolines shouldn't pop anything, especially now we have the pushLinkReg: arg to manage the LinkReg more easily. Slang: Rip out the UseRightShiftForDivide optimization. It gets unsigned division wrong, and C compilers can and will optimize this correctly themselves. ------------------------------- Jun 28 2014 CogVM binaries as per VMMaker.oscog-eem.779/r3018. Rewrite memory allocation on linux for Spur. Arrange that the heap can grow above 2Gb without any large initial alloc. Rewrite platforms/unix/vm/sqUnixSpurMemory.c to stand alone. Use it in place of sqMacMemory.c with Spur on Mac OS. Now Spur can grow the heap to 2.9Gb on both linux (CentOS 5.3) and Mac OS X (10.6.8). Rewrite Spur memory allocation on win32 similarly to unix.
Can now allocate up to 1.8Gb on Windows XP (which has a 2Gb address space limit). Add a flag to indicate if the win32 exe is running as a console app and don't write to the in-window console if so. Fix some sign issues with free space tallying to allow Spur to shrink memory and answer via primitiveVMParameter heap sizes above 2Gb. Add longPrintInstancesOf:/longPrintInstancesWithClassIndex: for debugging. Suggest reading a suitable README when the linux VM fails to spawn the heartbeat thread. Add parameter 54 on Spur to answer totalFreeOldSpace. ------------------------------- Jun 28 2014 CogVM source as per VMMaker.oscog-eem.775/r3006 Rationalize the allocation check filler between V3 ObjMem and Spur. Make it applicable only to plugin prims and optional, via the checkAllocFiller flag. Add a prim failure code for this, PrimErrWritePastObject. Make the Cogit check and fail offending ext prims if the flag is set. Don't fill new space with the alloc check filler if the flag is not set. Provide a -checkpluginwrites command line flag to turn on the check. This is good for a -49% increase in the performance of e.g. [1 to: 1000000000 do: [:i| {nil}]] timeToRun on Spur. Refactor numStrongSlotsOf:ephemeronInactiveIf: and inline it in scavengeReferentsOf:, reusing the object format for the isWeakling test. Provide numStrongSlotsOfWeakling: for weakling nilling, and hence arrange that the numStrongSlotsOf:ephemeronInactiveIf: is always non-nil. Reorganize the StackInterpreter classes when building a VMMaker image. ------------------------------- Jun 28 2014 CogVM source as per VMMaker.oscog-eem.774/r3003 Remember to regenerate the cogit.c's from the non-Spur configurations to fix the "relocating call to invalid address" bug. Add a hack to link the XDisplayControlPlugin against vm-display-X11 (a hack because it assumes paths etc). But at least the plugin works. 
Add an initial revision of platforms/unix/plugins/UUIDPlugin/acinclude.m4 to search for uuid.h in either /usr/include/uuid or /usr/include and use either uuidgenerate or uuidgen. ------------------------------- Jun 28 2014 CogVM binaries as per VMMaker.oscog-eem.772/r3000 Add libcrypto and libssl to Linux and Mac builds of the SqueakSSL plugin. Add the XDisplayControlPlugin and include it (external) and the AioPlugin (internal) to the linux x86 builds. Spur: Fix fillObj: signedness for objects straddling the midpoint of the address space (quickly affects linux). Similarly for routines in pigCompact, to get asserts correct. Fix printOopsFrom:to:. for objects up to endOfMemory. Declare lastFreeChunk and firstFreeChunk correctly. Fix numberOfForwarders: and printForwarders: for isForwarded:'s blindness towards freeChunks. Comment isForwarded: to be clear on the issue. Have the segment manager pass to sqAllocateMemorySegmentOfSize: the address of the first large enough gap in the address space, instead of the address of the end of the first segment. This allows e.g. linux to use MAP_FIXED and hence get past a 128Mb limit on mmapping. Fix bugs in isValidFreeObject: & printFreeTreeChunk: that caused bogus assert failures. Remember to include the Spur moniker in the -version output on linux Cog: Fix an abort (relocating call to invalid address) due to an over-zealous check in relocateCallBeforeReturnPC:by:. Since we relocate e.g. calls to primitives there can be no effective range check there-in. ------------------------------- Jun 28 2014 CogVM binaries as per VMMaker.oscog-eem.769/r2987 platform code: Type the memory allocators as accepting and answering usqInts to give a chance at allocating more than 2Gb. Use unsigned comparisons when testing if there's sufficient heap space in readImageFromFile:HeapSize:StartingAt:. Leads to accurate failure message when failing to allocate large heap. 
Correct syntax of inline assembler for byte and word swap operations in win32 16-bit displays. Thanks to Nicolai Hess! sqUnixX11.c: Include the right X11 include file to pull in the def for XK_equal (keysym.h vs keysymdef.h). Don't define BytesPerOop or BaseHeaderSize; these should be taken from interp.h. Integrate Philippe Back's fix for numeric keypads on X11; see https://pharo.fogbugz.com/f/cases/11352/. Add the XDisplayControlPlugin to the plugins loaded by the image build script. Use pthread_atfork to reinstall the heartbeat thread post fork. Make sure SIGALRM is unblocked if using the interval timer heartbeat on linux. Remove RPATH spec from unix builds. Include the executable name in the crash.dmp report. Fix stack backtrace printing on Mac & Unix to not segfault when invoked from the error(char *msg) function. CoInterpreter: No longer inline CoInterpreter>>pre/postGCAction: for VM profiling. Fix instantiation of large non-byte objects. The old code for sufficientSpaceToInstantiate:indexableSize: stupidly subtracted BytesPerWord instead of ShiftForWord from LongSizeNumBits in determining the max size. Garbage collect/remap the primTraceLog correctly. If a GC happens very early in start-up the log circular buffer may not be full and the existing code assumed it always was. Add Spur-specific interpreter proxy functions for immediate character and pinning access. Implement the new Spur interpreterProxy API for ObjectMemory (of course pinObject: fails, and isCharacterValue: has a smaller range). Fix a bug in printing frame flags (order was the wrong way around). Allow vmParameterAt: 26 put: 0 to disable the heartbeat itimer. Add AioPlugin to Mac VMs. Make CoInterpreter>>printFrame: not mislead as to the number of temps in a block activation. Change the time primitives to access the time now, not the time as updated by the heartbeat (but /don't/ change the time basis for event checking). This is for performance because use of gettimeofday in e.g.
stack overflow can be a significant performance overhead. Move the simulated time implementations (ioSeconds et al) up to StackInterpreter to reduce duplication. Add ioSecondsNow et al to the platform code to support this. Fix regression in primitiveUtcWithOffset, answer the correct local time offset. Fix ioSecondsNow to use the time now, not the heartbeat time. Revamp primitiveVMParameter to avoid overflow in values such as total heap size. Make statProcessSwitch, statIOProcessEvents, statForceInterruptCheck, statCheckForEvents, statStackOverflow & statStackPageDivorce 64-bit to avoid wrapping. Make sure that positive64BitIntegerFor: will not cause a GC just as positive32BitIntegerFor: doesn't. Make primitiveClone cope with variable args, cloning its last argument. This is for the Newspeak VMMirror. Add primitiveAllObjects, adapted from VMMaker-dtl.339.mcz. Add Nicolas Cellier's bitblt speedups, reference: Mantis issue 7802: Fast-up BitBlt rgbAdd rule; Mantis issue 7803: Fast-up BitBlt alpha blending rules. Check for valid bitmap in primitivePixelValueAt aka BitBltSimulation>>primitivePixelValueAtX:y:. Fix provided by Nicolas Cellier. Reference Mantis 7799. Avoid including the instructionPointer in the context when marrying the top frame during divorceAllFrames for voidVMState..If:. (how did this ever work??) Fix a bug in BitBltPlugin>>lockSurfaces. Add the compression prims to the Newspeak VM's ZipPlugin. Fix roomToPushNArgs:; the Cog VMs can be more lenient because of the use of a stack instead of contexts. Fixes some valueWithArguments: failures. Integrate Nice's improved SmallInteger generated primitives that support int x float comparison, & hence speed-up int x float comparison enormously. Integrate VMMaker-dtl.328 (failure of primitiveDisplayString to advance destX). Integrate 2792, fix memory leaks in SqueakSSL on unix. Integrate VMMaker-tpr.325 7247: BitBlt Bug in alphaSourceBlendBits8.
Integrate VMMaker-tpr.326, Fix a single-bit constant error in BitBltSimulation>>copyBits:Fallback: - change 16r3FFF to 16r7FFF to correct tallyIntoMap behaviour for Scratch using BenBlt on the Pi. Restrict at-cache to bytecodePrimAt[Put], eliminating it from primitive[String]At[Put]. Sayonara the explicit noAtCache at:[put:] machinery in CoInterpreter now that the atCache is confined to the interpreter's special selector at:[put:] bytecodes. Speeds up Stack VM significantly, e.g. a compile of Compiler package falls from 1.6s to 1.4s on 2.2GHz Intel Core i7 MacBook Pro. Restrict at-cache to bytecodePrimAt[Put], eliminating it from primitive[String]At[Put]. Undoes need for fix in VMMaker-oscog.44 of 7 January 2011: "Fix leaking of objects into the atCache due to ceSend:super:to:numArgs:'s use of executeNewMethod without always setting messageSelector." This renders messageSelector and lkupClass ephemeral, since they are live only during message lookup and because createActualMessageTo will not cause a GC these cannot change during message lookup. Hence eliminate them from markAndTraceInterpreterOops: & mapVMRegisters. Fix arg count for primPCREExecfromto. Cogit: Streamline the genPrimReturnEnterCogCodeEnilopmart. Streamline genExternalizePointersForPrimitiveCall. Have StackToRegisterMappingCogit>>genPushReceiverBytecode use ReceiverResultReg if it contains self. Fix a double free bug in unlinkSendsOf:isMNUSelector:. Harmless cuz the result is only a bogus count of how many methods were freed. Beef up the cog method integrity check to verify that its methodObject is a CompiledMethod. Fix bug in unlinkSendsOf:isMNUSelector: (primitiveFlushCacheBySelector) where old code could free the method of an active frame. Fix an assert fail in mapFor:bcpc:performUntil:arg: (this for primitiveClass where the class table reference can be the first map entry). Fix a bug in Newspeak remapIfObjectRef:pc:hasYoung: with dynamic super sends which could compute an invalid target method.
Rename pushExplicitOuterSendReceiverBytecode et al to pushExplicitOuterReceiverBytecode et al. These are not sends. Fix pc-mapping for NewspeakV4. Dynamic super sends should /not/ be annotated with IsNSSendCall, but with the vanilla IsSendCall. This fixes a bug converting an interpreter activation of a method with a loop and a dynamic super send to a machine code frame. For performance, specify that mapFor:[bcpc:]performUntil:arg: are inlined, eliminating the perform/indirect function call. This adds of the order of 3% to the size of a cogit.o's text seg so is acceptable. Print (nil) next to the selector for cog methods with a nil selector. Fix bug in Cogit>>unlinkSendsOf:isMNUSelector:, used by primitiveFlushCacheBySelector. The method could leave sends linked to freed MNU PICs. Spur: Implement unforwarding in inlined machine code #==. Hence rewrite the /horrible/ StackToRegisterMappingCogit>>genSpecialSelectorEqualsEquals. Implement a peephole in the Spur Cogit for an indirection vector initialized with a single value. Avoid initializing the slot in the array to nil and instead initialize it with the value. Refactor closure creation in the Cogit to move it into the object representations. In Spur allocate and initialize the closure inline. Refactor context creation in the Cogit, moving it to the object representations. In Spur allocate the context in one of four trampolines for block vs method and large vs small contexts. Refactor ceCreateNewArray & cePositive32BitInteger trampolines, moving them into the object representations as required (cePositive32BitInteger is only used in the Squeak obj rep). In the Spur obj rep inline allocation in pushNewArray bytecodes. Implement primitives to get (primitiveIsPinned) and (un)set (primitivePin) per-object pinning. Implement a simple policy to deal with the fact that typically heap growth happens during tenuring, not after a failed allocation.
If, after scavenging, the heap has grown by some factor of its size at the previous global GC, do a global GC. Default the factor to 0.333333. Provide VM parameter access to this value:
    55  ratio of growth and image size at or above which a GC will be performed post scavenge
Implement memory shrinkage. Check free space around SpurSegmentManager prepareForSnapshot & postSnapshot. Now Spur's object representation implements genInnerPrimitiveMirrorNew[WithArg]: we need defaults for the generic object representation in the Newspeak Cog VM.

Cogit: Refactor the code around pushing register arguments and switching between the Smalltalk and C stacks, moving the actual generators into backEnd (the special instance of the relevant CogAbstractInstruction subclass). This allows CogARMInstruction to handle pushing the register args and hence handle the difference of having a link reg. Combine genSaveStackPointers & genLoadCStackPointers into Cogit>>genSmalltalkToCStackSwitch.

Newspeak Spur: Provide machine code primitives for instantiateFixedClass: (a.k.a. a 1 arg basicNew) and instantiateVariableClass:withSize: (a.k.a. a 2 arg basicNew:).

Slang: In non-production VMs add an attribute to disable register parameters (at least for GCC-compliant compilers), allowing all static functions to be called from gdb even in the -O1 assert VMs. Only output VM_LABELs in interpret. Change Stack VM builds to define VM_LABEL as null in the debug and assert VMs, restoring interpret's debuggability.

Misc: Fix str:n:cmp: usage in printOopShortInner:.

Spur: This readme has no need to cover all the changes made in the 9 month development of Spur up until this point. Instead here are the major changes over the design document, plus the class comment that describes the design. Implement "pig compact", a much more functional compaction algorithm that works by doubly-linking free chunks in address order, therefore allowing e.g.
easy enumeration of the objects between the penultimate and ultimate free chunks. Hence the algorithm moves all the objects it can at the end of memory to free chunks at lower addresses. It is piggish for several reasons:
1. it is greedy, using parts of a free chunk, not looking for a best or perfect fit.
2. it is greedy, trying to move a run of objects at a time.
3. it deals with large objects ("pigs") by searching the free list for a free chunk large enough to hold the pig (and what constitutes a pig remains to be tuned; currently it is 8 * the average object size).
Write the totalFreeOldSpace to the image header immediately following the size of the first segment. This is to allow better determination of how much free space to allocate on startup. Fix bug in shortPrintFrameAndCallers: (filter-out base frames).

The design objectives for the Spur memory manager are

- efficient object representation a la Eliot Miranda's VisualWorks 64-bit object representation that uses a 64-bit header, eliminating direct class references so that all objects refer to their classes indirectly. Instead the header contains a constant class index, in a field smaller than a full pointer. These class indices are used in inline and first-level method caches, hence they do not have to be updated on GC (although they do have to be traced to be able to GC classes). Classes are held in a sparse strong table. The class table needs only to be indexed by an instance's class index in class hierarchy search, in the class primitive, and in tracing live objects in the heap. The additional header space is allocated to a much expanded identity hash field, reducing hash efficiency problems in identity collections due to the extremely small (11 bit) hash field in the old Squeak GC. The identity hash field is also a key element of the class index scheme.
A class's identity hash is its index into the class table, so to create an instance of a class one merely copies its identity hash into the class index field of the new instance. This implies that when classes gain their identity hash they are entered into the class table and their identity hash is that of a previously unused index in the table. It also implies that there is a maximum number of classes in the table. The classIndex field could be as narrow as 16 bits (for efficient access); at least for a few years 64k classes should be enough. But currently we make it the same as the identityHash field, 22 bits, or 4M values. A class is entered into the class table in the following operations:
    behaviorHash
    adoptInstance
    instantiate
    become (i.e. if an old class becomes a new class)
        if target class field's = to original's id hash
           and replacement's id hash is zero
            enter replacement in class table
behaviorHash is a special version of identityHash that must be implemented in the image by any object that can function as a class (i.e. Behavior).

- more immediate classes. An immediate Character class would speed up String accessing, especially for WideString, since no instantiation needs to be done on at:put: and no dereference need be done on at:. In a 32-bit system tag checking is complex since it is thought important to retain 31-bit SmallIntegers. Hence, as in current Squeak, the least significant bit set implies a SmallInteger, but Characters would likely have a tag pattern of xxx10. Hence masking with binary 11 results in two values for SmallInteger, xxx01 and xxx11 (for details see In-line cache probe for immediates below). 30-bit characters are more than adequate for Unicode. In a 64-bit system we can use the full three bits and usefully implement an immediate Float. As in VisualWorks a functional representation takes three bits away from the exponent.
Rotating to put the sign bit in the least significant non-tag bit makes expanding and contracting the 8-bit exponent to the 11-bit IEEE double exponent easy and makes comparing negative and positive zero easier (an immediate Float is zero if its unsigned 64-bits are < 16). So the representation looks like
    | 8 bit exponent | 52 bit mantissa | sign bit | 3 tag bits |
For details see "61-bit immediate Floats" below.

- efficient scavenging. The current Squeak GC uses a slow pointer-reversal collector that writes every field in live objects three times in each collection, twice in the pointer-reversing heap traversal to mark live objects and once to update the pointer to its new location. A scavenger writes every field of live data twice in each collection, once as it does a block copy of the object when copying to to space, once as it traverses the live pointers in the to space objects. Of course the block copy is a relatively cheap write.

- lazy become. The JIT's use of inline caching provides a cheap way of avoiding scanning the heap as part of a become (which is the simple approach to implementing become in a system with direct pointers). A becomeForward: on a (set of) non-zero-sized object(s) turns the object into a "corpse" or "forwarding object" whose first (non-header) word/slot is replaced by a pointer to the target of the becomeForward:. The corpse's class index is set to one that identifies corpses (let's say classIndex 1), and, because it is a special, hidden class index, will always fail an inline cache test. The inline cache failure code is then responsible for following the forwarding pointer chain (these are Iliffe vectors :) ) and resolving to the actual target. (In the interpreter there needs to be a similar check when probing the method cache). It has yet to be determined exactly how this is done (e.g. change the receiver register and/or stack contents and retry the send, perhaps scanning the current activation).
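As a minimal sketch of the corpse scheme just described, the Python model below treats a forwarded object as one whose class index is the reserved forwarding index and whose first slot points at the target. The Obj class, the helper names and the choice of class index 1 (taken from the text's "let's say") are assumptions made for illustration; this is not how the VM itself is written.

```python
from dataclasses import dataclass, field
from typing import Any, List

# Assumed value, following the text's "let's say classIndex 1".
FORWARDED_CLASS_INDEX = 1


@dataclass
class Obj:
    """Hypothetical heap object: a class index plus a list of slots."""
    class_index: int
    slots: List[Any] = field(default_factory=list)


def become_forward(obj: Obj, target: Obj) -> None:
    """Turn obj into a corpse whose first slot points at target."""
    if not obj.slots:
        raise ValueError("model requires room for a forwarding pointer")
    obj.class_index = FORWARDED_CLASS_INDEX
    obj.slots[0] = target


def follow_forwarded(obj: Obj) -> Obj:
    """Follow the forwarding chain until a non-corpse is reached."""
    while obj.class_index == FORWARDED_CLASS_INDEX:
        obj = obj.slots[0]
    return obj


def inline_cache_hit(cached_class_index: int, receiver: Obj) -> bool:
    """A corpse never hits, provided no real class is cached under the hidden index."""
    return receiver.class_index == cached_class_index


def send(cached_class_index: int, receiver: Obj):
    """On a miss, resolve forwarding and retry the cache probe once."""
    if inline_cache_hit(cached_class_index, receiver):
        return receiver, True
    receiver = follow_forwarded(receiver)
    return receiver, inline_cache_hit(cached_class_index, receiver)
```

The retry-after-miss shape in send is only one of the options the text mentions; how the receiver register and stack contents are fixed up is left open in the design.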
See become read barrier below on how we deal with becomes on objects with named inst vars. We insist that objects are at least 16 bytes in size (see 8-byte alignment below) so that there will always be space for a forwarding pointer. Since none of the immediate classes can have non-immediate instances and since we allocate the immediate class indices corresponding to their tag pattern (SmallInteger = 1 & 3, Character = 2, SmallFloat = 5?) we can use all the class indices from 0 to 7 for special uses, 0 = free, and e.g. 1 = isForwarded. In general what's going on here is the implementation of a partial read barrier. Certain operations require a read barrier to ensure access of the target of the forwarding corpse, not the corpse itself. Read barriers stink (have poor performance), so we must restrict the read barrier to as few places as possible. See become read barrier below. See http://www.mirandabanda.org/cogblog/2013/09/13/lazy-become-and-a-partial-read-barrier/ & http://www.mirandabanda.org/cogblog/2014/02/08/primitives-and-the-partial-read-barrier/.

- pinning. To support a robust and easy-to-use FFI the memory manager must support temporary pinning where individual objects can be prevented from being moved by the GC for as long as required, either by being one of an in-progress FFI call's arguments, or by having pinning asserted by a primitive (allowing objects to be passed to external code that retains a reference to the object after returning). Pinning probably implies a per-object "is-pinned" bit in the object header. Pinning will be done via lazy become; i.e. an object in new space will be becommed into a pinned object in old space. We will only support pinning in old space.

- efficient old space collection. An incremental collector (a la Dijkstra's three colour algorithm) collects old space, e.g. via an amount of tracing being hung off scavenges and/or old space allocations at an adaptive rate that keeps full garbage collections to a minimum.
It may also be possible to provide cheap compaction by using lazy become: and best-fit (see free space/free list below).

- 8-byte alignment. It is advantageous for the FFI, for floating-point access, for object movement and for 32/64-bit compatibility to keep object sizes in units of 8 bytes. For the FFI, 8-byte alignment means passing objects to code that expects that requirement (such as modern x86 numeric processing instructions). This implies that
    - the starts of all spaces are aligned on 8-byte boundaries
    - object allocation rounds up the requested size to a multiple of 8 bytes
    - the overflow size field is also 8 bytes
We shall probably keep the minimum object size at 16 bytes so that there is always room for a forwarding pointer. But this implies either that we round object lengths up to units of 16 bytes (current choice) or that we need to implement an 8-byte filler to fill holes between objects > 16 bytes whose length mod 16 bytes is 8 bytes and following pinned objects. We can do this using a special class index, e.g. 1, so that the method that answers the size of an object looks like, e.g.
    chunkSizeOf: oop
        ^oop classIndex = 1
            ifTrue: [BaseHeaderSize]
            ifFalse: [BaseHeaderSize
                      + (oop slotSize = OverflowSlotSize
                            ifTrue: [OverflowSizeBytes]
                            ifFalse: [0])
                      + (oop slotSize * BytesPerSlot)]

    chunkStartOf: oop
        ^(self cCoerceSimple: oop to: #'char *')
            - ((oop classIndex = 1
                or: [oop slotSize ~= OverflowSlotSize])
                    ifTrue: [0]
                    ifFalse: [OverflowSizeBytes])
Note that the size field of an object (its slot size) reflects the logical size of the object (e.g. 0-element array => 0 slot size, 1-element array => 1 slot size). The memory manager rounds up the slot size as appropriate, e.g. (self roundUp: (self slotSizeOf: obj) * 4 to: 8) max: 8.

Heap growth and shrinkage will be handled by allocating and deallocating heap segments from/to the OS via e.g. memory-mapping. This technique allows space to be released back to the OS by unmapping empty segments.
(See "Segmented Old Space" below.)

The basic approach is to use a fixed size new space and a growable old space. The new space is a classic three-space nursery a la Ungar's Generation Scavenging, a large eden for new objects and two smaller survivor spaces that exchange roles on each collection, one being the to space to which surviving objects are copied, the other being the from space of the survivors of the previous collection, i.e. the previous to space. (This basic algorithm is extended to handle weak arrays and ephemerons).

To provide apparent pinning in new space we rely on lazy become. Since most pinned objects will be byte data and these do not require stack zone activation scanning, the overhead is simply an old space allocation and corpsing.

To provide pinning in old space, large objects are implicitly pinned (because it is expensive to move large objects and, because they are both large and relatively rare, they contribute little to overall fragmentation - as in aggregates, small objects can be used to fill-in the spaces between large objects). Hence, objects above a particular size are automatically allocated in old space, rather than new space. Small objects are pinned as per objects in new space, by asserting the pin bit, which will be set automatically when allocating a large object. As a last resort, or by programmer control (the fullGC primitive) old space is collected via mark-sweep (mark-compact) and so the mark phase must build the list of pinned objects around which the sweep/compact phase must carefully step.

Free space in old space is organized by a number of free lists and a free tree. There are 32 or 64 free lists, depending on word size, indices 1 through wordSize - 1 holding blocks of space of the index * allocationUnit, index 0 holding a semi-balanced ordered tree of free blocks, each node being the head of the list of free blocks of that size.
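A minimal sketch of the free-list indexing described above, together with the occupancy bitmap that the document goes on to describe. The FreeLists class, its method names, the 8-byte allocation unit and the plain Python list standing in for the tree at index 0 are all assumptions for illustration, not the VM's actual data structures.

```python
ALLOCATION_UNIT = 8  # bytes; assumed from the 8-byte alignment objective


def free_list_index(chunk_bytes: int, word_size: int = 32) -> int:
    """Lists 1..word_size-1 hold chunks of index * ALLOCATION_UNIT bytes;
    index 0 stands for the tree of larger chunks."""
    units = chunk_bytes // ALLOCATION_UNIT
    return units if 1 <= units < word_size else 0


class FreeLists:
    """Hypothetical model: one Python list per free list, plus a bitmap
    with bit i set when list i is non-empty."""

    def __init__(self, word_size: int = 32):
        self.word_size = word_size
        self.lists = [[] for _ in range(word_size)]  # lists[0] models the tree
        self.occupancy = 0

    def add_chunk(self, address: int, chunk_bytes: int) -> None:
        i = free_list_index(chunk_bytes, self.word_size)
        self.lists[i].append((address, chunk_bytes))
        self.occupancy |= 1 << i

    def smallest_fit_index(self, request_bytes: int):
        """Index of the smallest non-empty fixed-size list that can hold the
        request, or None if only the tree (index 0) could satisfy it."""
        units = max(1, -(-request_bytes // ALLOCATION_UNIT))  # round up
        if units >= self.word_size:
            return None
        # Clear bit 0 (the tree) and every list too small for the request.
        candidates = self.occupancy & ~((1 << units) - 1)
        if candidates == 0:
            return None
        # Lowest set bit is the smallest adequate list; no list walk needed.
        return (candidates & -candidates).bit_length() - 1
```

The point of the bitmap is visible in smallest_fit_index: one mask and one lowest-set-bit operation replace a scan over all larger free list heads.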
At the start of the mark phase the free list is thrown away and the sweep phase coalesces free space and steps over pinned objects as it proceeds. We can reuse the forwarding pointer compaction scheme used in the old collector. Incremental collections merely move unmarked objects to the free lists (as well as nilling weak pointers in weak arrays and scheduling them for finalization). The occupancy of the free lists is represented by a bitmap in an integer so that an allocation of size wordSize - 1 or less can know whether there exists a free chunk of that size, but more importantly can know whether a free chunk larger than it exists in the fixed size free lists without having to search all larger free list heads.

Incremental Old Space Collection

The incremental collector (a la Dijkstra's three colour algorithm) collects old space via an amount of tracing being hung off scavenges and/or old space allocations at an adaptive rate that keeps full garbage collections to a minimum. [N.B. Not sure how to do this yet. The incremental collector needs to complete a pass often enough to reclaim objects, but infrequently enough not to waste time. So some form of feedback should work. In VisualWorks tracing is broken into quanta of work where image-level code determines the size of a quantum based on how fast the machine is, and how big the heap is. This code could easily live in the VM, controllable through vmParameterAt:put:. An alternative would be to use the heartbeat to bound quanta by time. But in any case some amount of incremental collection would be done on old space allocation and scavenging, the amount being chosen to keep pause times acceptably short, and at a rate to reclaim old space before a full GC is required, i.e. at a rate proportional to the growth in old space]. The incremental collector is a state machine, being either marking, nilling weak pointers, or freeing.
If nilling weak pointers is not done atomically then there must be a read barrier in weak array at: so that reading from an old space weak array that is holding stale un-nilled references to unmarked objects does not answer those references. Tricks such as including the weak bit in bounds calculations can make this cheap for non-weak arrays. Alternatively nilling weak pointers can be made an atomic part of incremental collection, which can be made cheaper by maintaining the set of weak arrays (e.g. on a list). Note that the incremental collector also follows (and eliminates) forwarding pointers as it scans.

The incremental collector implies a more complex write barrier. Objects are of three colours, black, having been scanned, grey, being scanned, and white, unreached. A mark stack holds the grey objects. If the incremental collector is marking and an unmarked white object is stored into a black object then the stored object must become grey, being added to the mark stack. So the write barrier is essentially
    target isYoung ifFalse:
        [newValue isYoung
            ifTrue: [target isInRememberedSet ifFalse:
                        [target addToRememberedSet]] "target now refers to a young object; it is a root for scavenges"
            ifFalse:
                [(target isBlack
                  and: [igc marking
                  and: [newValue isWhite]]) ifTrue:
                    [newValue beGrey]]] "add newValue to IGC's markStack for subsequent scanning"

The incremental collector does not detect already marked objects all of whose references have been overwritten by other stores (e.g. in the above if newValue overwrites the sole remaining reference to a marked object). So the incremental collector only guarantees to collect all garbage created in cycle N at the end of cycle N + 1. The cost is hence slightly worse memory density but the benefit, provided the IGC works hard enough, is the elimination of long pauses due to full garbage collections, which become actions of last resort or programmer desire.

Incremental Best-Fit Compaction

The free list also looks like it allows efficient incremental compaction.
Currently in the 32-bit implementation, but easily affordable in the 64-bit implementation, objects have at least two fields, the first one being a forwarding pointer, the second one rounding up to 8-byte object alignment. On the free list the first field is used to link the list of free chunks of a given size. The second field could be used to link free chunks in memory order. And of course the last free chunk is the chunk before the last run of non-free objects. We compact by
a) keeping each free list in memory order (including the lists of free chunks off each node in the large free chunk tree)
b) sorting the free chunks in memory order by merge sorting the free lists
c) climbing the free list in memory order. For each free chunk in the free list search memory from the last free chunk to the end (and from the previous chunk to the next chunk, and so on) looking for a best-fit live object. That object is then copied into the free chunk, and its corpse is turned into a forwarding pointer.
This works because the compaction does not slide objects, and hence no forwarding blocks are necessary and the algorithm can be made incremental. Various optimizations are possible, e.g. using a bitmap to record the sizes of the first few free chunks on the list when looking for best fits. The assumptions being
a) the number of objects on the free list is kept small because the IGC incrementally compacts, and so sorting and searching the list is not expensive
b) the incremental collector's following of forwarding pointers reclaims the corpses at the end of memory at a sufficient rate to keep the free list small
c) the rounding of objects to an 8-byte alignment improves the chances of finding a best fit.
Note that this incremental collection is easily amenable to leaving pinned objects where they are; they are simply filtered out when looking for a best fit.

Segmented Old Space

A segmented oldSpace is useful.
It allows growing oldSpace incrementally, adding a segment at a time, and freeing empty segments. But such a scheme is likely to introduce complexity in object enumeration and compaction (enumeration would appear to require visiting each segment, compaction must be within a segment, etc). One idea that might fly to allow a segmented old space that appears to be a single contiguous space is to use fake pinned objects to bridge the gaps between segments. The last two words of each segment can be used to hold the header of a pinned object whose size is the distance to the next segment. The pinned object's classIndex can be one of the puns so that it doesn't show up in allInstances; this can perhaps also indicate to the incremental collector that it is not to reclaim the object, etc. However, free objects would need a large enough size field to stretch across large gaps in the address space. The current design limits the overflow size field to a 32-bit slot count, which wouldn't be large enough in a 64-bit implementation. The overflow size field is at most 7 bytes since the overflow size word also contains a maxed-out 8-bit slot count (for object parsing). A byte can be stolen from e.g. the identityHash field to beef up the size to a full 64-bits.

Lazy become & the become read barrier

(see http://www.mirandabanda.org/cogblog/2013/09/13/lazy-become-and-a-partial-read-barrier/ & http://www.mirandabanda.org/cogblog/2014/02/08/primitives-and-the-partial-read-barrier/).

As described earlier the basic idea behind lazy become is to use corpses (forwarding objects) that are followed lazily during GC and inline cache miss. However, a lazy scheme would appear to require a read barrier to avoid accessing the corpse and make sure we follow the forwarding pointer. Without hardware support read barriers have poor performance, so we must restrict the read barrier as much as possible.
The main goal is to avoid having to scan all of the heap to fix up pointers, as is done with ObjectMemory. We're happy to do some scanning of a small subset of the heap, but become: cannot scale to large heaps if it must scan the entire heap. Objects with named inst vars and CompiledMethods are accessed extensively by the interpreter and jitted code. We must avoid as much checking of such accesses as possible; we judge an explicit read barrier on all accesses far too expensive. The accesses the VM makes which notionally require a read barrier are:
- inst vars of thisContext, including stack contents (the receiver of a message and its arguments), which in Cog are the fields of the current stack frame, and the sender chain during (possibly non-local) return
- named inst vars of the receiver
- literals of the current method, in particular variable bindings (a.k.a. literal variables which are global associations), including the methodClass association.
- in primitives, the receiver and arguments, including possible sub-structure.

We have already discussed that we will catch an attempt to create a new activation on a forwarded object through method lookup failing for forwarded objects. This would occur when e.g. some object referenced by the receiver via its inst vars is pushed on the stack as a message receiver, or e.g. answered as the result of some primitive which accesses object state such as primitive at:. So there is no need for a read barrier when accessing a new receiver or returning its state. But there must presumably be a read barrier in primitives that inspect that sub-state. However, we can easily avoid read barriers in direct literal access, and class hierarchy walking and message dictionary searching in message lookup. Whenever the become operation becomes one or more pointer objects (and it can e.g. cheaply know if it has becommed a CompiledMethod) both the class table and the stack zone are scanned.
In the class table we can follow the forwarding pointers in all classes in the table, and we can follow their superclasses. But we would like to avoid scanning classes many times. Any superclass that has a non-zero hash must be in the table and will be encountered during the scan of the class table. Any superclass with a zero hash can be entered into the table at a point subsequent to the current index and hence scanned. The class scan can also scan method dictionary selectors and follow the method dictionary's array reference (so that dictionaries are valid for lookup) and scan the dictionary's method array iff a CompiledMethod was becommed. (Note that support for object-as-method implies an explicit read barrier in primitiveObjectAsMethod, which is a small overhead there-in).

Accessing possibly forwarded method literals is fine; these forwarding objects will be pushed on the stack and caught either by the message lookup trap or explicitly in primitives that access arguments. However, push/store/pop literal variable cannot avoid an explicit check without requiring we scan all methods, which will be far too expensive. To avoid a check on super send when accessing a method's method class association, we must check the method class associations of any method in the stack zone, and in the method of any context faulted into the stack zone on return. We avoid a read barrier on access to receiver inst vars by scanning the stack zone and following pointers to the receiver. And of course, all of the scavenger, the incremental scan-mark-compactor and the global garbage collector follow and eliminate forwarding pointers as part of their object graph traversals.

This means explicit read barriers in
- push/store/pop literal variable
- return (accessing the sender context, its inst vars, and the method class association of its method)
- primitives that inspect the class and/or state of their arguments, excepting immediates. e.g.
in at:put: (almost) no checks are required because the receiver will have been caught by the message send trap, the index is a SmallInteger and the argument is either an immediate Character (in String>>at:put:) or a possibly forwarded object stored into an array; i.e. the argument's state is inspected only if it is an immediate (the exception is 64-bit indexable and 32-bit indexable floats & bitmaps which could take LargeIntegers whose contents are copied into the relevant field). But e.g. in beCursor extensive checks are required because the primitive inspects a couple of form instances, and a point that are sub-state of the Cursor object.

One approach would be an explicit call in the primitive, made convenient via providing something like ensureNoForwardingPointersIn: obj toDepth: n, which in beCursor's case would look like interpreter ensureNoForwardingPointersIn: cursorObj toDepth: 3 (the offset point of the mask form). Another approach would be to put an explicit read barrier in store/fetchPointer:ofObject:[withValue:] et al, but provide an additional api (e.g. store/fetchPointer:ofNonForwardedObject:[withValue:] et al) and use it in the VM's internals. The former approach is error-prone, but the latter approach is potentially ugly, touching nearly all of the core VM code. It would appear that one of these two approaches must be chosen.

61-bit immediate Floats

Representation for immediate doubles, only used in the 64-bit implementation. Immediate doubles have the same 52 bit mantissa as IEEE double-precision floating-point, but only have 8 bits of exponent. So they occupy just less than the middle 1/8th of the double range. They overlap the normal single-precision floats which also have 8 bit exponents, but exclude the single-precision denormals (exponent -127) and the single-precision NaNs (exponent +127). +/- zero is just a pair of values with both exponent and mantissa 0.
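A minimal Python sketch of this representation, using the layout given earlier in the document (8-bit exponent, 52-bit mantissa, sign bit, 3 tag bits). The function names are hypothetical, the tag value 5 is an assumption (the document itself only says "assuming the tag is 5"), and the exponent offset of 896 (0x380) is derived from the range the document gives for non-zero immediate doubles rather than stated by it.

```python
import struct

TAG = 5                # assumed tag pattern
TAG_BITS = 3
EXPONENT_OFFSET = 896  # 0x380: lowest 11-bit exponent field that is representable
MASK64 = (1 << 64) - 1


def double_bits(x: float) -> int:
    """Raw IEEE 754 bits of a Python float as an unsigned 64-bit integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def bits_double(bits: int) -> float:
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


def encode(x: float):
    """Answer the tagged 64-bit word for x, or None if x needs a boxed Float."""
    bits = double_bits(x)
    rot = ((bits << 1) | (bits >> 63)) & MASK64  # rotate sign into the lsb
    if rot > 1:                                  # non-zero value
        exponent = rot >> 53
        if not (EXPONENT_OFFSET <= exponent <= EXPONENT_OFFSET + 255):
            return None                          # exponent does not fit in 8 bits
        rot -= EXPONENT_OFFSET << 53             # rebase exponent onto 8 bits
        if rot <= 1:
            return None                          # would collide with +/- 0
    return ((rot << TAG_BITS) | TAG) & MASK64


def decode(word: int) -> float:
    rot = word >> TAG_BITS                       # shift away tags
    if rot > 1:                                  # non-zero value
        rot += EXPONENT_OFFSET << 53             # restore 11-bit exponent
    bits = (rot >> 1) | ((rot & 1) << 63)        # rotate sign back to the msb
    return bits_double(bits)
```

With these assumptions, encode(0.0) is 0x5 and encode(-0.0) is 0xd, so both zeros satisfy the unsigned "less than 16" test mentioned earlier, and any double outside the 8-bit exponent window (for example 1e300) is rejected and would have to be boxed.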
So the non-zero immediate doubles range from
        +/- 0x3800,0000,0000,0001 / 5.8774717541114d-39
to      +/- 0x47ff,ffff,ffff,ffff / 6.8056473384188d+38
The encoded tagged form has the sign bit moved to the least significant bit, which allows for faster encode/decode because offsetting the exponent can't overflow into the sign bit and because testing for +/- 0 is an unsigned compare for <= 0xf:
    msb                                                            lsb
    [8 exponent subset bits][52 mantissa bits][1 sign bit][3 tag bits]
So assuming the tag is 5, the tagged non-zero bit patterns are
        0x0000,0000,0000,001[d/5]
to      0xffff,ffff,ffff,fff[d/5]
and +/- 0d is 0x0000,0000,0000,000[d/5]
Encode/decode of non-zero values in machine code looks like:
                          msb                                  lsb
    Decode:               [8expsubset][52mantissa][1s][3tags]
    shift away tags:      [ 000 ][8expsubset][52mantissa][1s]
    add exponent offset:  [    11 exponent   ][52mantissa][1s]
    rot sign:             [1s][    11 exponent   ][52mantissa]

    Encode:               [1s][    11 exponent   ][52mantissa]
    rot sign:             [    11 exponent   ][52mantissa][1s]
    sub exponent offset:  [ 000 ][8expsubset][52mantissa][1s]
    shift:                [8expsubset][52mantissa][1s][ 000 ]
    or/add tags:          [8expsubset][52mantissa][1s][3tags]
but is slower in C because
a) there is no rotate, and
b) raw conversion between double and quadword must (at least in the source) move bits through memory (quadword = *(q64 *)&doubleVariable).

Heap Walking

In heap walking the memory manager needs to be able to detect the start of the next object. This is complicated by the short and long header formats, short being for objects with 254 slots or fewer, long being for objects with 255 slots or more. Since an object that has an overflow header must have 255 as its header slot count we can use this as the marker. The overflow header word also has a numSlots field, set to 255.
The remainder of the overflow size field is used for the object's slot size, the least significant word in 32-bits (for 2^34 bytes, more than the address space), the remaining 56 bits in 64-bits (for 2^59 bytes, which we hope is big enough for bridge objects). So if the word following an object contains 255 in its numSlots field, it must be the overflow size word of an object with a double header, and the word after that is the header, also with a saturated numSlots field.

Total Number of Classes and Instance-specific Behaviours

While the class index header field has advantages (saving significant header space, especially in 64-bits, providing a non-moving cache tag for inline caches, small constants for instantiating well-known classes instead of having to fetch them from a table such as the specialObjectsArray) it has the downside of limiting the number of classes. For Smalltalk programs 2^20 to 2^24 classes is adequate for some time to come, but for prototype languages such as JavaScript this is clearly inadequate, and we would like to support the ability to host prototype languages within Squeak. There is a solution in the form of "auto-instances", an idea of Claus Gittinger's. The idea is to represent prototypes as behaviors that are instances of themselves. In a classical Smalltalk system a Behavior is an object with the minimal amount of state to function as a class, and in Smalltalk-80 this state is the three instance variables of Behavior, superclass, methodDict and format, which are the only fields in a Behavior that are known to the virtual machine.
A prototype can therefore have its own behavior and inherit from other prototypes or classes, and have sub-prototypes derived from it if a) its first three instance variables are also superclass, methodDict, and format, and b) it is an instance of itself (one can create such objects in a normal Smalltalk system by creating an Array with the desired layout and using a change class primitive to change the class of the Array to itself). The same effect can be achieved in a VM with class indexes by reserving one class index to indicate that the object is an instance of itself, hence not requiring the object be entered into the class table and in the code that derives the class of an object, requiring one simple test answering the object itself instead of indexing the class table. There would probably need to be an auto-instantiation primitive that takes a behavior (or prototype) and an instance variable count and answers a new auto-instance with as many instance variables as the sum of the behavior (or prototype) and the instance variable count. Using this scheme there can be as many auto-instances as available address space allows while retaining the benefits of class indices. This scheme has obvious implications for the inline cache since all prototypes end up having the same inline cache tag. Either the inline cache check checks for the auto-instance class tag and substitutes the receiver, or the cacheing machinery refuses to add the auto-instance class tag to any inline cache and failure path code checks for the special case. Note that in V8 failing monomorphic sends are patched to open PICs (megamorphic sends); V8 does not use closed PICs due to the rationale that polymorphism is high in JavaScript. Miscellanea In-line cache probe for immediates We would like to keep 31-bit SmallIntegers for the 32-bit system. Lots of code could be impacted by a change to 30-bit SmallIntegers. 
If so, then
        isImmediate: oop        ^(oop bitAnd: 3) ~= 0
        isSmallInteger: oop     ^(oop bitAnd: 1) ~= 0
        isCharacter: oop        ^(oop bitAnd: 2) = 2
If the in-line cache contains 0 for Characters then the in-line cache code (in x86 machine code) reads as follows, for a branch-free common-case:
        Limm:
                andl $0x1, %eax
                jmp Lcmp
                nops
        Lentry:
                movl %edx, %eax
                andl $0x3, %eax
                jnz Limm
                movl 0(%edx), %eax
                andl $0x3fffff, %eax
        Lcmp:
                cmpl %ecx, %eax
                jnz Lmiss

64-Bit sizes:
We extend the original Squeak 4-bit format field to 5 bits, providing space for 3 odd bits for byte objects (2 for short objects & 1 for 32-bit long objects). Object sizes are slots, and byte (and short and 32-bit) lengths are computed by subtracting the odd bits from the shifted slot length. Keeping the format field saves bits because it subsumes the isWeak,isEphemeron,isPointer bits that would be necessary otherwise. The format field is organized as follows:
        0 = 0 sized objects (UndefinedObject True False et al)
        1 = non-indexable objects with inst vars (Point et al)
        2 = indexable objects with no inst vars (Array et al)
        3 = indexable objects with inst vars (MethodContext AdditionalMethodState et al)
        4 = weak indexable objects with inst vars (WeakArray et al)
        5 = weak non-indexable objects with inst vars (ephemerons) (Ephemeron)
        6 unused, reserved for exotic pointer objects? e.g. contexts?
        7 Forwarded Object, 1st field is pointer, rest of fields are ignored
        8 unused, reserved for exotic non-pointer objects?
        9 64-bit indexable
        10 - 11 32-bit indexable (11 unused in 32 bits)
        12 - 15 16-bit indexable (14 & 15 unused in 32-bits)
        16 - 23 byte indexable (20-23 unused in 32-bits)
        24 - 31 compiled method (28-31 unused in 32-bits)
-------------------------------
Jun 28 2014 Newspeak Cog VM binaries as per VMMaker.oscog-eem.335/r2778 (highlights marked by *) Fix snafu in unpairedMethodList maintenance during code compaction. Must only examine CMMethods.
Fix snafu in CogMethodZone>>removeFromUnpairedMethodList: which can create an infinite loop in code compaction. * Speed-up Newspeak significantly (e.g. -28% in one compile-intensive benchmark with 4m code zone) by maintaining unpaired methods (compilations of anonymous accessors) on a linked list instead of searching the entire method zone. Do this by adding a NewspeakCogMethod and surrogates that add a nextMethod field. Update surrogate accessor generation so that accessors can answer and receive surrogates and nils. Also add a nop before the dynSuperEntry so as to change its alignment now that (in the Newspeak VM) CogMethod has changed size in gaining a field. Eliminate CogStackPage, collapsing it onto InterpreterStackPage (it only has a class side). Use temporaryCountOfMethodHader: instead of tempCountOf: in SimpleStackBasedCogit>>compileFrameBuild (header is in hand). Set the inBlock variable in scanBlock and scanMethod. Change the argument to the needsFrameFunction to be the stack delta. Change genSpecialSelectorClass's needsFrameFunction to needsFrameIfStackGreaterThanOne: which better handles e.g. TextColor>>dominates: other ^self class == other class than needsFrameIfFollowsSend:. Fix StackToRegisterMappingCogit>>genPrimitiveClass for case where numArgs > 0 (must use other than ReceiverResultReg, e.g. objectClass:). Improve register usage in genSpecialSelectorClass. Save & restore methodOrBlockNumTemps in compileBlockBodies for symmetry with needsFrame & methodOrBlockNumArgs. Slang: fix formatting of WhileForeverBreakIf loops. Tweak startPCOrNilOfLiteral:in: to filter-out arrays from its check for closure literals. Tweak frameless inst var store to avoid a register copy from Arg0Reg. * Make primitiveObjectAtPut fail if changing the header word and the new header has a different literal count. Avoids crashing the VM when inadvertently changing the header, as a Newspeak bootstrap did recently. * Implement frameless inst var store from arguments, so e.g. 
Point>>setX:Y: is frameless. * Add Cogit support for clean blocks by scanning literals looking for BlockClosures on the current method. Add a prevBCDescriptor to the StackToRegisterMappingCogit scanners. This enables needsFrameIfFollowsSend:. Avoid spilling in frameless methods (now that pushTempVar can be frameless and hence inst var setters such at setX:Y: are now frameless) by adding needsFrameIfFollowsSend: used for special selectors #class & #==. Merge 2763. -memory and SQUEAK_MEMORY handle more than 31 bits of extra memory on 64-bit architectures. ------------------------------- Jun 28 2014 Newspeak Cog VM binaries as per VMMaker.oscog-eem.334/r2777 (highlights marked by *) Fix snafu in CogMethodZone>>removeFromUnpairedMethodList: which can create an infinite loop in code compaction. * Speed-up Newspeak significantly (e.g. -28% in one compile-intensive benchmark) by maintaining unpaired methods (compilations of anonymous accessors) on a linked list instead of searching the entire method zone. Do this by adding a NewspeakCogMethod and surrogates that add a nextMethod field. Update surrogate accessor generation so that accessors can answer and receive surrogates and nils. Also add a nop before the dynSuperEntry so as to change its alignment now that (in the Newspeak VM) CogMethod has changed size in gaining a field. Eliminate CogStackPage, collapsing it onto InterpreterStackPage (it only has a class side). Use temporaryCountOfMethodHader: instead of tempCountOf: in SimpleStackBasedCogit>>compileFrameBuild (header is in hand). Set the inBlock variable in scanBlock and scanMethod. Change the argument to the needsFrameFunction to be the stack delta. Change genSpecialSelectorClass's needsFrameFunction to needsFrameIfStackGreaterThanOne: which better handles e.g. TextColor>>dominates: other ^self class == other class than needsFrameIfFollowsSend:. 
Fix StackToRegisterMappingCogit>>genPrimitiveClass for case where numArgs > 0 (must use other than ReceiverResultReg, e.g. objectClass:). Improve register usage in genSpecialSelectorClass. Save & restore methodOrBlockNumTemps in compileBlockBodies for symmetry with needsFrame & methodOrBlockNumArgs. Slang: fix formatting of WhileForeverBreakIf loops. Tweak startPCOrNilOfLiteral:in: to filter-out arrays from its check for closure literals. Tweak frameless inst var store to avoid a register copy from Arg0Reg. * Make primitiveObjectAtPut fail if changing the header word and the new header has a different literal count. Avoids crashing the VM when inadvertently changing the header, as a Newspeak bootstrap did recently. * Implement frameless inst var store from arguments, so e.g. Point>>setX:Y: is frameless. * Add Cogit support for clean blocks by scanning literals looking for BlockClosures on the current method. Add a prevBCDescriptor to the StackToRegisterMappingCogit scanners. This enables needsFrameIfFollowsSend:. Avoid spilling in frameless methods (now that pushTempVar can be frameless and hence inst var setters such at setX:Y: are now frameless) by adding needsFrameIfFollowsSend: used for special selectors #class & #==. Merge 2763. -memory and SQUEAK_MEMORY handle more than 31 bits of extra memory on 64-bit architectures. ------------------------------- Jun 28 2014 CogVM binaries as per VMMaker.oscog-eem.333/r2776 (highlights marked by *) * Speed-up Newspeak significantly (e.g. -28% in one compile-intensive benchmark) by maintaining unpaired methods (compilations of anonymous accessors) on a linked list instead of searching the entire method zone. Do this by adding a NewspeakCogMethod and surrogates that add a nextMethod field. Update surrogate accessor generation so that accessors can answer and receive surrogates and nils. 
Also add a nop before the dynSuperEntry so as to change its alignment now that (in the Newspeak VM) CogMethod has changed size in gaining a field. Eliminate CogStackPage, collapsing it onto InterpreterStackPage (it only has a class side). Use temporaryCountOfMethodHader: instead of tempCountOf: in SimpleStackBasedCogit>>compileFrameBuild (header is in hand). Set the inBlock variable in scanBlock and scanMethod. Change the argument to the needsFrameFunction to be the stack delta. Change genSpecialSelectorClass's needsFrameFunction to needsFrameIfStackGreaterThanOne: which better handles e.g. TextColor>>dominates: other ^self class == other class than needsFrameIfFollowsSend:. Fix StackToRegisterMappingCogit>>genPrimitiveClass for case where numArgs > 0 (must use other than ReceiverResultReg, e.g. objectClass:). Improve register usage in genSpecialSelectorClass. Save & restore methodOrBlockNumTemps in compileBlockBodies for symmetry with needsFrame & methodOrBlockNumArgs. Slang: fix formatting of WhileForeverBreakIf loops. Tweak startPCOrNilOfLiteral:in: to filter-out arrays from its check for closure literals. Tweak frameless inst var store to avoid a register copy from Arg0Reg. * Make primitiveObjectAtPut fail if changing the header word and the new header has a different literal count. Avoids crashing the VM when inadvertently changing the header, as a Newspeak bootstrap did recently. * Implement frameless inst var store from arguments, so e.g. Point>>setX:Y: is frameless. * Add Cogit support for clean blocks by scanning literals looking for BlockClosures on the current method. Add a prevBCDescriptor to the StackToRegisterMappingCogit scanners. This enables needsFrameIfFollowsSend:. Avoid spilling in frameless methods (now that pushTempVar can be frameless and hence inst var setters such at setX:Y: are now frameless) by adding needsFrameIfFollowsSend: used for special selectors #class & #==. Merge 2763. 
-memory and SQUEAK_MEMORY handle more than 31 bits of extra memory on 64-bit architectures. ------------------------------- Jun 28 2014 CogVM binaries as per VMMaker.oscog-eem.317/r2761 Fix bug in transferTo:(from:) when doing a code compaction when ensuring there is a machine code method when switching to a process whose context has a machine code pc. Limit the amount of space the Cogit will stack allocate when compiling. This limits the maximum number of bytecodes in a method that the Cogit will compile. Currently set at 1.5Mb of stack space from empirical tests of alloca on Mac OS X 10.6, linux 2.6 & Windows XP. Fix become for cog methods that are not paired with their bytecoded methods (e.g. Newspeak accessors). Correct several uses of literalCountOf:, using LiteralStart instead of 1, and BytesPerOop instead of BytesPerWord. Eliminate dead code around contextInstructionPointer:context:. Eliminate duplicate methodClass asserts in ce*(Send: and simplify some in code compaction & code freeing. Don't inline freeStackPage:. Revise the inlining change. Global vars passed as parameters must not be read after any non-trivial call. Use CCodeGenerator>>isAssertSelector: to check for all assert: calls (these are not inlined). Hence fix assert:l: uses. Add an assert to commenceCogCompiledCodeCompaction to catch the actual bug (pushing the instructionPointer twice). Improve inlining via inlineSend:directReturn:exitVar:in: by refactoring argAssignmentsFor:args:in:'s innards. Now global variables are inlined if they are only read within the code being inlined. Implement warningat in terms of warning so one only has to remember to set a breakpoint in warning, not both. Add tracing of GCs and code compactions to primTraceLog. primitiveTerminateTo needs one more assertValidStackedInstructionPointers:. Fix some assert:s that should be assert:l:s. Use assertValidStackedInstructionPointers: in primitiveTerminateTo.
Fix the assert to use framePointer when on current page and instructionPointer ~= 0. Fix assertValidStackedInstructionPointersIn:line: usage in commenceCogCompiledCodeCompaction. Simplify relocateCallBeforeReturnPC:by: and elide bogus use of signedIntToLong there-in. Add a guard to findClassOfMethod:forReceiver:. Add an assert that checks all the instructionPointers in all stack pages. Use the assert in code compaction (tracking down a rare crash at Cadence). Beef up the assertValidExecutionPointe:r:s:imbar:line: assert for interpreted frames (i.e. check savedIP if pc is ceReturnToInterpreter). Freshen the translation of some of the VM plugins (for loop limit issue). Add new run & single-step prims to the processor simulator plugins so access to memory past freeStart can be checked if needed. Add an assertValidExecutionPointers to the front side of process switch. Merge initializeCompilationWithConstantsOptions: into initializeMiscConstantsWith:. Make deferStackLimitSmashAround:[with:] answer true so it can be invoked in an assert and hence be optimized out in a non-assert VM, hence optimizing away assertValidExecutionPointe:r:s:imbar:line:. Add assert:l: and asserta:l: which take line numbers. Refactor assertValidExecutionPointe:r:s:imbar: to take a line number and supply it to assert:l: et al, for more informative assert failures. Use sqLowLevelMFence in deferStackLimitSmashAround:with: et al. Refactor preambleCCode emission so that a comment indicating its source is generated. Eliminate some compiler warnings in pathTo:using:followWeak:. ------------------------------- Jun 28 2014 CogVM binaries as per VMMaker.oscog-eem.302/r2749. Fix bad performance regression that on certain platforms (linux) results in all send misses causing a discarded PIC creation followed by a slow hash lookup. Update the BitBltPlugin to include the fast Arm ASM option. Fix type errors in the Cogit that prevent the Cogit working when compiled with clang.
Specifically void * pointers are not comparable. Make sure that fetchPointer:ofObject: & isIntegerValue: are declared in cointerp.h. Fix bug when assigning to some context inst vars from generated methods. Add accessors to document the context inst var access scheme. Fix a compiler warning comparing an error code in cog:selector: Update the Newspeak version of the VMProfileLinuxSupportPlugin. Improve robustness of the nscogbuild/unixbuild mvm scripts. Add the linux policy change dance to sqUnixVMProfile.c (cf sqUnixHeartbeat.c). Fix a bail_out typo. Change the VMProfileLinuxSupportPlugin to follow symlinks, answering pairs of module name, dereferenced symlink or nil. Fix 3 (!!) bugs in primitiveDLSymInLibrary. ------------------------------- Jun 28 2014 CogVM source as per VMMaker.oscog-eem.296/r2732. Make the Production/Debug/Assert distinction static so one can find it with strings - vm. Add an ident for the ITIMER_HEARTBEAT linux VMs. Fix bug in eden filling/object overwrite checking. Must use unsigned vars in the fill loop. Add a plugin to support the VMProfiler on linux (now Steve Rees has told us how to get proper thread priorities on newer linuxes). Replace broken primitiveUtcWithOffset with a version that works. Cast primitiveUtcWithOffset in terms of a new ioLocalSecondsOffset. Add stricter checking code to OSProcessPlugins. Add SqueakSSL plugin to nsvm plugins. Change -version output to print if a Production, Debug or Assert VM. Add some error checking to the UnixOSProcessPlugin's forkAndExecInDirectory prim. Eliminate some compiler warnings in the plugin. Eliminate some excessive use of push/popRemappableOop[:] in the SocketPlugin. Fix numeric option parsing in sqSocketSetOptions... (quite possibly the longest C function name I've ever seen).
The old code only supported 1-digit length values, so setsockopt(SO_SNDBUF, 4096) didn't do what was expected at all (attempted to set the buffer size to 909717556, which is little-endian ascii for '4096', '4096' asByteArray unsignedLongAt: 1 => 909717556). Occasionally bizarre interactions cause the heartbeat's interval timer to disable. e.g. on CentOS linux when using PAM to authenticate, a failing authentication sequence disables the interval timer, for reasons unknown (setting a breakpoint in setitimer doesn't show an actual call). So a work around is to check the timer as a side-effect of ioRelinquishProcessorForMicroseconds. Add Callback LIFO ordering support to the new-style Callback return primitives to the IA32ABI Alien plugins. Fix some compiler warnings there-in. Improve the debug message printing in sqUnixExternalPrims.c so it prints line numbers. ------------------------------- Jun 28 2014 CogVM binaries as per VMMaker.oscog-eem.282/r2714 Change application name from Croquet to Squeak and change to green Cog Squeak icons. Add accurate version info to the Windows exes (include SCCSID in properties). Add the SSLPlugin on mac and linux (won't link on old mingw; need to update). Cogit: Fix *HORRIBLE* yet ancient bug with the CogObjectRep. Both CogObjectRepresentationForSqueakV3>>couldBeObject: & CogObjectRepresentationForSqueakV3>>shouldAnnotateObjectReference: used signed comparisons for oops and so once the heap size pushes oops into the upper half of the address space constant oops in machine code were no longer being updated by the GC. StackInterpreter: rewrite the logic for printing methods so that printing the frame of a bad receiver won't seg fault. Add primitivePathToUsing which mimics the PointerFinder and can hence be used to debug or verify it. Remove unnecessary forceInterruptCheck in NewObjectMemory>>become:with:twoWay:copyHash:. (heartbeat does this for us). Fix bug in assert in NewCoObjectMemory>>restoreHeaderOf:to:.
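The signed-comparison bug described above for couldBeObject: and shouldAnnotateObjectReference: can be illustrated with a minimal C sketch. The type oop_t, both function names and the heap bounds used with them are hypothetical, chosen only for illustration; this is not the Cogit's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 32-bit oop type; the real VM's definitions differ. */
typedef uint32_t oop_t;

/* Buggy form: viewing oops as signed makes any address at or above
   0x80000000 look negative, so it compares below a heapBase that lies
   in the lower half of the address space.  (The unsigned-to-signed
   casts rely on the usual two's-complement wraparound.) */
static bool could_be_object_signed(oop_t oop, oop_t heapBase, oop_t heapLimit)
{
    return (int32_t)oop >= (int32_t)heapBase
        && (int32_t)oop <  (int32_t)heapLimit;
}

/* Fixed form: compare addresses as unsigned quantities. */
static bool could_be_object_unsigned(oop_t oop, oop_t heapBase, oop_t heapLimit)
{
    assert(heapBase <= heapLimit);
    return oop >= heapBase && oop < heapLimit;
}
```

With a hypothetical heap spanning 0x70000000 to 0x90000000, the oop 0x80000010 is accepted by the unsigned test but rejected by the signed one, which is the kind of miss that would leave a constant oop in machine code unupdated.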
Slang: Fix translation of to:by:do: loops so that the limit is not re-evaluated on each iteration if it may have side-effects. Include the correct AioPlugin (UnixAioPlugin) Include SqueakSSLPlugin in the configurations. Cogit: Stop reporting EncounteredUnknownBytecode with an error message. Fix slip in ceSICMiss: that didn't link new PIC if an MNU case. Add pixel peeker prims to BitBltPlugin. Minor signature changes to BitBltPlugin & HostWindowPlugin from SmartSyntaxPluginGenerator bug fix. ------------------------------- Jun 28 2014 CogVM binaries as per VMMaker.oscog-eem.275/r2705. Remember to flush PushImplicit/SendAbsentImplicit caches on global cache flush and flush cache by method. Fix comment for flush-cache-by-method workhorse. Fix PC-mapping for NewspeakV4. Absent receiver sends must not be mapped twice, once for IsNSSend and once for IsSend. So introduce class vars that state whether instruction set uses PushImplicitReceiver (NSSendIsPCAnnotated is true) or SendAbsentImplicit (NSSendIsPCAnnotated is false). ------------------------------- Jun 28 2014 CogVM binaries as per VMMaker.oscog-eem.272/r2701. Fix unknownBytecode processing to leave pc at unknown bytecode. Fix case of process switch to an interior frame. Fix some assert function signatures in the stack vm. Use symbols for types instead of strings in stack page funcs. Fix the become issue where methods that are identical are failing the code test because their penultimate literals are different objects. Add a flag "cmUsesPenultimateLit" to jitted methods, stealing bits from stackCheckOffset (which was way larger than needed). Shrink stackCheckOffset to 12 bits (still an order of magnitude larger than needed) and add an error check on assigning it. Also add a check for max method size (2^16-1 bytes) and refuse to jit a method that generates too much code. When comparing code, use the cmUsesPenultimateLit flag to decide if comparison includes penultimate lit or not.
This is mildly insane, but the VM really doesn't know about the penultimate literal and it shouldn't depend on knowing it can be ignored. Note that the CoInterpreter knows about the last literal; it uses this in supersends. With this hack, Pharo's condenseSources works. ------------------------------- Jun 28 2014 CogVM binaries as per VMMaker.oscog-eem.270/r2697 Fix the become issue where methods that are identical are failing the code test because their penultimate literals are different objects. Add a flag "cmUsesPenultimateLit" to jitted methods, stealing bits from stackCheckOffset (which was way larger than needed). Shrink stackCheckOffset to 12 bits (still an order of magnitude larger than needed) and add an error check on assigning it. Also add a check for max method size (2^16-1 bytes) and refuse to jit a method that generates too much code. When comparing code, use the cmUsesPenultimateLit flag to decide if comparison includes penultimate lit or not. This is mildly insane, but the VM really doesn't know about the penultimate literal and it shouldn't depend on knowing it can be ignored. Note that the CoInterpreter knows about the last literal; it uses this in supersends. With this hack, Pharo's condenseSources works. Fix bug in primitiveClone/cloneContext: that causes the copy to be a word short. Use isPointerNonInt: and isContextNonInt: in a few places. Implement unknownBytecode. Send unknownBytecode to the activeContext on unknown bytecode if the selector is in the specialObjectsArray. Fix page size bug in platforms/Cross/vm/sqHeapMap.c. ------------------------------- Jun 28 2014 CogVM binaries as per VMMaker.oscog-eem.266/r2693. Support one-way become on cogged methods that have the same code, for e.g. Pharo's setSourcePosition:inFile:. Add error checks for two-way becoming cogged methods, becoming married contexts, and for freeing any of these during become.
Refactor freeObject: and restoreHeaderOf: to allow subclasses to add their error checks efficiently (i.e. avoiding fetching baseHeader more than once). Make assert in rawHeaderOf:put: accept forwarding. Tiny speed-up in using byteLengthOf: instead of byteSizeOf: in cogit. Add asLong to CCodeGenerator and there-by eliminate printf warnings in reportMinimumUnusedHeadroom. Eliminate warning in instVar:ofContext:put:. Fix bug in assigning pc which can cause stackPage to not be most recently used. Fix short frame printing (eliminate an extra newline and use hex for receiver). Provide a -rh shorthand for -reportheadroom. ------------------------------- Jun 28 2014 CogVM binaries as per VMMaker.oscog-eem.264/r2678. Fix snapshot primitive failure in the StackVM and Cogit. The primitive should fail, not merely return the receiver. Also if in Cogit, need to back-up instruction pointer on failure. Make reportMinimumUnusedHeadroom more informative (also print available headroom). ------------------------------- Jun 28 2014 CogVM binaries as per VMMaker.oscog-eem.261/r2677. Move determination of the amount of headroom to the platform in osCogStackPageHeadroom (in the various sqFooMain.c files). Hence 2k stack pages on Mac and Win32 with 4k pages on linux. Provide a routine to monitor the amount of unused headroom, which requires the stack memory be zeroed before use. Assume the platform will provide a -reportheadroom flag for enabling the report. Provide primitiveMinimumUnusedHeadroom for in-image access. Add some asserts to check that a page's frame pointer is always in range (setHeadFP:andSP:inPage: already did this). ------------------------------- Jun 28 2014 CogVM binaries as per VMMaker.oscog-eem.258 Fix becomeForward: when the rootTable overflows. There were two bugs here. One is that initializeMemoryFirstFree: used to clear the needGCFlag so if the rootTable overflowed noteAsRoot:headerLoc:'s setting of the needGCFlag would be undone after the sweep.
The other is that rootTable overflow was indicated by rootTableCount >= RootTableSize which could be undone by becomeForward: freeing roots which need to be removed from the rootTable. At some point in becomeForward the rootTable would fill but at a later point a root would be freed, causing the table to become not full. The fix is twofold. 1. Add an explicit rootTableOverflowed flag instead of relying on rootTableCount >= RootTableSize. 2. Move the clearing of the needGCFlag to the GC routines. Remove unnecessary senders of needGCFlag: false, and remove the accessor. As a side effect rewrite primitiveRootTable in terms of a new ObjectMemory>>rootTableObject. Remove the rootTable: accessor. Implement checkAllAccessibleObjectsOkay & checkOkayInterpreterObjects: (used to debug the above). Fix NewObjectMemory initialization to set freeStart at the same time as setting endOfMemory. This allows load-time scans and assert code to use freeStart instead of endOfMemory. Simplify markAndTraceStackPage: ; since the two implementations are distinct they don't need to contain the isCog if-then-else. Implement NewObjectMemory>>shorten:toIndexableSize: so that the last object is correctly shortened (cut back freeStart). Refactor the allocation check filling code into maybeFillWithAllocationCheckFillerFrom:to:. Make longPrintOop: print the class oop. Fix bug in printCallStackOf:currentFP: for widowed contexts. Use fputs for print: instead of printf. Rewrite the stackLimit computation after a moment of clarity. Allow the system to reduce the space for frames by up to 1/8th. Make sure there's at least as much headroom as asked for. This changes the stack page size from 4096 to 2048 and much reduces the interpreterAllocationReserveBytes. Don't round up interpreterAllocationReserveBytes to a power of two. Integrate the named serial primitives plus Luc Fabresse's latest fix. Improve cygwin HowToBuilds with info on latest versions (thanks Ron).
Map SO_REUSEPORT to kIP_REUSEPORT instead of kIP_REUSEADDR in sqMacNetwork.c. This doesn't fix the Mac Socket test failures because the Mac build uses the unix code. Sigh. ------------------------------- Jun 28 2014 CogVM binaries as per VMMaker.oscog-eem.255/r2672 [New[Co]]ObjectMemory: Fix becomeForward: so that objects whose references are deleted are freed and can no longer be resurrected via allObjects or allInstances. Remove freed young roots from the rootsTable. Filter freed objects pointed to from the extraRootsTable (because these locations can change it is wrong to remove entries from the extraRootsTable). Make primitiveIdentityHash pop all arguments, for Newspeak VMMirrors. StackToRegisterMappingCogit: Fix marshalling of absent receiver sends. The items beneath the arguments (and to-be-pushed receiver) must be spilled before the receiver is pushed. Improve code quality for numArgs > numRegArgs sends when receiver is not a spill and there are no uses of ReceiverResultReg amongst args. e.g. now avoids loading ReceiverResultReg from stack in code such as 1 with: 2 with: 3. ------------------------------- Jun 28 2014 CogVM binaries as per VMMaker.oscog-eem.253/r2669. Implement absent receiver dynamic super send in the Cogit. ------------------------------- Jun 28 2014 CogVM binaries as per VMMaker.oscog-eem.252/r2664. Issue 117. Fix primitiveRemLargeIntegers. The result should be negated iff receiver negative. ------------------------------- Jun 28 2014 CogVM binaries as per VMMaker.oscog-eem.251/r2662 Give primitiveRemLargeIntegers primitive # 20. Add yet another libc line to the linux launch script(s), and try and make the script suggest users extend it themselves. you can lead a horse to water... Fix (old) bug in ssAllocateRequiredRegMask:upThrough: that would flush entire stack if allocating any register. Implement absent receiver sends in the Newspeak Cogit. On Mac turn off inlining when compiling the Cogit.
The Cogit's runtime is negligible and we prefer to save space. With the recent changes (better shift code??) the VM appears faster anyway. Integrate directed shift changes from cog issue 111 that affect the CoInterpreter and Cogit. Make the translated primitive plugins include info from their primitive supplying classes. Integrate changes from VMMaker-dtl.293 which use >> & << shifts in place of slower bitShift: code in plugins. Add width failure cases to BMPReadWriterPlugin read & write 24Bmp prims. Integrate issue 115. Fix FFIPlugin/ThreadedFFIPlugin unsignedShortAt:. ------------------------------- Jun 28 2014 CogVM binaries as per VMMaker.oscog-eem.240/r2640 Restore ThreadedFFIPlugin wanting COGMTVM to be determined on command line. Probably broke in VMMaker.oscog-eem.218 Back out of the wrong-headed attempt to give compact class indices to long header objects in changeClassOf:to:, and comment why (markAndTrace: reuses header type bits and depends on compact class and size fields to reconstruct type bits after traverse). Fix off-by-one error in okayOop:. Make longPrintOop: print header type info. Make allAccessibleObjectsOk answer a result. Don't inline loadInitialContext for gdb breakpointing convenience. ------------------------------- Jun 28 2014 CogVM binaries as per VMMaker.oscog-eem.238/r2637 Restore ThreadedFFIPlugin wanting COGMTVM to be determined on command line. Probably broke in VMMaker.oscog-eem.218 Fix bug in changeClass:from: so that if receiver has long header and class is compact, receiver still gets compact class field set, not cleared. No matter what header an instance has, if its class is compact it should have the compact class index set. ------------------------------- Jun 28 2014 Cog VM binaries as per VMMaker.oscog-eem.234/r2636/r2638. Use the -z now link flag on the linux builds. This causes the dynamic linker to resolve unresolved symbols on load instead of lazily. 
This affects reliability in signal handlers, because if the dynamic linker can run at any time it can therefore run in a signal handler and cause a deep call chain which could corrupt a stack page in the JIT. So this applies three fixes to this issue: a) correct the stack headroom determination b) use sigaltstack for signal handlers in the UnixOSProcessPlugin c) link using -z now on linux. Use SA_ONSTACK/sigaltstack for signal handlers installed by the UnixOSProcessPlugin to avoid running signal handlers on the JIT's stack. (r2638: Check for needing sigaltstack properly in setSignalNumber:handler:). ==== VMMaker.oscog-eem.233/r2632 Rename misnamed internalIsMutable: and internalIsImmutable: to isOopMutable: and isOopImmutable:. Affects sqVirtualMachine.c, but only part of api used in Newspeak VMs. Merge LargeInteger primitive fixes from VMMaker-dtl.286 and tests from VMMaker-dtl.289. UnixOSProcessPlugin: Merge with VMConstruction-Plugins-OSProcessPlugin-dtl.35. In particular restore missing code to forwardSignal:toSemaphoreAt: Get plugin to use SA_ONSTACK/sigaltstack for signal handlers if loaded in the JIT. ==== VMMaker.oscog-eem.230/r2631 Fix stackPage headroom calculation in CoInterpreter. Was getting things backward. This increases the stack page headroom by 62% from 1564 bytes to 2532 bytes. Shootout benchmarks unchanged, so reduction in frames per page is not an issue for typical code. This should result in fewer crashes on linux where the dynamic linker, if it kicked in during a signal handler, could cause a deep call chain at interrupt time and trample the start of the adjoining stack page. Merge LargeInteger primitive fixes from VMMaker-dtl.286. Change generation of plugin code so that internal plugins call VM routines directly and external plugins call through their own local copies of the function pointers in InterpreterProxy. External plugins copy the InterpreterProxy functions to their local copies in setInterpreter:. 
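The external-plugin scheme just described, in which a plugin copies the InterpreterProxy functions into local function pointers in setInterpreter:, can be sketched in C roughly as follows. This is only an illustration under stated assumptions: MiniInterpreterProxy, the toy_ functions, toyProxy and plugin_value_or_zero are hypothetical stand-ins, and the signature of setInterpreter here is simplified rather than taken from the generated plugin code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-in for the VM's InterpreterProxy. */
typedef struct {
    int  (*isIntegerObject)(long oop);
    long (*integerValueOf)(long oop);
} MiniInterpreterProxy;

/* Toy "VM side" implementations, assuming a 1-bit SmallInteger tag. */
static int  toy_isIntegerObject(long oop) { return (oop & 1) != 0; }
static long toy_integerValueOf(long oop)  { return oop >> 1; }

static const MiniInterpreterProxy toyProxy = {
    toy_isIntegerObject,
    toy_integerValueOf
};

/* Plugin-local copies of the proxy's function pointers. */
static int  (*local_isIntegerObject)(long oop);
static long (*local_integerValueOf)(long oop);

/* Copy the proxy's functions once, when the plugin is loaded. */
int setInterpreter(const MiniInterpreterProxy *proxy)
{
    assert(proxy != NULL);
    local_isIntegerObject = proxy->isIntegerObject;
    local_integerValueOf  = proxy->integerValueOf;
    return 1;
}

/* Plugin code then calls through its local copies, not the proxy. */
long plugin_value_or_zero(long oop)
{
    return local_isIntegerObject(oop) ? local_integerValueOf(oop) : 0;
}
```

After setInterpreter(&toyProxy), calls such as plugin_value_or_zero(85) go through the local pointers only, so the plugin never dereferences the proxy struct again on each call.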
Change implementations of stObject:at:put: to return their value, to match the declaration in InterpreterProxy. Streamline ObjectMemory>>instantiateClass:indexableSize: (hdrSize and header3 change together). Optimize the debug VM by making startOfMemory a macro that answers heapBase instead of a method. Improve stack page printing, and make stack trace printing more robust (findClass/SelectorOfMethod:forReceiver:). Make temporary:in:put: et al answer their values, for stObject:at:put:. Make jumpTable size err message more explanatory. Rename misnamed internalIsMutable: and internalIsImmutable: to isOopMutable: and isOopImmutable:. Affects sqVirtualMachine.c, but only part of api used in Newspeak VMs. ------------------------------- Jun 28 2014 CogVM binaries as per VMMaker-oscog.eem.227/r2628. Fixes for cog issue 109, base frames and CallPrimitive, and merge with VMMaker.oscog-lw.224. Cogit: StackToRegisterMappingCogit, cog issue 109. Fix pc mapping for popped folded constants as in, e.g. 1-1. Need to check for annotateUse on popping a stack descriptor. CoInterpreter: Fix makeBaseFrame: for methods with CallPrimitive that get restarted. e.g. on:do:. We could make the CallPrimitive bytecode check for being at the start of a method, but I have decided, for strictness, to make executing CallPrimitive an error for the moment. This means that later we could use it to embed primitive calls in the middle of methods. Streamline once again activation sequence to make setting method (actually bytecodeSetSelector) faster and on primitive failure to increment pc past CallPrimitive before checking for err code store. Make ensureMethodIsCogged: answer the cogged method, again for efficiency. Add tracing of stack overflows. Fix bug in printing of bytecode addresses in long/printOop: on CompiledMethods. Fix bug in printStringOf: so that it prints ... when truncating. Add missing case to isNullExternalPrimitiveCall:. Streamline long/printReferencesTo:. 
Slang: Improve formatting of code for cppIf:ifTrue:. Complete multiple bytecode set support plus NewsqueakV4 bytecode set. Affects only the Newspeak VMs. Make instantiation primitives pop arguments, not assume arg count, for Newspeak. Make them answer error codes, and streamline, avoiding using self success. Integrate primitiveUtcWithOffset. Rename traceLinkedSends to more general traceFlags. Add better help for these. Plugins: Regenerate most plugins, to eliminate some warnings and to use isIntegerObject rather than explicit bit test. ThreadedFFIPlugin: Fix bug with not attempting to run GC enough times for COGMTVM to freeze arguments. Fix bug in ffiCall:ArgArrayOrNil:NumArgs: not checking for an error case. ------------------------------- Jun 28 2014 Cog VM binaries as per VMMaker.oscog-eem.201/r2585. Make sure youngReferrersList has room for every method since become/cache implicit receiver can cause any method to gain a young reference. Do so by counting methods in the zone. Make overflowing the youngReferrers list a hard error (appears to happen quite often in Newspeak code). Fix assert in interpretMethodFromMachineCode. Fix bug in changeClassFrom:to: if receiver is a compact class instance with a large header. ------------------------------- Jun 28 2014 Unix Cog VMs as per VMMaker.oscog-eem.163/r2562 Fix the unix builds for 64-bit file support (UnixOSProcessPlugin got broken). Hence make the unix build scripts nuke everything by default. ------------------------------- Jun 28 2014 CogVM binaries as per VMMaker.oscog-eem.163/r2559. Make wakeHighestPriority filter-out zombie processes; fixes Newspeak/Glue crash. Add -blockonerror flag to Unix & Mac VMs to allow attaching gdb on error/segv. Make the sigsegv handler catch SIGILL and SIGBUS on Unix and Mac. Add 64-bit file support to linux builds. Fix sqUnixX11.c ClipChildren to ClipByChildren ------------------------------- Jun 28 2014 CogVM binaries as per VMMaker.oscog-eem.161/r2556. 
Integrate ipv6 socket primitives. Add UnixOSProcessPlugin primitiveDup, and make primitiveChdir return error and success results the right way round. Add the BochsIA32Plugin to linux and win32 so that the Cog VM simulator runs on all platforms (see image/CogTrunk43.image). ------------------------------- Jun 28 2014 CogVM binaries as per VMMaker.oscog-eem.157/r2550. Stack/CoInterpreter/Cogit: Implement proper bounds checking for byte access to compiled methods. Raise errors for accesses outside initialPC to size. CoInterpreter: Provide a thorough flush primitive for CompiledMethods that discards all machine code and makes sure that any contexts using the method have bytecode pcs. Primitive #215 (same as 116 in the Stack VM). This is much slower than 116 (flushCache) since it has to enumerate over all heap contexts. Provide an xray primitive for CompiledMethod that answers if a method has machine code, and if so if its machine code is frameless, and/or refers to a young object. No primitive number. Used to test the above. Make printOopShort: print Association keys. Useful for longPrintOop:, and hence printReferencesTo: etc. Mac OS: add fflush to debug printing in sqMacUIEventsUniversal.c so output appears promptly. Fix the annoying bogus error messages from the mprotect calls by getting the length arg to mprotect right. Add version info for the Cross/plugins tree. Add a -version switch to win32. ------------------------------- Jun 28 2014 Cog VM binaries as per VMMaker.oscog-eem.154/r2540. Fix bad conceptual bug with become on methods. Unlike full and incremental GC, the reference from a Cog method to its method object must not be remapped since they're two halves of the same object. Fix FileStreamTest>testPositionPastEndIsAtEnd on unix & Mac OS. Merge VMConstruction-Plugins-OSProcessPlugin-dtl.33's changes. Regenerate the apparently truncated nscogsrc ZipPlugin.c. Handle more libc variations for LD_LIBRARY_PATH in the unix launch scripts. 
------------------------------- Jun 28 2014 Cog VM binaries as per VMMaker.oscog-eem.152/r2538. Fix tricky context state bug that can cause crashes in the GC. ContextPart>runUntilErrorOrReturnFrom: may pop the last argument. Hence if a frame is married in this state its spouse context won't have its last argument slot initialized. If later the frame is divorced and the arguments are not updated the last argument slot can be left with this uninitialized slot and ... bang. The solution is to update arguments as well as stack contents when divorcing. This is good also because it removes the VM's assumption that method arguments are read-only, and that's only enforced by the Smalltalk bytecode compiler, not by the bytecode. ------------------------------- Jun 28 2014 CogVM binaries as per VMMaker.oscog-eem.150/r2537 Disable filling of the weakRoots table during fullGC. Report remappable oop and weak root table overflows as errors. Make primitiveObjectAtPut fail if used to store other than a SmallInteger into the method header. Better error message for inability to thread. Load function pointer early in ThreadedFFIPlugin's invocation checking sequence. This must come early for compatibility with the old FFIPlugin. Image-level code may assume the function pointer is loaded eagerly. Add 1007,1008 & 1009 attribute info to version info on linux, & 1009 on Windows. Add -version option to Mac VM (with similar info to linux). Include vm-display-fbdev on linux. Exclude fbdev in the newspeak build. Simplify ticker for older linuxes. ------------------------------- Jun 28 2014 CogVM binaries as per VMMaker.oscog-eem.140/r2522 Avoid localizing backward jump count variables to interpret so that optimized and unoptimized VMs behave the same. On unix and Mac OS print the error message at the end of the stack dumps as well as before, so one can see error messages more readily. Fix regression with new command line parsing on Windows. 
If no image supplied and a GUI application open the image file open dialog. If a console app, print usage and quit. ------------------------------- Jun 28 2014 CogVM binaries as per VMMaker.oscog-eem.139/r2519. Change the VM argument access back to be consistent with linux: 0 => executable name -1 .. -n => VM arguments *including* image (if explicitly supplied). Provide standard i/o console access on Windows. ------------------------------- Jun 28 2014 CogVM binaries as per VMMaker.oscog-eem.139/r2518. Add access to VM arguments to Mac. Make Windows, Mac and Unix consistent: Smalltalk getSystemAttribute: -1 => executable name -2 .. -n => VM arguments *including* image (if explicitly supplied). Replace win32 command line parsing with unixesque code using CommandLineToArgvW. linux launch script: Cope with old linuxes use of /lib/tls hack for an optional thread-local-storage version of libc/libpthread. The launch scripts add /lib/tls:/usr/lib/tls: to LD_LIBRARY_PATH instead of /lib:/usr/lib: if the VM is linked to /lib/tls/libc. ------------------------------- Jun 28 2014 CogVM binaries as per VMMaker.oscog-eem.139/r2515. Fix for frameless "foo: arg instVar := instVar" code gen bug, do not flush the top of stack when pop/storing receiver and/or temp vars. i.e. ssFlushUpThroughReceiverVariable: & ssFlushUpThroughTemporaryVariable: skip the entry at simStackPtr. ------------------------------- Jun 28 2014 CogVM source as per VMMaker.oscog-eem.137/r2508. Fix primitiveContextAt[Put] for non-contexts (ContextPart subclasses other than MethodContext including BlockContext). stObjectAt[Put] could fail so args should be popped only on success. Fixes failures in ClosureCompilerTest>testSourceRangeAccessForBlueBookInjectInto as of VMMaker.oscog-eem.118. Add convenient shortPrintFrame:AndNCallers: for debugging. Make the unix launch script include /lib & /usr/lib in LD_LIBRARY_PATH if LD_LIBRARY_PATH is unset. 
Modify the invocations of ex in the editing scripts to not read ~/.exrc, and hence not be confused by e.g. set ignorecase. Fix error codes for at: and at:put: primitive so that for non-indexable objects they fail with #'bad receiver', not #'bad index'. Filter-out attempts to create MNU pics with new selectors. Fixes Stephane Rollandin's crash of 21/10/2011: [((Compiler new evaluate: ('Beuh' ifNuk: [yo])) on: Error do: [:ex | ex description]) printString] fork ------------------------------- Jun 28 2014 CogVM binaries as per VMMaker.oscog-eem.134/r2502 Ensure the unix run scripts set SQUEAK_PLUGINS if unset. Fix remaining bug in context access fixes of VMMaker.oscog-eem.119. stObject:at: and stObject:at:put: need to use stackPointerForMaybeMarriedContext: not fetchStackPointerOf:, since the context's stack pointer may be stale. Fix send trace printing. Interpreter sends need also to be printed. (A better fix is probably to redo sendBreakpoint: but this will serve for now). N.B. this reassigns the sendtrace flag values. Add primitiveNotEquivalent with prim # 169. ------------------------------- Jun 28 2014 CogVM binaries as per VMMaker.oscog-eem.133/r2499. Tiny performance tweak to bytecodePrimMultiply. Added findStringBeginningWith: debugging facility. Fix yet another slip in Cogit>>lookup:for:methodAndErrorSelectorInto: for cannotInterpret: cases. Fixes Mariano's crash as of 2011/10/03. Add -numextsems command-line argument to set size of external semaphore table on start-up. ------------------------------- Jun 28 2014 CogVM binaries as per VMMaker.oscog-eem.128/r2496 Fix regression in object-as-method/cannot-interpret for single and polymorphic inline cache misses (lookup:for:methodAndErrorSelectorInto:). Fix formatting bugette in context printing. ------------------------------- Jun 28 2014 CogVM binaries as per VMMaker.oscog-eem.127/r2495. Fix MNU PIC creation not to answer an error code to the interpreter. This is the call 0x00000013 bug. 
------------------------------- Jun 28 2014 CogVM source as per VMMaker.oscog-eem.126. Fix cPICEndSize mis-computation caused by using rounded-up closedPICSize. Compute cPICEndSize and /then/ round-up closedPICSize. ------------------------------- Jun 28 2014 CogVM binaries as per VMMaker.oscog-eem.125/r2493. Add callsite link/relocate checks to catch the call 0x00000013 MNU callsite relinking bug. Reduce the size of the simStack to something proportional to LargeContextSize. In the Newspeak VM, don't cd to the image's directory on win32. Fix off-by-one error in Win32OSProcessPlugin>primitiveGetCurrentWorkingDirectory so it doesn't include an erroneous trailing null. Fix the 1Gb allocation bug. ------------------------------- Jun 28 2014 CogVM source as per VMMaker.oscog-eem.121/r2489. Clean up compilation warnings. Add a hard check on Cogit overflowing the number of allocated abstract instructions. CoInterpreter/StackInterpreter: Fix object accessing prims (at:, at:put: & size 60, 61 & 62) for contexts, because primitives 60-62 are used for the mirror primitives in ContextPart (object:at: et al). Fix now obsolete, but still used primitiveContextAt et al (primitives 210, 211 & 212) to be varargs, since these might also be used from mirror primitives. Pull the temporary:in:[put:] code into a non-inlined wrapper to avoid bloating the common case. Cogit/CogObjectRepresentationForSqueakV3: Fix genInnerPrimitiveAt: & genInnerPrimitiveSize to fail for context receivers. Fix genInnerPrimitiveAt:, genInnerPrimitiveStringAt: & genInnerPrimitiveSize to call the interpreter primitive on failure, to get the error code - not yet avoiding the call if the method doesn't use the error code; one thing at a time. ------------------------------- Jun 28 2014 CogVM source as per VMMaker.oscog-eem.117. Fix bugs described in the "[Pharo-project] Troubles with #flushCache and #run:with:in:" thread http://lists.gforge.inria.fr/pipermail/pharo-project/2011-July/050858.html. 
The PIC machinery wrongly treated invoke-as-method sends as MNUs. Closely related, finally fully implement PIC MNU caching where, by calling a special abort, a PIC is able to record that a given selector is an MNU for a particular class. Speeds up a simple MNU benchmark by 33% (with more performance the deeper the receiver's class hierarchy is). On Mac fix mis-editing of Info.plist to insert revision info so that VM .app once again starts on 10.5.x. Update Pharo icon to the nice Cog one. ------------------------------- Jun 28 2014 VMMaker.oscog-eem.115/r2486 Include the rule 41 code in the BitBltPlugin. Fix regression in linux mt build mvm script. ------------------------------- Jun 28 2014 CogVM source as per VMMaker.oscog-eem.114. CoInterpreter: fix bad bug with primitiveClone of a compiled method. The clone should be an uncogged method irrespective of the state of the receiver. Symptoms include rare crashes in become: during MethodDictionaryTest which copies a test method. Cogit: Fix potential bug in become: with cogged methods. Since become can cause an object to gain a new reference cogged methods that gain new references must be added to youngReferrers. ------------------------------- Jul 30 2011 Initial VM release artifacts for the summer 2011 Newspeak release. Windows installer: nsvm-11.30.2476.msi (11 = 2011, 30 = week 30) Mac OS installer: Newsapeak Virtual Machine.dmg Linux tar: nsvmlinux.tgz VMs as per VMMaker.oscog-eem.113/r2476 Refuse to JIT closure no-context-switch value until the relevant offsets have been computed. ------------------------------- Jul 26 2011 CogVM binaries as per VMMaker.oscog-eem.112. Newspeak: Add missing incremental GC code for implicit receiver cache. Fix markLiteralsAndUnlinkIfUnmarkedSendOrPushImplicit:pc:method: for empty cache. Fix Mac VM to use correct Newspeak document icons for source files et al. All: Alien plugin now marshalls Floats as doubles directly. N.B. 
For Squeak/Pharo/Croquet please use the archives whose name begins with Cog. The archives whose names begin with nsvm or Newspeak are for Newspeak and are missing plugins required by Squeak/Pharo/Croquet. ------------------------------- Jul 20 2011 CogVM source as per VMMaker.oscog-eem.107. Fix UUID plugin for linux Fix conversion of max neg int in signed32BitValueFor: (fixes Alien plugin). N.B. For Squeak/Pharo/Croquet please use the archives whose name begins with Cog. The archives whose names begin with nsvm or Newspeak are for Newspeak and are missing plugins required by Squeak/Pharo/Croquet. ------------------------------- Jun 28 2014 CogVM source as per VMMaker.oscog-eem.104. Use StackToRegisterMappingCogit for Newspeak VMs. Pull inline cache check for implicit receiver into trampoline. Fix ceImplicitReceiverFor:receiver:class: cache write to add method to youngReferrers if either class or mixin are young. Fix Newspeak main window opening on Win32. ------------------------------- Jun 28 2014 CogVM source as per VMMaker.oscog-eem.104. Use StackToRegisterMappingCogit for Newspeak VMs. Pull inline cache check for implicit receiver into trampoline. Fix ceImplicitReceiverFor:receiver:class: cache write to add method to youngReferrers if either class or mixin are young. ------------------------------- Jun 28 2014 Cog Squeak and Newspeak VMs for VMMaker.oscog-eem.101. Null-pointer checks in Alien accessing primitives. The Newspeak VMs are also Cog JITs. The Newspeak JIT now avoids duplicating accessor methods. ------------------------------- Jun 28 2014 Cog Squeak and Newspeak VMs for VMMaker.oscog-eem.98. Functional Alien calls & callbacks. The Newspeak VMs are also Cog JITs. ------------------------------- Jun 19 2011 Fix linux Alien call-out crashes. Linux x86 must use standard 16-byte alignment. ------------------------------- Jun 17 2011 OSCogVMs as per VMMaker.oscog-eem.78/r2434. Include a linux Newspeak VM (invoke via nwvmlinux/squeak my.image). 
No functional changes w.r.t. earlier versions, but these VMs have been built from more organized source (you don't want to know). ------------------------------- Jun 15 2011 Newspeak VMs as per VMMaker.oscog-eem.76/r2428. Fix primitiveChangeClass and primitiveAdoptInstance in the NewspeakVM to allow changing the class of the receiver to a compact class since one can always set the class field to a class even if that class is a compact class; one just ends up with a non-compact instance of a compact class. Of course, the other way around can't work without a become:. ------------------------------- Jun 10 2011 OSCogVM Mac binaries as per VMMaker.oscog-eem.75/r2424. In the Carbon Mac OS platform, arrange for the Smalltalk window to move to the main display when on a secondary monitor that is removed. Yet to arrange a screen update after the move. Help in this much appreciated. ------------------------------- Jun 7 2011 OSCog binaries as per VMMaker.oscog-eem.75. Fix primitiveExecuteMethodArgsArray for num args > 2. ------------------------------- Jun 7 2011 Newspeak VMs as per VMMaker.oscog-eem.73. Newspeak's VMMirror requires a 4 argument primitiveExecuteMethodArgsArray. ------------------------------- Jun 6 2011 OSCogVMs as per VMMaker.oscog-eem.72. Fixed Montgomery code in LargeIntegers plugin. Fixed Newspeak callbacks. ------------------------------- Jun 2 2011 OSCog binaries as per VMMaker.oscog-eem.70. Add NewspeakInterpreter (Newspeak Virtual Machine.app.tgz & nsvm.zip). N.B. There are known limitations to at least the Mac VM: The VM does not move its window to the primary monitor when on a secondary monitor that is closed. The VM will set the icons of changes and image files to the Squeak icon. Add InterpreterPrimitives>>signalNoResume: for callback serialization. LargeIntegers speedups (internal cleanups due to generating better C bit shift code - not better LargeIntegers bit shift code). 
Fix potential bug in Cogit's primitiveFlushCacheByMethod (must flush at cache). Interpreter speedups: faster access to primFailCode. Streamline booleanCheatTrue & booleanCheatFalse. Fix arg count bug in BalloonEnginePlugin>>primitiveAddPolygon. ------------------------------- Apr 26 2011 OSCogVM as per VMMaker.oscog-eem.56. - Implement shallowCopy (primitiveClone) and copyFrom: (primitiveCopyObject) correctly for contexts. - Change package name to correct Monticello branch convention. - Use trunk linux LocalePlugin source. Hopefully this fixes the LocalePlugin on linux. (snafu fix for primitiveClone w.r.t. VMMaker.oscog-eem.55). ------------------------------- Apr 26 2011 OSCogVM as per VMMaker.oscog-eem.55. - Implement shallowCopy (primitiveClone) and copyFrom: (primitiveCopyObject) correctly for contexts. - Change package name to correct Monticello branch convention. - Use trunk linux LocalePlugin source. Hopefully this fixes the LocalePlugin on linux. ------------------------------- Apr 1 2011 OSCogVM source as per VMMaker-oscog.54. - Fix FFI by fixing includesbehaviorThatOf so classes no longer inherit from nil - Adopt the squeakvm HostWindowPlugin and SoundPlugin. - Including support for weak finalizers. N.B. This implies the Windows plugin has a built-in SoundPlugin now. ------------------------------- Mar 18 2011 OSCogVM StackToRegisterMappingCogit binaries as per VMMaker-oscog.51/r2370 Implement cannotInterpret: hook in the JIT. Refactor JIT MNU handling so that lookupMethodNoMNUEtcInClass: can answer the splObj: index for the selector to use to lookup failure (doesNotUnderstand: or cannotInterpret:). StackToRegisterMappingCogit: Fix bug with storeRemoteTemp (e.g. StarMorph initialize in Cuis 3.x) which causes a stack off-by-one. Rename merge:afterReturn: to merge:afterContinuation: and revamp merge code to treat bytecode following an unconditional branch the same as after a return. Add spill tracing. 
Minor cleanups for recent blog posts: Refactor ceSend:super:to:numArgs: pulling abort support out to ceSendAbort:to:numArgs:. Rename activateInterpreterMethodFromMachineCode to interpretMethodFromMachineCode. Remove unused linkSends: flag. Move the genExternalizePointersForPrimitiveCall and genLoadCStackPointersForPrimCall to the front of compileInterpreterPrimitive so that a) reg args are pushed before any calls, and b) ceCheckProfileTick is run on the C stack. The first fixes the StackToRegisterBasedCogit's crash when profiling. The second is safer than running it on a Smalltalk stack page. Make printProcsOnList: & printProcessStack: part of the VM api for debug printing. Fix compileOpenPICPrototype openPICSize bug. Need to use a real selector to compute the right map size. ------------------------------- Mar 18 2011 OSCogVM StackToRegisterMappingCogit binaries as per VMMaker-oscog.47/r2361 Again revert the optimization level of the cointerpreter on linux to -O1. Move the genExternalizePointersForPrimitiveCall and genLoadCStackPointersForPrimCall to the front of compileInterpreterPrimitive so that a) reg args are pushed before any calls, and b) ceCheckProfileTick is run on the C stack. The first fixes the StackToRegisterBasedCogit's crash when profiling. The second is safer than running it on a Smalltalk stack page. Fix simulation of the profiling machinery (to debug the above). Fix mapping for backward branches which must of course map to themselves, otherwise the VM can break out of loops prematurely. Bring StackInterpreter generation up to date (less labelling, vmCallback). Zero instructions on recompiling block after numInitialNils mis-estimation. More methods for the in-image facade for compiling quick prims. Both Cogits: Fix pc mapping once and for all. Tests allow compiling all methods in current image and testing all mapped bcpcs and mcpcs map and map back correctly. 
Fix bad bug with jump fixups that caused some fixups to be missed (error in the index used to define the range of valid fixup addresses). StackToRegisterMappingCogit: Fix bug in addBlockStartAt:numArgs:numCopied:span: that caused one block entry to be omitted when some block was recompiled due to initialNil misestimation. As part of pc mapping fixing add an annotateUse flag to sim stack entries so that the eliminated send in a folded constant still gets a pc map entry. CoInterpreter: Fix bug in checkLogIntegrity for an empty log. Safety in activation printing (for backtraces etc). Simulator: Run quitBlock on closing simulator window. StackToRegisterMappingCogit: Fix bug in repeated block compilation for initial nil handling. Repeated attempts to insert the same block start are filtered out instead of repeating all block inserts (which doesn't work). Fix bad bug in frameless block entry. Don't use initSimStackForFramelessMethod:! Add a target fixup for a conditional jump even if it is a jump on true or false since the simStack must still be valid for merges. Nuke as-yet-unused merge state in CogSSBytecodeFixup. Add a subclass using image facilities to compute numInitialNils correctly to compare against repeated block compilation. CoInterpreter: comment an apparently unsent method to stop me from deleting it. Fix StackToRegisterMappingCogit being confused by initial pushNils in blocks that provide parameters rather than initialize temps by tracking stack depth and recompiling if the depth is wrong at the end of the block. Do this by refactoring compileMethodBody into compileEntireMethod et al. Fix atCache leakage for super sends in machine code by assigning to lkupClass (since commonAt:[put:] use lkupClass to filter-out super sends). Simulator fixes and tidyups. ------------------------------- Mar 18 2011 OSCogVM SimpleStackBasedCogit binaries as per VMMaker-oscog.47/rXXXX. 
Move the genExternalizePointersForPrimitiveCall and genLoadCStackPointersForPrimCall to the front of compileInterpreterPrimitive so that a) reg args are pushed before any calls, and b) ceCheckProfileTick is run on the C stack. The first fixes the StackToRegisterBasedCogit's crash when profiling. The second is safer than running it on a Smalltalk stack page. Fix simulation of the profiling machinery (to debug the above). Fix mapping for backward branches which must of course map to themselves, otherwise the VM can break out of loops prematurely. Bring StackInterpreter generation up to date (less labelling, vmCallback). More methods for the in-image facade for compiling quick prims. Both Cogits: Fix pc mapping once and for all. Tests allow compiling all methods in current image and testing all mapped bcpcs and mcpcs map and map back correctly. Fix bad bug with jump fixups that caused some fixups to be missed (error in the index used to define the range of valid fixup addresses). CoInterpreter: Fix bug in checkLogIntegrity for an empty log. Safety in activation printing (for backtraces etc). Simulator: Run quitBlock on closing simulator window. CoInterpreter: comment an apparently unsent method to stop me from deleting it. Fix StackToRegisterMappingCogit being confused by initial pushNils in blocks that provide parameters rather than initialize temps by tracking stack depth and recompiling if the depth is wrong at the end of the block. Do this by refactoring compileMethodBody into compileEntireMethod et al. Fix atCache leakage for super sends in machine code by assigning to lkupClass (since commonAt:[put:] use lkupClass to filter-out super sends). Simulator fixes and tidyups.