<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	>
<channel>
	<title>Comments on: Cog</title>
	<atom:link href="http://www.mirandabanda.org/cogblog/2008/06/06/cog/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.mirandabanda.org/cogblog/2008/06/06/cog/</link>
	<description>Speeding Up Croquet and Squeak with a new open-source VM from Qwaq</description>
	<pubDate>Wed, 07 Jan 2009 03:55:25 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.5.1</generator>
		<item>
		<title>By: Paolo Bonzini</title>
		<link>http://www.mirandabanda.org/cogblog/2008/06/06/cog/#comment-101</link>
		<dc:creator>Paolo Bonzini</dc:creator>
		<pubDate>Mon, 14 Jul 2008 09:07:31 +0000</pubDate>
		<guid isPermaLink="false">http://cogblog.mirandabanda.org/?p=3#comment-101</guid>
		<description>@richie: one of the nice things in a dynamically-typed language like Smalltalk, without any primitive types, is that a bytecode verifier becomes a very simple thing.  GNU Smalltalk has one; it's around 700 lines of C code (compared to 3200 lines of C++ for the Java verifier).

A verifier would catch the cases you proposed here.</description>
		<content:encoded><![CDATA[<p>@richie: one of the nice things in a dynamically-typed language like Smalltalk, without any primitive types, is that a bytecode verifier becomes a very simple thing.  GNU Smalltalk has one; it&#8217;s around 700 lines of C code (compared to 3200 lines of C++ for the Java verifier).</p>
<p>A verifier would catch the cases you proposed here.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Paolo Bonzini</title>
		<link>http://www.mirandabanda.org/cogblog/2008/06/06/cog/#comment-100</link>
		<dc:creator>Paolo Bonzini</dc:creator>
		<pubDate>Mon, 14 Jul 2008 09:00:05 +0000</pubDate>
		<guid isPermaLink="false">http://cogblog.mirandabanda.org/?p=3#comment-100</guid>
		<description>As you said, speculatively inlined methods generate larger basic blocks.  However, even simpler JITs (e.g. using the code generation techniques in Ian Piumarta's Ph.D. thesis) do generate large *superblocks*.  I wonder if LLVM has a pass like GCC's tracer, which produces large basic blocks from large superblocks...</description>
		<content:encoded><![CDATA[<p>As you said, speculatively inlined methods generate larger basic blocks.  However, even simpler JITs (e.g. using the code generation techniques in Ian Piumarta&#8217;s Ph.D. thesis) do generate large *superblocks*.  I wonder if LLVM has a pass like GCC&#8217;s tracer, which produces large basic blocks from large superblocks&#8230;</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: richie</title>
		<link>http://www.mirandabanda.org/cogblog/2008/06/06/cog/#comment-45</link>
		<dc:creator>richie</dc:creator>
		<pubDate>Sun, 29 Jun 2008 20:23:17 +0000</pubDate>
		<guid isPermaLink="false">http://cogblog.mirandabanda.org/?p=3#comment-45</guid>
		<description>An excellent paper on security flaws in JIT nativizers by the LSD team can be found at http://www.blackhat.com/presentations/bh-asia-02/LSD/bh-asia-02-lsd-article.pdf. Although it's old (2002), it's still current.</description>
		<content:encoded><![CDATA[<p>An excellent paper on security flaws in JIT nativizers by the LSD team can be found at <a href="http://www.blackhat.com/presentations/bh-asia-02/LSD/bh-asia-02-lsd-article.pdf" rel="nofollow">http://www.blackhat.com/presentations/bh-asia-02/LSD/bh-asia-02-lsd-article.pdf</a>. Although it&#8217;s old (2002), it&#8217;s still current.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: richie</title>
		<link>http://www.mirandabanda.org/cogblog/2008/06/06/cog/#comment-44</link>
		<dc:creator>richie</dc:creator>
		<pubDate>Sun, 29 Jun 2008 18:59:21 +0000</pubDate>
		<guid isPermaLink="false">http://cogblog.mirandabanda.org/?p=3#comment-44</guid>
		<description>On VisualSmalltalk and Contexts:

I don't really know the details. Apparently there is no thisContext in VisualSmalltalk, but you can somehow access the contexts stack using Process, although it's not the same.

However, they do have something like what you describe to cope with BlockClosures:

The prologue of a nativized method somehow depends on the Blocks in the method. In some cases the prologue includes an Array creation, and in these cases locals and temps are accessed with a different set of Bytecodes: LoadContextTemporary/PushContextTemporary/StoreContextTemporary

This newly created Array referencing context slots is somehow called "environment temporaries". You can see the logic of how it's used (at compile time) in ScriptNode &#62;&#62; #rebindTemporaries; at some point it reads:

		info := tempScope bindingFor: t value ifNone: [].
		info isExternallyReferenced
			ifTrue: [
				t
					binding: (self environmentTemporaryBindingClass new
						name: t value
						position: self incEnvTemporaryCount).
				t binding markReferenced]
			ifFalse: [
				t
					binding: (self stackTemporaryBindingClass new
						name: t value
						position: self incStackTemporaryCount)].

(how do you do nicely formatted posts on this blog?)

Most of this is guess work, as you can imagine.

On the security issues:

Using NoFramePrologue was just an example. I'll give you another example now, but the key point is: Security problems can be all over the place, from design down to implementation. Security in VMs is usually dealt with through sandboxing of some sort, but when it comes to a JIT, there are more problems: basically, if an attacker can control VM Bytecodes, he can somehow control what native code is emitted, and, unless specific validation is made, it's quite possible the attacker can escape the VM.

So, another example, from VisualSmalltalk, would be:

(Object &#62;&#62; #methodDictionaryArray:) byteCodeArray: #[16r50 16r9D 0 16r48]

Bytecodes:

16r50 LoadArgument1
16r9D StoreInstanceN 0
16r48 Return

You can use this method to change the methodDictionaryArray of any object. It works because instance variables are indexed from 1, so StoreInstanceN 0 stores at instance variable 0, which is just before the first instance variable, i.e. the methodDictionaryArray of the Object.

In VisualSmalltalk there's already a primitive (97) to do this (which could be protected through some sort of Sandboxing). However, this not only breaks the integrity of objects, it's even worse (tested):

16r12345678 methodDictionaryArray: 'hola amigos!'

will write a pointer to the string 'hola amigos' at absolute memory position (16r12345678*2+1) (*2+1 converts from SmallInteger to the native representation of SmallInteger, tagging it by setting the lowest bit to 1, similar to VisualWorks, but I think VW uses 2 bits instead of 1).

So, here's another example. I could probably find more, but it'll still be for VisualSmalltalk. If you are interested, I could take a look at anything else, but it'll probably take me some time to find something.

Sorry for the long post :(</description>
		<content:encoded><![CDATA[<p>On VisualSmalltalk and Contexts:</p>
<p>I don&#8217;t really know the details. Apparently there is no thisContext in VisualSmalltalk, but you can somehow access the contexts stack using Process, although it&#8217;s not the same.</p>
<p>However, they do have something like what you describe to cope with BlockClosures:</p>
<p>The prologue of a nativized method somehow depends on the Blocks in the method. In some cases the prologue includes an Array creation, and in these cases locals and temps are accessed with a different set of Bytecodes: LoadContextTemporary/PushContextTemporary/StoreContextTemporary</p>
<p>This newly created Array referencing context slots is somehow called &#8220;environment temporaries&#8221;. You can see the logic of how it&#8217;s used (at compile time) in ScriptNode &gt;&gt; #rebindTemporaries; at some point it reads:</p>
<p>		info := tempScope bindingFor: t value ifNone: [].<br />
		info isExternallyReferenced<br />
			ifTrue: [<br />
				t<br />
					binding: (self environmentTemporaryBindingClass new<br />
						name: t value<br />
						position: self incEnvTemporaryCount).<br />
				t binding markReferenced]<br />
			ifFalse: [<br />
				t<br />
					binding: (self stackTemporaryBindingClass new<br />
						name: t value<br />
						position: self incStackTemporaryCount)].</p>
<p>(how do you do nicely formatted posts on this blog?)</p>
<p>Most of this is guess work, as you can imagine.</p>
<p>On the security issues:</p>
<p>Using NoFramePrologue was just an example. I&#8217;ll give you another example now, but the key point is: Security problems can be all over the place, from design down to implementation. Security in VMs is usually dealt with through sandboxing of some sort, but when it comes to a JIT, there are more problems: basically, if an attacker can control VM Bytecodes, he can somehow control what native code is emitted, and, unless specific validation is made, it&#8217;s quite possible the attacker can escape the VM.</p>
<p>So, another example, from VisualSmalltalk, would be:</p>
<p>(Object &gt;&gt; #methodDictionaryArray:) byteCodeArray: #[16r50 16r9D 0 16r48]</p>
<p>Bytecodes:</p>
<p>16r50 LoadArgument1<br />
16r9D StoreInstanceN 0<br />
16r48 Return</p>
<p>You can use this method to change the methodDictionaryArray of any object. It works because instance variables are indexed from 1, so StoreInstanceN 0 stores at instance variable 0, which is just before the first instance variable, i.e. the methodDictionaryArray of the Object.</p>
<p>In VisualSmalltalk there&#8217;s already a primitive (97) to do this (which could be protected through some sort of Sandboxing). However, this not only breaks the integrity of objects, it&#8217;s even worse (tested):</p>
<p>16r12345678 methodDictionaryArray: &#8216;hola amigos!&#8217;</p>
<p>will write a pointer to the string &#8216;hola amigos&#8217; at absolute memory position (16r12345678*2+1) (*2+1 converts from SmallInteger to the native representation of SmallInteger, tagging it by setting the lowest bit to 1, similar to VisualWorks, but I think VW uses 2 bits instead of 1).</p>
<p>So, here&#8217;s another example. I could probably find more, but it&#8217;ll still be for VisualSmalltalk. If you are interested, I could take a look at anything else, but it&#8217;ll probably take me some time to find something.</p>
<p>Sorry for the long post <img src='http://www.mirandabanda.org/cogblog/wp-includes/images/smilies/icon_sad.gif' alt=':(' class='wp-smiley' /></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: richie</title>
		<link>http://www.mirandabanda.org/cogblog/2008/06/06/cog/#comment-42</link>
		<dc:creator>richie</dc:creator>
		<pubDate>Tue, 24 Jun 2008 13:13:54 +0000</pubDate>
		<guid isPermaLink="false">http://cogblog.mirandabanda.org/?p=3#comment-42</guid>
		<description>I'll look up Gilad Bracha's and David Simmons' ideas on security. Do you know if there's any virtual machine (with JIT) implementing any of these ideas? I'm more familiar with VisualSmalltalk's VM/JIT, but willing to take a look at any other.

&lt;em&gt;Yes.  I'm pretty sure David's S# and SmallScript VMs implement all his security mechanisms.  Contact him for details.  He's on &lt;a href="http://www.linkedin.com" rel="nofollow"&gt;LinkedIn&lt;/a&gt; for example.&lt;/em&gt;

I've only given a quick diagonal read to your paper on "Context Management in VisualWorks 5i", and it looks similar to what VisualSmalltalk does (not that I know all VS' details).

&lt;em&gt;Do you have a reference?  I was under the impression that what I'd done for the context proxy mechanism was original.  Does VS even have contexts?&lt;/em&gt;

One thing I can say is that in VisualSmalltalk it is easy to demonstrate what I mean:

(Object &#62;&#62; #boom) byteCodeArray: #[2 19 72]

2   NoFrameProlog
19  PushSmallInteger0
72  Return

This will jump to native memory address 1 (SmallInteger 0). Manipulating this, an attacker can execute arbitrary native code, and that's game over.

&lt;em&gt;Right.  But that requires that NoFrameProlog is in the instruction set.  By bounds checking jumps at JIT-compile time and not providing instructions which allow one to jump to or return to arbitrary integers the bytecode can prevent executing arbitrary code.  For that one would have to use the FFI :)&lt;/em&gt;

This is an example of the kind of native attacks on JITs that are most dangerous.

As I said, I still need to read your paper (have it on my loved kindle now :), and the others you referenced.</description>
		<content:encoded><![CDATA[<p>I&#8217;ll look up Gilad Bracha&#8217;s and David Simmons&#8217; ideas on security. Do you know if there&#8217;s any virtual machine (with JIT) implementing any of these ideas? I&#8217;m more familiar with VisualSmalltalk&#8217;s VM/JIT, but willing to take a look at any other.</p>
<p><em>Yes.  I&#8217;m pretty sure David&#8217;s S# and SmallScript VMs implement all his security mechanisms.  Contact him for details.  He&#8217;s on <a href="http://www.linkedin.com" rel="nofollow">LinkedIn</a> for example.</em></p>
<p>I&#8217;ve only given a quick diagonal read to your paper on &#8220;Context Management in VisualWorks 5i&#8221;, and it looks similar to what VisualSmalltalk does (not that I know all VS&#8217; details).</p>
<p><em>Do you have a reference?  I was under the impression that what I&#8217;d done for the context proxy mechanism was original.  Does VS even have contexts?</em></p>
<p>One thing I can say is that in VisualSmalltalk it is easy to demonstrate what I mean:</p>
<p>(Object &gt;&gt; #boom) byteCodeArray: #[2 19 72]</p>
<p>2   NoFrameProlog<br />
19  PushSmallInteger0<br />
72  Return</p>
<p>This will jump to native memory address 1 (SmallInteger 0). Manipulating this, an attacker can execute arbitrary native code, and that&#8217;s game over.</p>
<p><em>Right.  But that requires that NoFrameProlog is in the instruction set.  By bounds checking jumps at JIT-compile time and not providing instructions which allow one to jump to or return to arbitrary integers the bytecode can prevent executing arbitrary code.  For that one would have to use the FFI <img src='http://www.mirandabanda.org/cogblog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </em></p>
<p>This is an example of the kind of native attacks on JITs that are most dangerous.</p>
<p>As I said, I still need to read your paper (have it on my loved kindle now :), and the others you referenced.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: richie</title>
		<link>http://www.mirandabanda.org/cogblog/2008/06/06/cog/#comment-37</link>
		<dc:creator>richie</dc:creator>
		<pubDate>Fri, 20 Jun 2008 16:14:20 +0000</pubDate>
		<guid isPermaLink="false">http://cogblog.mirandabanda.org/?p=3#comment-37</guid>
		<description>Eliot, this is great!!!

I wish I could be doing it too :)

&lt;em&gt;Hi Richie,

    I certainly hope there will be room for collaborators.  I have to get my act together with a code repository and a bootstrap for a non-Qwaq image.  Once this is done others will be able to get at the code.&lt;/em&gt;


Anyway, two comments:

When you say you are going to stick to a stack interface, I guess this is not 100% strict, and you are willing to go for tradeoffs such as passing receiver (and maybe first argument) in registers (eax and edx maybe?). Answers also in registers (maybe eax), and holding a reference to self in some other register (esi sounds good)? I have not done any performance test, but I think we don't need to do them to demonstrate this gives a great performance boost, and IMHO it doesn't break the stack abstraction too much (you only need to think that the top of the stack is held in eax :)

&lt;em&gt;OK, I'll go into this in more detail in another post but this is a most important point.  The "interface" to the VM is still contexts.  Internally the VM uses a stack for efficiency but this is (almost) completely transparent to the Smalltalk programmer who still sees only contexts.  See &lt;a href="http://www.mirandabanda.org/files/oopsla99-contexts.pdf" rel="nofollow"&gt;this&lt;/a&gt; for a poorly written account of the details.

The calling convention will probably use Peter Deutsch's convention for HPS which passed the receiver and last two arguments in registers.  Using the last two makes the code generator simpler. This means that primitives like at: and at:put: don't even touch the stack.  The registers get pushed on method entry proper with a callee-saves convention.&lt;/em&gt;


On a different side, security is a huge issue, especially when it comes to JITs. For example, a very easy way to break out of the VM on a JIT is to unbalance the stack (when real and VM stacks are shared): if you push a constant and end the method, the return will just jump to the constant. This is a very clear case where arbitrary native code could be executed. Of course, if you are not thinking about security at all, and there's not going to be any sandboxing, maybe the same could be done just by injecting bytecodes.

Bytecode verifiers are one way to constrain this type of attack, but they are not always enough.

I have some experience in security, and I'm willing to follow up on this if you are interested.

&lt;em&gt;I'm very interested.  Please do.  Again much more details when I post on context-to-stack mapping, but I can say a little about the stack organization now.  The VM's Smalltalk stack will be housed on the C stack (created by alloca) but this is effectively sandboxed.  One can only run bytecoded methods on the Smalltalk stack, bytecoded methods have finite stack, and calls on the Smalltalk stack are bounds checked.  So I don't think it's possible to violate security through bytecode.

I'm interested in Gilad Bracha's and David Simmons' ideas for security and would like Cog to be used to implement a secure language (perhaps Newspeak) as well as "vanilla" Smalltalk.  Gilad's security model in Newspeak is based on the bytecode set being secure by design, so that for example a private send bytecode has an implicit receiver so one can't create bytecode that pushes some arbitrary receiver and does a private send to it.  Since I want to make Cog easily configurable w.r.t. bytecode set I hope it'll be easy to use it for e.g. Newspeak.

Eliot&lt;/em&gt;</description>
		<content:encoded><![CDATA[<p>Eliot, this is great!!!</p>
<p>I wish I could be doing it too <img src='http://www.mirandabanda.org/cogblog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p><em>Hi Richie,</p>
<p>    I certainly hope there will be room for collaborators.  I have to get my act together with a code repository and a bootstrap for a non-Qwaq image.  Once this is done others will be able to get at the code.</em></p>
<p>Anyway, two comments:</p>
<p>When you say you are going to stick to a stack interface, I guess this is not 100% strict, and you are willing to go for tradeoffs such as passing receiver (and maybe first argument) in registers (eax and edx maybe?). Answers also in registers (maybe eax), and holding a reference to self in some other register (esi sounds good)? I have not done any performance test, but I think we don&#8217;t need to do them to demonstrate this gives a great performance boost, and IMHO it doesn&#8217;t break the stack abstraction too much (you only need to think that the top of the stack is held in eax <img src='http://www.mirandabanda.org/cogblog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p><em>OK, I&#8217;ll go into this in more detail in another post but this is a most important point.  The &#8220;interface&#8221; to the VM is still contexts.  Internally the VM uses a stack for efficiency but this is (almost) completely transparent to the Smalltalk programmer who still sees only contexts.  See <a href="http://www.mirandabanda.org/files/oopsla99-contexts.pdf" rel="nofollow">this</a> for a poorly written account of the details.</p>
<p>The calling convention will probably use Peter Deutsch&#8217;s convention for HPS which passed the receiver and last two arguments in registers.  Using the last two makes the code generator simpler. This means that primitives like at: and at:put: don&#8217;t even touch the stack.  The registers get pushed on method entry proper with a callee-saves convention.</em></p>
<p>On a different side, security is a huge issue, especially when it comes to JITs. For example, a very easy way to break out of the VM on a JIT is to unbalance the stack (when real and VM stacks are shared): if you push a constant and end the method, the return will just jump to the constant. This is a very clear case where arbitrary native code could be executed. Of course, if you are not thinking about security at all, and there&#8217;s not going to be any sandboxing, maybe the same could be done just by injecting bytecodes.</p>
<p>Bytecode verifiers are one way to constrain this type of attack, but they are not always enough.</p>
<p>I have some experience in security, and I&#8217;m willing to follow up on this if you are interested.</p>
<p><em>I&#8217;m very interested.  Please do.  Again much more details when I post on context-to-stack mapping, but I can say a little about the stack organization now.  The VM&#8217;s Smalltalk stack will be housed on the C stack (created by alloca) but this is effectively sandboxed.  One can only run bytecoded methods on the Smalltalk stack, bytecoded methods have finite stack, and calls on the Smalltalk stack are bounds checked.  So I don&#8217;t think it&#8217;s possible to violate security through bytecode.</p>
<p>I&#8217;m interested in Gilad Bracha&#8217;s and David Simmons&#8217; ideas for security and would like Cog to be used to implement a secure language (perhaps Newspeak) as well as &#8220;vanilla&#8221; Smalltalk.  Gilad&#8217;s security model in Newspeak is based on the bytecode set being secure by design, so that for example a private send bytecode has an implicit receiver so one can&#8217;t create bytecode that pushes some arbitrary receiver and does a private send to it.  Since I want to make Cog easily configurable w.r.t. bytecode set I hope it&#8217;ll be easy to use it for e.g. Newspeak.</p>
<p>Eliot</em></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: tim@rowledge.org</title>
		<link>http://www.mirandabanda.org/cogblog/2008/06/06/cog/#comment-34</link>
		<dc:creator>tim@rowledge.org</dc:creator>
		<pubDate>Wed, 18 Jun 2008 05:52:20 +0000</pubDate>
		<guid isPermaLink="false">http://cogblog.mirandabanda.org/?p=3#comment-34</guid>
		<description>Well, here we hit the complete inability of blog layout to support sensible back and forth debate.

I'm going to try to comment on Eliot's comments on my comments without getting us all lost. GPS doesn't work in here so good luck....

WRT CM format
&lt;i&gt;"I think this is a problem of not seeing the wood for the trees  The problem is not the format of CompiledMethod it is not being able to change it or subclass it"&lt;/i&gt;
Interesting point. Sure, let's make it possible to add instvars to byte objects and word objects. Let's add complexity to the classbuilder and debugger/inspector to support it. Is that actually simpler (remember you claimed to like TSTTCPW) than using the 'normal' class structures we already have? It would take some good arguments to convince me that it is so.

WRT 'in-vm' JIT or 'in-image'
I'm pretty sure that this -
&lt;i&gt;"One important one is that of contexts. Contexts are a superb abstraction for activations, easing the implementation of processes, the debugger and so on. But they suck as a form for real execution"&lt;/i&gt;
does not actually logically imply this -
&lt;i&gt;"So contexts belong in the image and clever tricks to do without contexts most of the time belong in the VM. But that implies that any code to do with hiding contexts belongs in the VM which implies that the lower levels of the code generator are in the VM."&lt;/i&gt;
I don't see why an in-image translator shouldn't be allowed knowledge of the context handling, just as it is obviously allowed knowledge of the machine architecture. It doesn't impact the debugger any more than an in-vm translator since you wouldn't be debugging the generated code.
Further, one potentially very useful form of in-image translator would simply pass the relevant objects to a primitive (pretty much the same code as one might have in an in-vm translator) BUT it would allow the policy of what gets translated and when to be handled in the image. This might be of benefit in not 'wasting' time on translating code that is only run once, or it might allow a choice between a quick-jit and a smart-jit at need. Or it might fail the prim completely on machines where there is no translator plugin yet provided or where the writable memory available on the machine is too small.
Having a separate machine providing translated results does not necessarily mean going over a network connection - it could be another thread on the same machine. Perhaps that would be a way of getting an aggregate speed boost from many-core systems - and of course one could do that with an in-vm translator too.
On some restricted systems it might be that performance or memory limitations prevent the system from being able to run a good translator and passing the work across a network is the better option. And on some other forms of restricted system it might be that there is read-only memory available that could be used to store cached translated code to some benefit.

Just offering some different views from odd angles.

tim</description>
		<content:encoded><![CDATA[<p>Well, here we hit the complete inability of blog layout to support sensible back and forth debate.</p>
<p>I&#8217;m going to try to comment on Eliot&#8217;s comments on my comments without getting us all lost. GPS doesn&#8217;t work in here so good luck&#8230;.</p>
<p>WRT CM format<br />
<i>&#8220;I think this is a problem of not seeing the wood for the trees  The problem is not the format of CompiledMethod it is not being able to change it or subclass it&#8221;</i><br />
Interesting point. Sure, let&#8217;s make it possible to add instvars to byte objects and word objects. Let&#8217;s add complexity to the classbuilder and debugger/inspector to support it. Is that actually simpler (remember you claimed to like TSTTCPW) than using the &#8216;normal&#8217; class structures we already have? It would take some good arguments to convince me that it is so.</p>
<p>WRT &#8216;in-vm&#8217; JIT or &#8216;in-image&#8217;<br />
I&#8217;m pretty sure that this -<br />
<i>&#8220;One important one is that of contexts. Contexts are a superb abstraction for activations, easing the implementation of processes, the debugger and so on. But they suck as a form for real execution&#8221;</i><br />
does not actually logically imply this -<br />
<i>&#8220;So contexts belong in the image and clever tricks to do without contexts most of the time belong in the VM. But that implies that any code to do with hiding contexts belongs in the VM which implies that the lower levels of the code generator are in the VM.&#8221;</i><br />
I don&#8217;t see why an in-image translator shouldn&#8217;t be allowed knowledge of the context handling, just as it is obviously allowed knowledge of the machine architecture. It doesn&#8217;t impact the debugger any more than an in-vm translator since you wouldn&#8217;t be debugging the generated code.<br />
Further, one potentially very useful form of in-image translator would simply pass the relevant objects to a primitive (pretty much the same code as one might have in an in-vm translator) BUT it would allow the policy of what gets translated and when to be handled in the image. This might be of benefit in not &#8216;wasting&#8217; time on translating code that is only run once, or it might allow a choice between a quick-jit and a smart-jit at need. Or it might fail the prim completely on machines where there is no translator plugin yet provided or where the writable memory available on the machine is too small.<br />
Having a separate machine providing translated results does not necessarily mean going over a network connection - it could be another thread on the same machine. Perhaps that would be a way of getting an aggregate speed boost from many-core systems - and of course one could do that with an in-vm translator too.<br />
On some restricted systems it might be that performance or memory limitations prevent the system from being able to run a good translator and passing the work across a network is the better option. And on some other forms of restricted system it might be that there is read-only memory available that could be used to store cached translated code to some benefit.</p>
<p>Just offering some different views from odd angles.</p>
<p>tim</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: John M McIntosh</title>
		<link>http://www.mirandabanda.org/cogblog/2008/06/06/cog/#comment-32</link>
		<dc:creator>John M McIntosh</dc:creator>
		<pubDate>Wed, 18 Jun 2008 00:49:41 +0000</pubDate>
		<guid isPermaLink="false">http://cogblog.mirandabanda.org/?p=3#comment-32</guid>
		<description>&#62;A local server image, or a remote one, or one that caches and can return almost immediately, or even one that submits it to Amazon’s mechanical turk could be tried.

Oooh. Mmm, I wonder about caching at the central server?  I wonder whether, if you send hashes of the method, it could send back the cached compiled code; otherwise you send the code and get back the compiled version. Obviously you can decide between using a common pool, or an agreed upon encryption for private usage.

Then for base squeak you could pre-load cached compiled code at startup time since global usage dictates a certain pattern

&lt;em&gt;There is *no way* in which copying generated code over the network is going to beat generating fast code in a VM on the fly.  No *way*.  Have you looked at start-up times for VisualWorks?  Note that even though the images Isaac Gouy runs on his shootout site take about 0.2 seconds to start up, that time is dominated by walking the heap doing things like voiding old graphics handles, not by generating code.  At least last time I looked compile time wasn't the dominating factor.  JITs can be very fast.  Peter Deutsch's figures were always around 20 machine instructions executed to generate a machine instruction.  Even at one, two or three orders of magnitude worse than that you're going to slaughter copying code around the network.  Fugedabahtid.

Eliot&lt;/em&gt;</description>
		<content:encoded><![CDATA[<p>&gt;A local server image, or a remote one, or one that caches and can return almost immediately, or even one that submits it to Amazon’s mechanical turk could be tried.</p>
<p>Oooh. Mmm, I wonder about caching at the central server?  I wonder whether, if you send hashes of the method, it could send back the cached compiled code; otherwise you send the code and get back the compiled version. Obviously you can decide between using a common pool, or an agreed upon encryption for private usage. </p>
<p>Then for base squeak you could pre-load cached compiled code at startup time since global usage dictates a certain pattern</p>
<p><em>There is *no way* in which copying generated code over the network is going to beat generating fast code in a VM on the fly.  No *way*.  Have you looked at start-up times for VisualWorks?  Note that even though the images Isaac Gouy runs on his shootout site take about 0.2 seconds to start up, that time is dominated by walking the heap doing things like voiding old graphics handles, not by generating code.  At least last time I looked compile time wasn&#8217;t the dominating factor.  JITs can be very fast.  Peter Deutsch&#8217;s figures were always around 20 machine instructions executed to generate a machine instruction.  Even at one, two or three orders of magnitude worse than that you&#8217;re going to slaughter copying code around the network.  Fugedabahtid.</p>
<p>Eliot</em></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: tim@rowledge.org</title>
		<link>http://www.mirandabanda.org/cogblog/2008/06/06/cog/#comment-31</link>
		<dc:creator>tim@rowledge.org</dc:creator>
		<pubDate>Tue, 17 Jun 2008 20:15:13 +0000</pubDate>
		<guid isPermaLink="false">http://cogblog.mirandabanda.org/?p=3#comment-31</guid>
		<description>Hi there-
so nice to see that you've got a chance to *actually* improve the squeak vm. After a decade of attempts to get something moving I was about to just give up and go do something else. Now perhaps it's worth hanging around to see if I can help or heckle to some effect.

I'd like to urge that you dump the stupid old compiled method format. After all, if you're anticipating changing the definition of some bytecode and adding closures properly (at last!) then the image really isn't going to be compatible with older vms. Besides, the current CM is total crap. It's a bad idea from top to bottom and the hacks added in recent times, such as the properties array forced into the literals which are in turn etc etc just make it worse and worse. Clean it out! Traits broke the format as well, so far as the VM is concerned. So please, let's fix it properly and make the GC code simpler as well. Your great grandchildren will thank you.

Many years ago, Ian P was very excited about the performance enhancing possibilities of a cleaner CM design, claiming that having the bytecodes and literals as separate arrays would drastically simplify a translator. I'm not entirely sure I understood his arguments and of course they never had any practical value because he never finished anything that we could look at.

I think something like
Object subclass: #DecentCompiledMethod
       instanceVariableNames: 'header bytecodes literals properties'
       classVariableNames: '  '
       poolDictionaries: ''
       category: 'Kernel-Methods'
would be nearer the mark. I mean, just look at the current implementation of #methodClass. Or that old crap of fileIndex etc. Blech. Kill it.

&lt;em&gt;I think this is a problem of not seeing the wood for the trees :)  The problem is not the format of CompiledMethod; it is not being able to change it or subclass it.  But these limitations, not being able to add inst vars to CompiledMethod and not being able to subclass CompiledMethod, are restrictions that only apply in the 16-bit BlueBook VM.  The current Squeak VM is more than capable of allowing CompiledMethod to be subclassed and to add inst vars to CompiledMethod and subclasses.  It'll take a bit of work in the ClassBuilder but no work in the VM.  I have a post prepared on this but it is in the queue behind the next Closures post which is itself held up by my not being able (yet) to post the code.  I don't want the posts to get too bogged down in detail so they're not the avenue through which to publish code (no one's going to copy/paste code out of a blog post anyway!).  Perhaps I should just post the message out of sequence.&lt;/em&gt;

Were you anticipating doing the translation under the covers within the vm as hps, or within the image? Or some other clever hack? I quite like the in-image idea as a general concept since it allows for a lot of flexibility. For example Bryce has had quite a lot of success writing a compiler that runs in the background, which is the 'obvious' approach. I'd like to see if having a remote compiler machine that you can send the method details to and get back a finished object might work. A local server image, or a remote one, or one that caches and can return almost immediately, or even one that submits it to Amazon's mechanical turk could be tried. Another option would be to implement the translation as whatever needed mix of image code and plugin primitives; that would make it very easy to have a flexible system configurable at run time.

&lt;em&gt;I like keeping as much stuff in the image as possible.  But there are abstraction boundaries to maintain to keep a clean design.  One important one is that of contexts.  Contexts are a superb abstraction for activations, easing the implementation of processes, the debugger and so on.  But they suck as a form for real execution.  So contexts belong in the image and clever tricks to do without contexts most of the time belong in the VM.  But that implies that any code to do with hiding contexts belongs in the VM which implies that the lower levels of the code generator are in the VM.

I favour an architecture that has an adaptive optimiser/speculative inliner up in the image which analyses bytecoded methods and execution state through contexts and which creates new bytecoded methods which inline others.  In this architecture the optimizer targets portable bytecode and the VM's code generator still has the responsibility of converting this to a particular machine code and of mapping back from stack frames and machine code PCs to contexts and bytecode PCs.

I also don't believe that code generation is so slow that one would ever benefit from cacheing generated code.  In fact I think that would slow things down.  Note that the in-image adaptive optimizer has no problem cacheing optimized methods up in the image, and that generating efficient code from longer optimized methods in the VM is still something that could happen relatively quickly given a cheap and dirty register allocator.  My hunch is that one might be a factor of two slower than a slower aggressive optimizer but one wouldn't be so far away that you'd throw up your hands in disgust and go back to a static language.  One only has to do significantly better than the state-of-the-art in dynamic language VMs to add value.  One doesn't have to beat fortran.

Cheers!
Eliot&lt;/em&gt;

tim
--
tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim
Useful random insult:- One chicken short of a henhouse.</description>
		<content:encoded><![CDATA[<p>Hi there-<br />
so nice to see that you&#8217;ve got a chance to *actually* improve the squeak vm. After a decade of attempts to get something moving I was about to just give up and go do something else. Now perhaps it&#8217;s worth hanging around to see if I can help or heckle to some effect.</p>
<p>I&#8217;d like to urge that you dump the stupid old compiled method format. After all, if you&#8217;re anticipating changing the definition of some bytecode and adding closures properly (at last!) then the image really isn&#8217;t going to be compatible with older vms. Besides, the current CM is total crap. It&#8217;s a bad idea from top to bottom and the hacks added in recent times, such as the properties array forced into the literals which are in turn etc etc just make it worse and worse. Clean it out! Traits broke the format as well, so far as the VM is concerned. So please, let&#8217;s fix it properly and make the GC code simpler as well. Your great grandchildren will thank you.</p>
<p>Many years ago, Ian P was very excited about the performance enhancing possibilities of a cleaner CM design, claiming that having the bytecodes and literals as separate arrays would drastically simplify a translator. I&#8217;m not entirely sure I understood his arguments and of course they never had any practical value because he never finished anything that we could look at.</p>
<p>I think something like<br />
Object subclass: #DecentCompiledMethod<br />
       instanceVariableNames: 'header bytecodes literals properties'<br />
       classVariableNames: '  '<br />
       poolDictionaries: ''<br />
       category: 'Kernel-Methods'<br />
would be nearer the mark. I mean, just look at the current implementation of #methodClass. Or that old crap of fileIndex etc. Blech. Kill it.</p>
<p><em>I think this is a problem of not seeing the wood for the trees <img src='http://www.mirandabanda.org/cogblog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  The problem is not the format of CompiledMethod; it is not being able to change it or subclass it.  But these limitations, not being able to add inst vars to CompiledMethod and not being able to subclass CompiledMethod, are restrictions that only apply in the 16-bit BlueBook VM.  The current Squeak VM is more than capable of allowing CompiledMethod to be subclassed and to add inst vars to CompiledMethod and subclasses.  It&#8217;ll take a bit of work in the ClassBuilder but no work in the VM.  I have a post prepared on this but it is in the queue behind the next Closures post which is itself held up by my not being able (yet) to post the code.  I don&#8217;t want the posts to get too bogged down in detail so they&#8217;re not the avenue through which to publish code (no one&#8217;s going to copy/paste code out of a blog post anyway!).  Perhaps I should just post the message out of sequence.</em></p>
<p>Were you anticipating doing the translation under the covers within the vm as hps, or within the image? Or some other clever hack? I quite like the in-image idea as a general concept since it allows for a lot of flexibility. For example Bryce has had quite a lot of success writing a compiler that runs in the background, which is the &#8216;obvious&#8217; approach. I&#8217;d like to see if having a remote compiler machine that you can send the method details to and get back a finished object might work. A local server image, or a remote one, or one that caches and can return almost immediately, or even one that submits it to Amazon&#8217;s mechanical turk could be tried. Another option would be to implement the translation as whatever needed mix of image code and plugin primitives; that would make it very easy to have a flexible system configurable at run time.</p>
<p><em>I like keeping as much stuff in the image as possible.  But there are abstraction boundaries to maintain to keep a clean design.  One important one is that of contexts.  Contexts are a superb abstraction for activations, easing the implementation of processes, the debugger and so on.  But they suck as a form for real execution.  So contexts belong in the image and clever tricks to do without contexts most of the time belong in the VM.  But that implies that any code to do with hiding contexts belongs in the VM which implies that the lower levels of the code generator are in the VM.</p>
<p>I favour an architecture that has an adaptive optimiser/speculative inliner up in the image which analyses bytecoded methods and execution state through contexts and which creates new bytecoded methods which inline others.  In this architecture the optimizer targets portable bytecode and the VM&#8217;s code generator still has the responsibility of converting this to a particular machine code and of mapping back from stack frames and machine code PCs to contexts and bytecode PCs.</p>
<p>I also don&#8217;t believe that code generation is so slow that one would ever benefit from cacheing generated code.  In fact I think that would slow things down.  Note that the in-image adaptive optimizer has no problem cacheing optimized methods up in the image, and that generating efficient code from longer optimized methods in the VM is still something that could happen relatively quickly given a cheap and dirty register allocator.  My hunch is that one might be a factor of two slower than a slower aggressive optimizer but one wouldn&#8217;t be so far away that you&#8217;d throw up your hands in disgust and go back to a static language.  One only has to do significantly better than the state-of-the-art in dynamic language VMs to add value.  One doesn&#8217;t have to beat fortran.</p>
<p>Cheers!<br />
Eliot</em></p>
<p>tim<br />
&#8211;<br />
tim Rowledge; <a href="mailto:tim@rowledge.org">tim@rowledge.org</a>; <a href="http://www.rowledge.org/tim" rel="nofollow">http://www.rowledge.org/tim</a><br />
Useful random insult:- One chicken short of a henhouse.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jecel Assumpção Jr</title>
		<link>http://www.mirandabanda.org/cogblog/2008/06/06/cog/#comment-23</link>
		<dc:creator>Jecel Assumpção Jr</dc:creator>
		<pubDate>Fri, 13 Jun 2008 22:15:35 +0000</pubDate>
		<guid isPermaLink="false">http://cogblog.mirandabanda.org/?p=3#comment-23</guid>
		<description>Very cool project! I have designed a few bytecodes with a prefix instruction. Since these were meant for hardware implementation the literals were mixed with the instructions (duplicate literals are not as common in Self-like implementations as in Smalltalk-80) but perhaps some of the ideas might be useful to you.

&lt;em&gt;Thanks, Jecel.

Yes, I did prefixes in VisualWorks along with Steve Dahl (he the bytecode compiler, me the VM).  It works well.  With short-form and long-form bytecodes the prefixes don't have to apply to the short forms, only the long forms.  So the overhead is reduced.  I'll be presenting a prefix-based bytecode set sometime, probably in a few weeks.

Eliot&lt;/em&gt;</description>
		<content:encoded><![CDATA[<p>Very cool project! I have designed a few bytecodes with a prefix instruction. Since these were meant for hardware implementation the literals were mixed with the instructions (duplicate literals are not as common in Self-like implementations as in Smalltalk-80) but perhaps some of the ideas might be useful to you.</p>
<p><em>Thanks, Jecel.</p>
<p>Yes, I did prefixes in VisualWorks along with Steve Dahl (he the bytecode compiler, me the VM).  It works well.  With short-form and long-form bytecodes the prefixes don&#8217;t have to apply to the short forms, only the long forms.  So the overhead is reduced.  I&#8217;ll be presenting a prefix-based bytecode set sometime, probably in a few weeks.</p>
<p>Eliot</em></p>
]]></content:encoded>
	</item>
</channel>
</rss>
