Ravenex

Niner since 2009


  • Inside SPUR - A Trace-Based JIT Compiler for CIL

    Cool, once again a managed implementation of JScript is faster than the "current" version of IE's native one. When Managed JScript was still ongoing, it was faster than IE8's interpreter-based JScript.

    So SPUR doesn't yet support nested trace trees. I wonder if they will consider the trace scheme used in LuaJIT 2, which doesn't use trace trees but still supports nested loops well. FYI http://lua-users.org/lists/lua-l/2009-11/msg00089.html

    The possibility of being able to trace all the way into SPUR itself is also very interesting. That would evolve into another metacircular VM. Hopefully it'll come true some day.

  • IE 9: First look at the new JS Engine

    What I'm expecting to see is that IE9 will implement ES5, as the result of the "Harmony" of ECMAScript. If that happens, we'll be getting a few more modern JavaScript features.

  • Jimmy Schementi: Inside IronRuby

    Managed JScript is probably not gonna show up anymore. There seems to be a discussion on the topic on DLR's CodePlex site, and the message is something like "it's lagging too far behind, there's nothing in there that's useful enough, except for the parser, but there are alternative JavaScript parsers implemented in .NET already, so even though the team thought about open sourcing it, they gave up". The DLR has changed too much since the last drop we got with the Silverlight SDK (was it 0.3.0? I can't remember the exact version number). While IronPython and IronRuby kept up to date with the DLR, Managed JScript just stopped somewhere in the middle. Which is of course sad news to hear...

  • Ian Carmichael: The History and Future of the CLR

    Thanks for clarifying the misunderstanding, Charles :)

    Well...Walter Bright of Digital Mars doesn't seem to like the idea of having a VM/IL, either. And he designed the D language to have GC without a VM. The interview on CLR4 security changes mentioned you can do unsafe things with native code anyway, so VMs don't really buy you that last bit of security; that's the OS's job.

    My take is that IL is just like what would have been the intermediate representation between the front ends and back ends of a conventional compiler; in other words, you're just splitting a compiler into two parts, with some of the machine-independent parts packed into a "compiler from source to IL", and the remaining machine-independent parts plus the machine-dependent parts packed into the VM. Then of course you can interpret IL, but in systems like the CLR, the JIT compiler is what would have been the back end of a conventional compiler, with compilation time taken into account -- time-costly optimizations can't be performed because they'd hurt startup perf.

    So deploying a managed app sounds like...distributing the sources (MSIL), and making sure a compiler (CLR) for the platform is installed. And when the user runs the managed app, the CLR compiles the app and runs it. So for M source languages with N target platforms, you'd only have to write M compilers that target MSIL, and N CLR implementations, one for each platform; that's M+N instead of M*N in a traditional compiler architecture without a standard intermediate representation. JITting also has the advantage of being able to dynamically link libraries and generate calls into them with less indirection, because the JITter knows all the methods' entry points; NGEN can't do that, so its throughput is a bit worse than JITted code's. That's why they had the troublesome hardbinding in NGEN before CLR4.
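
    To make the M+N point concrete, here's a toy sketch. Everything in it is made up for illustration: three tiny "source languages" (infix, postfix and prefix arithmetic), a shared stack-based "IL", and two "platforms". It has nothing to do with real MSIL; it only shows that each front end and each back end is written once, and any pairing works.

```python
# Hypothetical toy setup: all names are invented for illustration.
# The shared "IL" is a list of (opcode, operand) tuples for a stack machine.

OPS = {"+": "add", "*": "mul"}

def compile_infix(src):
    """Front end 1: '2 + 3' -> IL."""
    left, op, right = src.split()
    return [("push", int(left)), ("push", int(right)), (OPS[op], None)]

def compile_postfix(src):
    """Front end 2: '2 3 +' -> IL."""
    return [(OPS[t], None) if t in OPS else ("push", int(t)) for t in src.split()]

def compile_prefix(src):
    """Front end 3: '+ 2 3' -> IL."""
    op, left, right = src.split()
    return [("push", int(left)), ("push", int(right)), (OPS[op], None)]

def run_on_interpreter(il):
    """Back end 1: interpret the IL directly."""
    stack = []
    for opcode, operand in il:
        if opcode == "push":
            stack.append(operand)
        else:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b if opcode == "add" else a * b)
    return stack.pop()

def run_via_codegen(il):
    """Back end 2: translate the IL to a Python expression, then evaluate it."""
    stack = []
    for opcode, operand in il:
        if opcode == "push":
            stack.append(str(operand))
        else:
            b, a = stack.pop(), stack.pop()
            stack.append(f"({a} {'+' if opcode == 'add' else '*'} {b})")
    return eval(stack.pop())  # input is generated from ints above, not user text

FRONT_ENDS = {"infix": compile_infix, "postfix": compile_postfix, "prefix": compile_prefix}
BACK_ENDS = {"interp": run_on_interpreter, "codegen": run_via_codegen}

def run(lang, target, src):
    """Any front end composes with any back end through the shared IL."""
    return BACK_ENDS[target](FRONT_ENDS[lang](src))

components_written = len(FRONT_ENDS) + len(BACK_ENDS)  # M + N pieces of code
pairings_covered = len(FRONT_ENDS) * len(BACK_ENDS)    # M * N language/platform pairs
```

    With only three languages and two platforms the saving is small (5 components cover 6 pairings), but the gap grows quickly as either side gets bigger.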

    And generally bytecodes are easier to verify than native code. That buys you verifiability, which is important for the JVM and the CLR. But bytecodes are actually "harder to verify" than source code, because source languages tend to have more restrictions than their corresponding bytecodes.

    Ah, and speaking of compilers...NGEN uses the same JIT compilers as the CLR does, but in ahead-of-time compilation mode. NGEN is slow because it compiles everything, while JITting just compiles what's been invoked, so the compilation time is amortized and affordable. If you add up the total time to JIT everything used in a managed program, it'll probably take the same amount of time as NGEN does...
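
    A rough sketch of that difference, with a made-up model where "compiling" a method just bumps a counter. The class names (ToyCompiler, JitRuntime, AotImage) are hypothetical and don't correspond to anything in the CLR; the point is only that the lazy runtime pays for what gets invoked, while the ahead-of-time image pays for everything up front.

```python
class ToyCompiler:
    """Counts how many methods it has been asked to compile."""
    def __init__(self):
        self.compiled_count = 0

    def compile(self, name, body):
        self.compiled_count += 1
        return body  # stand-in for generated native code


class JitRuntime:
    """Compiles a method the first time it is invoked, then reuses the result."""
    def __init__(self, methods, compiler):
        self.methods = methods
        self.compiler = compiler
        self.code_cache = {}

    def invoke(self, name, *args):
        if name not in self.code_cache:
            self.code_cache[name] = self.compiler.compile(name, self.methods[name])
        return self.code_cache[name](*args)


class AotImage:
    """Compiles every method up front, whether or not it is ever called."""
    def __init__(self, methods, compiler):
        self.code = {name: compiler.compile(name, body) for name, body in methods.items()}

    def invoke(self, name, *args):
        return self.code[name](*args)


methods = {
    "main": lambda: "hello",
    "rarely_used": lambda: "report",
    "never_used": lambda: "legacy",
}

jit_compiler, aot_compiler = ToyCompiler(), ToyCompiler()
jit = JitRuntime(methods, jit_compiler)
aot = AotImage(methods, aot_compiler)

jit.invoke("main")
jit.invoke("main")   # second call hits the code cache, no recompilation
aot.invoke("main")
```

    After this run the lazy runtime has compiled one method and the ahead-of-time image has compiled all three. If the program eventually touched every method, the two counters would end up equal.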

  • Ian Carmichael: The History and Future of the CLR

    Lars has pretty good experience in VMs, and he has worked on JVMs, which are quite similar to the CLR in general, so I'm sure he knows enough about bytecodes/IL. I wonder if Charles read my reply in the other thread (CLR4 debugging and profiling API). I was trying to say, when Charles started the topic on IL, Lars was probably still in the context of V8/JavaScript, so he said what he said, and it made perfect sense. Having bytecodes in V8 wouldn't have helped much in keeping the VM fast and simple. And you'd have to "check twice" if you had bytecodes as a wire format for JavaScript. Charles was in the context of the CLR or VMs more generally, so the two guys really weren't arguing about the same thing...

    Going from source to bytecodes is much easier than getting straight to native code. A lot of JavaScript VMs are implemented as interpreters, because interpreters are easier to build. So they interpret bytecodes, and that's the only execution mode they have. V8 also has only one execution mode, and that's compiling straight to native code.

    But there are VMs, like TraceMonkey, that use an adaptive compilation system: they use bytecode for startup and profiling/tracing, and hot spots get compiled/optimized into native code. That's having two execution modes (or more) in the same system, which leads to complications in design and implementation. Lars had worked on such adaptive compilation systems before, so he knows why that's not suitable for the goal of V8 -- build a fast and simple JavaScript VM in a short amount of time.
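
    Here's a minimal sketch of the two-execution-mode idea. It is not how TraceMonkey works (that one records traces through loops); this toy just counts calls per function, and the bytecode format, the HOT_THRESHOLD value and all the names are assumptions made up for the example. Even at this size you can see the extra state the VM has to carry around.

```python
HOT_THRESHOLD = 3  # arbitrary, hypothetical value
SYMBOLS = {"add": "+", "mul": "*"}

def interpret(bytecode, x):
    """Mode 1: walk the bytecode one instruction at a time."""
    acc = x
    for op, arg in bytecode:
        if op == "add":
            acc += arg
        elif op == "mul":
            acc *= arg
        else:
            raise ValueError(f"unknown opcode {op!r}")
    return acc

def compile_to_closure(bytecode):
    """Mode 2 stand-in: fold the whole bytecode into one Python function."""
    expr = "x"
    for op, arg in bytecode:
        expr = f"({expr} {SYMBOLS[op]} {arg})"
    return eval(f"lambda x: {expr}")

class AdaptiveFunction:
    """Starts in the interpreter and switches to compiled code once it is hot."""
    def __init__(self, bytecode):
        self.bytecode = bytecode
        self.call_count = 0
        self.compiled = None

    def __call__(self, x):
        if self.compiled is not None:
            return self.compiled(x)          # fast path
        self.call_count += 1                 # profiling in the slow path
        if self.call_count >= HOT_THRESHOLD:
            self.compiled = compile_to_closure(self.bytecode)
        return interpret(self.bytecode, x)

f = AdaptiveFunction([("add", 1), ("mul", 2)])
results = [f(3) for _ in range(5)]  # first 3 calls interpreted, the rest compiled
```

    Both modes have to agree on every result, which is exactly the kind of complication a single-mode design like V8's avoids.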

  • CLR 4: Debugging and Profiling API Enhancements

    Hey Charles,

    I guess when you were arguing about IL with Lars at Lang.NET 2009 and Lars said it's "not necessary to have IL", he wasn't talking about IL on all VMs in general, but was focused on JavaScript: JavaScript has one and only one wire format, and that is the source code. By the spec, JavaScript programs are only defined in terms of source code, and when you distribute JavaScript programs, you distribute them in source form. Every compatible JavaScript implementation should spit out the source code of a function when you call toString() on the function object. So the ability to deal with source code is a must-have in every JavaScript implementation. Within the VM itself, it can use whatever internal representation of the program it likes, and some use bytecodes, much like MSIL. What's different is that MSIL is the wire format for .NET programs, and those bytecodes in JavaScript VMs aren't.

    When you distribute managed programs, the CLR verifies MSIL to make sure it's valid (in some environment settings you're not allowed to run unverifiable code). This is important because otherwise no one can guarantee that the program in MSIL is what it was in C# or VB, or any other source form -- some bad guy could have just made up rogue programs directly in MSIL.

    If JavaScript had IL as a wire format, then any JavaScript VM would have to do the same verifications as the CLR does, to make sure that the IL is valid. And remember JavaScript already has source form as its wire format. So whenever a JavaScript VM receives source code to run, it has to parse the source code and do some checking before generating IL (check No.1), and then before executing the IL it checks again (check No.2). That's what Lars was saying about "you'll have to check twice if you had IL (as a wire format for JavaScript)".

    So it's not that IL is bad in a general sense, it just doesn't fit into the JavaScript model of wire format, and that's it.

    I read about this in a blog, here: http://rednaxelafx.javaeye.com/blog/382429. It actually gives an example of what happens if a language has IL as one of its wire formats but doesn't verify the IL before executing it (in that post the example is CPython). It's in Chinese, but maybe we could get him to translate it into English sometime later...
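
    A toy illustration of the two checks, assuming a made-up stack IL with just push/add/ret and a made-up source language of the form '1 + 2 + 3'. None of this reflects real MSIL verification rules; it only shows why IL arriving over the wire can't be trusted just because a well-behaved front end would never have produced it.

```python
# Hypothetical toy IL: ("push", n), ("add", None), ("ret", None).

def compile_source(src):
    """Check No.1: validate the source while parsing, then emit IL."""
    tokens = src.split()
    operands, operators = tokens[0::2], tokens[1::2]
    if (len(tokens) % 2 == 0
            or not all(t.isdigit() for t in operands)
            or any(t != "+" for t in operators)):
        raise SyntaxError(f"bad source: {src!r}")
    il = [("push", int(operands[0]))]
    for t in operands[1:]:
        il += [("push", int(t)), ("add", None)]
    return il + [("ret", None)]

def verify_il(il):
    """Check No.2: simulate stack depth so execution can never underflow."""
    if not il or il[-1][0] != "ret":
        raise ValueError("IL must end with ret")
    depth = 0
    for i, (op, _) in enumerate(il):
        if op == "push":
            depth += 1
        elif op == "add":
            if depth < 2:
                raise ValueError(f"stack underflow at instruction {i}")
            depth -= 1
        elif op == "ret":
            if depth != 1 or i != len(il) - 1:
                raise ValueError(f"bad ret at instruction {i}")
        else:
            raise ValueError(f"unknown opcode {op!r}")

def execute(il):
    """Runs IL with no checks of its own; it relies on the verifier."""
    stack = []
    for op, arg in il:
        if op == "push":
            stack.append(arg)
        elif op == "add":
            stack.append(stack.pop() + stack.pop())
        elif op == "ret":
            return stack.pop()

def run_il(il):
    """IL as a wire format: it must be verified before it is executed."""
    verify_il(il)
    return execute(il)

good_il = compile_source("1 + 2 + 3")     # passed check No.1 on the way in
rogue_il = [("add", None), ("ret", None)]  # hand-written, never saw a parser
```

    Here run_il(good_il) goes through, while run_il(rogue_il) is rejected by the verifier before execute ever sees it. A VM that accepts both source and IL needs both checks; one that only accepts source needs only the first.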



  • Vance Morrison: CLR Through the Years

    Vance mentioned that there's little opportunity for the CLR itself to leverage multiple cores/processors other than in the GC. But what about the JITter? If there are enough processing resources free, wouldn't it be nice to let some background thread collect profile information and feed it back to the JITter, so that after code pitching happens the JITter could produce better code? Or, put another way, why not make the Execution Engine more adaptive? And how adaptive is it nowadays in CLR 4?