Coffeehouse Post

View Thread: Sell your PowerPoint stocks

    Bass wrote:

    Bytecodes don't necessarily translate to better performance: people overrate the amount of effort it takes a modern computer to parse a few kilobytes of text.

    And yet browser vendors resort to dirty tricks to get smaller benefits - like pre-emptive TCP/SSL negotiations and DNS lookups, precisely to get millisecond speedups at the start of the page. Connection: keep-alive, Content-Encoding: gzip/deflate and SPDY are other examples where minor speed benefits warranted major changes in the protocol - so I don't believe that the "parsing cost is negligible" argument is, or ever has been, valid for the web.

    Secondly - have you ever tried to actually parse HTML? It's a really horrendous language to parse. The tokeniser takes O(n) on the text (including every comment and all whitespace), and tokens are maybe five or six times longer than a bytecode equivalent - so tokenising is, right out of the gate, about five times slower than reading a bytecode representation.
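
    As a rough illustration of the size argument, the Python sketch below counts the bytes spent on textual tags and compares that with one byte per tag. The one-byte opcode is purely an assumption made for this comparison (no such bytecode format is specified anywhere in this thread), and attributes and text content are ignored.

```python
import re

# Matches simple start and end tags such as <div> or </span>.
# Comments, doctype and CDATA are deliberately not handled in this sketch.
TAG = re.compile(r"</?[A-Za-z][^>]*>")


def tag_bytes_vs_opcodes(html):
    """Return (bytes of tag text, bytes if each tag were a 1-byte opcode).

    The 1-byte-per-tag figure is a hypothetical encoding, used only to
    show the order of magnitude of the difference.
    """
    tags = TAG.findall(html)
    text_bytes = sum(len(t.encode("utf-8")) for t in tags)
    opcode_bytes = len(tags)
    return text_bytes, opcode_bytes
```

    For `<div><span>hi</span></div>` this gives 24 bytes of tag text against 4 hypothetical opcode bytes, which is in line with the "five or six times" estimate above.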

    Next, there's the huge cost of turning these bytes into a DOM. This is expensive because HTML is badly designed.

    The number of stupid rules in HTML parsing is one of the key reasons why sites don't work cross-browser for free. Rules like <b><p>foo<p>bar</b> turning into <b><p>foo</p><p>bar</p></b> and <b id="_1"><p><i id="_2">foo</b></i><p>bar</b> turning into <p><b id="_1"><i id="_2">foo</i></b></p><p><b id="_1">bar</b></p> are cases in point.
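
    To make the first of those rules concrete, here is a minimal Python sketch of implied end tags, built on the standard library's `html.parser`. It is not the real HTML tree-construction algorithm: it only handles "a new <p> closes an open <p>" and "an end tag closes everything opened inside it", it drops attributes, and it does not attempt the mis-nested formatting case from the second example. The class and function names are made up for this illustration.

```python
from html.parser import HTMLParser


class ImpliedEndTagNormalizer(HTMLParser):
    """Re-emits markup with the end tags a browser would imply (simplified)."""

    def __init__(self):
        super().__init__()
        self.open_elements = []  # stack of currently open tag names
        self.out = []            # pieces of normalized markup

    def _close_until(self, tag):
        # Pop and close elements until (and including) the named one.
        while self.open_elements:
            top = self.open_elements.pop()
            self.out.append(f"</{top}>")
            if top == tag:
                break

    def handle_starttag(self, tag, attrs):
        # Simplified rule: a new <p> implicitly ends any <p> still open.
        if tag == "p" and "p" in self.open_elements:
            self._close_until("p")
        self.open_elements.append(tag)
        self.out.append(f"<{tag}>")  # attributes are dropped in this sketch

    def handle_endtag(self, tag):
        # Closing an outer element implicitly closes everything inside it.
        if tag in self.open_elements:
            self._close_until(tag)
        # Stray end tags with no matching open element are ignored.

    def handle_data(self, data):
        self.out.append(data)


def normalize(markup):
    parser = ImpliedEndTagNormalizer()
    parser.feed(markup)
    parser.close()
    # Close anything still open at end of input.
    while parser.open_elements:
        parser.out.append(f"</{parser.open_elements.pop()}>")
    return "".join(parser.out)
```

    With this sketch, `normalize("<b><p>foo<p>bar</b>")` yields `<b><p>foo</p><p>bar</p></b>`. Even this toy version needs a stack and per-tag special cases, and a real parser has many more of them.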

    And the redundancy in HTML is appalling. If you want a red bit of text you can use <font color="red">, <span style="color:red">, <span class="redcolor"> or <div style="display:inline-block;color:red">. Want to put in a quote? How about using <blockquote>, <q>, <span style="font-style:italic">, <div style="font-style:italic;display:inline-block"> or even <div class="quotestyle_class">?

    A consequence of this is that browsers don't actually render the DOM. They convert it first into a pseudo-DOM (Mozilla calls it a flow-DOM, Microsoft calls it the render-tree) which is normalized to divs with styles - but because JavaScript interacts with the non-flow DOM, the browser spends loads of time after each JavaScript interaction having to regenerate the (now invalidated) flow-DOM.
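
    The invalidation cost can be shown with a toy model in Python. Everything here (the `Page` class, its methods, the shape of the "render tree") is a hypothetical stand-in and not how any browser is actually implemented; the point is only that a derived structure has to be rebuilt whenever a script writes to the DOM and then reads something that depends on layout.

```python
class Page:
    """Toy model: a DOM (list of strings) plus a derived, cached render tree."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self._render_tree = None
        self.rebuilds = 0  # how many times the derived tree was regenerated

    def set_text(self, index, text):
        # A script wrote to the DOM, so the derived tree is now stale.
        self.nodes[index] = text
        self._render_tree = None

    def render_tree(self):
        # Rebuild lazily, only when someone needs an up-to-date tree.
        if self._render_tree is None:
            self.rebuilds += 1
            self._render_tree = [("box", n) for n in self.nodes]
        return self._render_tree


def interleaved(page):
    # Write, read, write, read...: every read forces a rebuild.
    for i in range(len(page.nodes)):
        page.set_text(i, "x")
        page.render_tree()
    return page.rebuilds


def batched(page):
    # All writes first, then one read: a single rebuild.
    for i in range(len(page.nodes)):
        page.set_text(i, "x")
    page.render_tree()
    return page.rebuilds
```

    In this model, three interleaved write/read pairs cause three rebuilds, while batching the same three writes before one read causes one.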

    JavaScript is another example of the web just frankly getting it wrong. Browsers are today squeezing performance out of JavaScript by inferring types (JITs always prefer strong types because then you can do an ADD instead of a CALL _javascriptThunkVariableAdd for 1+2). But if JavaScript were strongly typed to start off with, you wouldn't need to infer the types in the first place - saving valuable milliseconds at the start of each page.
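
    A small Python sketch of that idea follows. It is only an analogy: `generic_add`, `int_add` and `specialize_add` are invented names, and a real JIT emits machine code rather than picking between Python functions. The generic path checks operand types on every call; the specialised path is chosen after observing the types, and still needs a guard in case the guess turns out wrong.

```python
def generic_add(a, b):
    """Slow path: inspect operand types on every call, loosely like JS '+'.

    Only ints and strings are considered in this sketch.
    """
    if isinstance(a, str) or isinstance(b, str):
        return str(a) + str(b)
    return a + b


def int_add(a, b):
    """Fast path: only valid when both operands are known to be ints."""
    return a + b


def specialize_add(sample_a, sample_b):
    """Choose an implementation from observed operand types."""
    if type(sample_a) is int and type(sample_b) is int:
        def guarded(a, b):
            # Guard: the inference was only a guess, so re-check cheaply.
            if type(a) is int and type(b) is int:
                return int_add(a, b)
            return generic_add(a, b)  # fall back when the guess was wrong
        return guarded
    return generic_add
```

    With declared types, neither the observation step nor the guard would be needed; the compiler could call the equivalent of `int_add` directly.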

    JITs also have massive pressure to compile quickly - something that normal compilers don't have. This means that if a compiler optimisation gives you a 2% speedup but takes 10 seconds to perform, a JavaScript JIT won't apply that optimisation. If you compiled ahead of time to bytecode, you might.
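
    The trade-off is simple arithmetic, sketched here in Python under the assumption that a "2% speedup" means the code's running time drops by 2%. The function names are hypothetical.

```python
def break_even_runtime(speedup_fraction, compile_cost_s):
    """Seconds of execution needed before the optimisation pays for itself."""
    if not 0 < speedup_fraction < 1:
        raise ValueError("speedup_fraction must be between 0 and 1")
    return compile_cost_s / speedup_fraction


def worth_applying(speedup_fraction, compile_cost_s, expected_runtime_s):
    """True when the time saved exceeds the time spent optimising."""
    return expected_runtime_s * speedup_fraction > compile_cost_s
```

    For the numbers above, `break_even_runtime(0.02, 10)` comes to 500 seconds: a page script that runs for 30 seconds never recovers the cost if it is paid at load time, whereas a compiler that runs once, ahead of time, spreads it over every page view.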

    Bytecodes require special tools to generate, increasing the barrier of entry.

    Bytecodes can cause an overspamming of programming languages in the market, making web development more complicated, with more to learn unless you want to limit your experience to one specific stack. Basically, your skills might not transfer. There are JS-as-bytecode type languages (Microsoft made one called TypeScript, and I personally use CoffeeScript), but they don't try to be TOO different from JavaScript, probably because of the limitations of making JS a compile target.

    The argument that HTML makes writing for the web easier is a complete lie. When I write a Win32 program, I can choose to learn C# or C++ or VB, and that will take me comfortably all the way through to a pixel-perfect app that works on anyone's machine, be it Dell or Asus or whatever. It'll just work.

    But let's contrast that with the number of languages you need to learn to be good at the web. To write good Win32 programs, I need to learn C++ or C# or VB. To learn the web I need to learn (C# or VB or PHP or Ruby) and (HTML and CSS and Javascript) and (JSON or AJAX) and SQL.

    And even with all of that, it still won't work for free on Firefox and IE and Chrome! You need to do vastly more testing of a minor website than an equivalent Win32 program.

    And let's contrast what happens when you get it wrong. If I screw up a C++ app, I might accidentally leave in a heap-buffer overflow. But DEP+ASLR+Heap cookies are likely to make exploitation of that really hugely hard. On the web, that buffer overflow won't be there, but you'll have SQL-injection, code-injection (via php include/eval etc) just littered about the app. You don't get security for free here because the unification of strings and code makes it impossible to secure for free.
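
    The "strings and code" problem is easy to reproduce. The sketch below uses Python's standard `sqlite3` module with an in-memory database; the table, column and function names are invented for the example. The first lookup builds the query by string concatenation, so user input becomes part of the SQL; the second passes the input as a bound parameter, so it stays data.

```python
import sqlite3


def make_db():
    # Hypothetical schema, in memory only.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "a1"), ("bob", "b2")],
    )
    return conn


def lookup_unsafe(conn, name):
    # Input is spliced into the SQL text: data can turn into code.
    query = "SELECT secret FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def lookup_safe(conn, name):
    # Input is passed separately as a parameter and is never parsed as SQL.
    return conn.execute(
        "SELECT secret FROM users WHERE name = ?", (name,)
    ).fetchall()
```

    Passing `x' OR '1'='1` as the name makes `lookup_unsafe` return every row in the table, while `lookup_safe` returns nothing because no user has that literal name.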

    Bytecodes are harder to inspect, negating one of the things that makes the web so practical and open (Right Click -> View Source). This feature, I think, is how a lot of people learned web development to start with, myself included.

    Designing a language which goes out of its way to make programs written for it easy to plagiarize is baffling. In my mind, it is a major weakness of the web, rather than one of its strengths.

    I learnt C# without much difficulty using online tutorials, books and just plain old-fashioned trial and error. I didn't need to learn by viewing the source of other people's apps. I don't see why web developers are such a special case that they can't learn like a normal person - you know - by not stealing other people's IPR.

    Getting an agreement between browser makers on a bytecode will be impossible, especially given all the above. This is probably the most damning point.

    Getting an agreement between browser manufacturers about anything is impossible, so I reject that argument directly. You want a box-shadow? You'd better be prepared to put in six completely vendor-specific CSS names to get it. How about a linear gradient? Again, you're going to need some -moz-, -webkit- and -o- prefixes there. The list goes on.
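
    For the gradient case, the duplication looks roughly like what this Python helper generates. The helper itself is hypothetical, and it simplifies by passing the same arguments to every variant, whereas the prefixed and unprefixed forms did not always share the same argument syntax.

```python
# Prefixes named in the post, plus the unprefixed form last.
PREFIXES = ("-moz-", "-webkit-", "-o-", "")


def gradient_declarations(args):
    """Return one background-image declaration per vendor prefix."""
    return [
        f"background-image: {prefix}linear-gradient({args});"
        for prefix in PREFIXES
    ]
```

    Calling `gradient_declarations("red, blue")` produces four declarations for what is conceptually one style rule.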


    My contention then is this: The web is not successful because of HTML and Javascript. It is successful despite them.

    If HTTP, HTML and JavaScript had been properly designed, I have no doubt that the experience of the web would be better right now.