Tech Off Thread

14 posts

Standards question: Is sending XHTML as text/html considered harmful?

  • Larsenal

    Someone sent me this link. I'm not much of a standards guy, so I'd appreciate it if you could comment on whether the concerns posted here are legit.

    Sending XHTML as text/html Considered Harmful


  • Sven Groot

    Larsenal wrote:
    Someone sent me this link. I'm not much of a standards guy, so I'd appreciate it if you could comment on whether the concerns posted here are legit.

    Sending XHTML as text/html Considered Harmful
    
    
    If you rigorously test and make sure you are in fact sending valid XHTML then you should mostly be safe.
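    Part of that testing can be automated by running each page through an XML parser before it goes out. The sketch below uses Python's standard-library xml.dom.minidom as a stand-in for a browser's XML parser; the function name is_well_formed is made up for illustration. It only checks well-formedness and says nothing about validity against the XHTML DTD.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError


def is_well_formed(markup: str) -> bool:
    """Return True if the markup parses as XML, False otherwise."""
    try:
        minidom.parseString(markup)
    except ExpatError:
        return False
    return True


xhtml = '<p xmlns="http://www.w3.org/1999/xhtml">line one<br />line two</p>'
tag_soup = "<p>line one<br>line two</p>"

print(is_well_formed(xhtml))     # the self-closed <br /> is acceptable XML
print(is_well_formed(tag_soup))  # the unclosed <br> is a fatal XML error
```

    A page that fails this check would stop rendering in a browser's XHTML mode, even if it looks fine when served as text/html.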

    A few things do change when a browser uses true XHTML mode: the well-formedness constraints, obviously, but also the object model, the CSS rules, and more. If you switch to application/xhtml+xml you will frequently find that the background of your page no longer fills the viewport when the page content is smaller than the viewport, that your scripts stop working (either because they are inside comments in the HTML page or because they use non-namespace-aware DOM methods), and that your <style> tags stop working (again, if their contents are inside comments).

    To get away from the comments problem it's best to put scripts and stylesheets in external files. The way to use them inline in XHTML is to use CDATA sections, but if you do plan on using text/html that won't work. Mozilla and Opera allow you to use both the namespace-aware and non-namespace-aware methods if in HTML mode (only namespace aware if in XHTML mode of course), but IE doesn't have the namespace-aware methods at all. I usually test for their existence, use them if they're there, use the regular version if not.
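    Both effects can be reproduced outside a browser with any XML DOM. The sketch below again uses Python's xml.dom.minidom as a stand-in for a browser's XML parser (an assumption; browsers differ in detail), and element_text is a made-up helper. It shows that a style rule wrapped in a comment disappears from the element's text while one wrapped in CDATA survives, and that only the namespace-aware factory method produces an element in the XHTML namespace.

```python
from xml.dom import Node, minidom

XHTML_NS = "http://www.w3.org/1999/xhtml"


def element_text(element) -> str:
    """Concatenate the text and CDATA children of an element, skipping comments."""
    wanted = (Node.TEXT_NODE, Node.CDATA_SECTION_NODE)
    return "".join(c.data for c in element.childNodes if c.nodeType in wanted)


commented_src = (
    '<style xmlns="http://www.w3.org/1999/xhtml">'
    "<!-- body { color: red } -->"
    "</style>"
)
cdata_src = (
    '<style xmlns="http://www.w3.org/1999/xhtml">'
    "<![CDATA[ body { color: red } ]]>"
    "</style>"
)

commented = minidom.parseString(commented_src).documentElement
cdata = minidom.parseString(cdata_src).documentElement

# The XML parser treats <!-- ... --> as a real comment, so no rule text is left.
print(repr(element_text(commented)))
# The CDATA section is ordinary character data, so the rule survives.
print(repr(element_text(cdata)))

doc = minidom.Document()
plain = doc.createElement("p")                   # not namespace-aware
namespaced = doc.createElementNS(XHTML_NS, "p")  # namespace-aware
print(plain.namespaceURI, namespaced.namespaceURI)
```

    In a script, the same test-then-fall-back idea applies: check whether the namespace-aware method exists and use the plain one only when it does not.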

    Of course, in an ideal world you should never send XHTML as text/html, but always as application/xhtml+xml. In XHTML1.1 it's even against the spec to use text/html. However, with the world's no. 1 browser (IE) not accepting application/xhtml+xml you have no choice if you want to use XHTML at all.

  • Maurits

    If you write your XHTML carefully*, you should be OK.

    * By which I mean, considering IE 6 and other primarily-HTML user-agents

    For example, if you put a space before the / in self-closing tags, HTML user-agents will just think the / is an unrecognized non-valued attribute:

    <br /> -- OK, <br> with an unrecognized "/" attribute

    As opposed to thinking it's part of the tag name
    <br/> -- bad, unrecognized tag name

    There are other details in the XHTML 1.0 spec, in the HTML Compatibility Guidelines section (Appendix C).
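    For existing markup, the space can be added mechanically. The sketch below is a deliberately naive regular-expression pass in Python; the name add_compat_space is made up, and the pattern would also touch any "/>" that happens to sit inside an attribute value, a comment, or a CDATA section, so treat it as an illustration rather than a safe converter.

```python
import re

# Matches "/>" that is not already preceded by whitespace.
_TIGHT_SELF_CLOSE = re.compile(r"(?<!\s)/>")


def add_compat_space(markup: str) -> str:
    """Rewrite <br/> as <br /> (and likewise for other self-closed tags)."""
    return _TIGHT_SELF_CLOSE.sub(" />", markup)


print(add_compat_space('<br/><img src="a.png" alt=""/><hr />'))
```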

  • sbc

    Sven Groot wrote:
    Of course, in an ideal world you should never send XHTML as text/html, but always as application/xhtml+xml. In XHTML1.1 it's even against the spec to use text/html. However, with the world's no. 1 browser (IE) not accepting application/xhtml+xml you have no choice if you want to use XHTML at all.

    It makes things worse that Visual Studio 2005 offers XHTML 1.1 as one of its validation options, especially since IE does not support XHTML at all when it is served properly (i.e. not as text/html).

  • W3bbo

    Sven Groot wrote:
    Of course, in an ideal world you should never send XHTML as text/html, but always as application/xhtml+xml. In XHTML1.1 it's even against the spec to use text/html. However, with the world's no. 1 browser (IE) not accepting application/xhtml+xml you have no choice if you want to use XHTML at all.


    There's a workaround.

    This meta element:

    <meta http-equiv="Content-Type" content="application/xhtml+xml; charset=utf-8" />

    Essentially overrides the HTTP Content-Type header, allowing you to send XHTML1.1 documents as text/html.

    Well, in practice.

    On paper, http-equiv is meant to be read by the web server, which would add those headers to the response, but servers don't do that.

    GFF. :)

  • Sven Groot

    W3bbo wrote:
    There's a workaround.

    This meta element:

    <meta http-equiv="Content-Type" content="application/xhtml+xml; charset=utf-8" />

    It's not a real workaround. No browser I know of that supports application/xhtml+xml (Opera and Mozilla) will actually switch to XML mode when it encounters that. If the server sent text/html, those browsers will use HTML mode and do none of the XHTML-specific things I described above in spite of the http-equiv element.

  • sbc

    Sven Groot wrote:
    W3bbo wrote: There's a workaround.

    This meta element:

    <meta http-equiv="Content-Type" content="application/xhtml+xml; charset=utf-8" />

    It's not a real workaround. No browser I know of that supports application/xhtml+xml (Opera and Mozilla) will actually switch to XML mode when it encounters that. If the server sent text/html, those browsers will use HTML mode and do none of the XHTML-specific things I described above in spite of the http-equiv element.

    I don't see how it would work. All the web server does is send the content; it does not parse it or anything (any processing is done by the developer with PHP, ASP.NET, etc.). For a browser to parse it as XHTML it would have to parse it twice: first to check the content type (with the text/html parser), then to parse the content (with the application/xhtml+xml parser).

  • ShadowChaser

    If you use ASP.NET you can write some code (perhaps in your template's codebehind file) that detects which mime-types the browser supports.

    AFAIK the supported mime-types are sent along with the HTTP request, in the Accept header.

    I check for the xhtml mime-type in the request and send a response type of xhtml if it exists. Otherwise, I fall back to text/html.

    Best of both worlds - supports "legacy" browsers such as IE but you also get the full benefit of XHTML parsing on standards compliant platforms.
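    The check described above can be sketched as follows. This is Python rather than ASP.NET codebehind, and the function names parse_accept and choose_content_type are made up for illustration. One assumption worth stating: a bare */* in the Accept header is not treated as XHTML support, because browsers that cannot handle application/xhtml+xml may still send */*.

```python
XHTML = "application/xhtml+xml"
HTML = "text/html"


def parse_accept(header: str) -> dict:
    """Return {media_type: q} for each media range in an Accept header."""
    prefs = {}
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # q defaults to 1 when the parameter is absent
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q as "not acceptable"
        prefs[media_type] = q
    return prefs


def choose_content_type(accept_header: str) -> str:
    """Pick XHTML only when it is listed explicitly and not ranked below HTML."""
    prefs = parse_accept(accept_header or "")
    xhtml_q = prefs.get(XHTML, 0.0)
    html_q = prefs.get(HTML, 0.0)
    if xhtml_q > 0 and xhtml_q >= html_q:
        return XHTML
    return HTML


print(choose_content_type("application/xhtml+xml,text/html;q=0.9,*/*;q=0.5"))
print(choose_content_type("*/*"))
```

    If the same URL can return either type, the response should also carry a Vary: Accept header so that caches do not hand the XHTML variant to a browser that asked for text/html.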

  • ShadowChaser

    sbc wrote:
    Sven Groot wrote:
    W3bbo wrote: There's a workaround.

    This meta element:

    <meta http-equiv="Content-Type" content="application/xhtml+xml; charset=utf-8" />

    It's not a real workaround. No browser I know of that supports application/xhtml+xml (Opera and Mozilla) will actually switch to XML mode when it encounters that. If the server sent text/html, those browsers will use HTML mode and do none of the XHTML-specific things I described above in spite of the http-equiv element.

    I don't see how it would work. All the web server does is send the content; it does not parse it or anything (any processing is done by the developer with PHP, ASP.NET, etc.). For a browser to parse it as XHTML it would have to parse it twice: first to check the content type (with the text/html parser), then to parse the content (with the application/xhtml+xml parser).


    That's exactly what they do, for the most part. The browser checks the content type of the HTTP response first. If that's set to text/html it falls back on the HTML renderer. The HTML renderer then checks the code and either renders it as pure HTML or determines that the content is XHTML and switches back to that mode.

    Strictly speaking though, a browser shouldn't ever switch into XHTML mode if the content is sent as text/html by the HTTP response. IE behaves like that, but I don't believe other browsers do. IE's XHTML support is just an incomplete "retrofit" on top of the HTML parser. It doesn't natively support it.

    Don't forget, if you want your XHTML to be parsed as HTML by downlevel browsers, you need to make some syntax changes, i.e. write <br /> instead of <br/>.

  • Sven Groot

    ShadowChaser wrote:
    That's exactly what they do, for the most part. The browser checks the content type of the HTTP response first. If that's set to text/html it falls back on the HTML renderer. The HTML renderer then checks the code and either renders it as pure HTML or determines that the content is XHTML and switches back to that mode.

    Uhm, no they don't. If the server sends text/html, they will never switch to XHTML mode, not based on the http-equiv, not based on the doctype, not based on the namespace, never. At least that's the case for Firefox 1.0.7 and older (I haven't tested 1.5 yet) and Opera.

    ShadowChaser wrote:
    Strictly speaking though, a browser shouldn't ever switch into XHTML mode if the content is sent as text/html by the HTTP response. IE behaves like that, but I don't believe other browsers do. IE's XHTML support is just an incomplete "retrofit" on top of the HTML parser. It doesn't natively support it.

    Again wrong. There's no retrofit. IE6 (or IE7 for that matter) doesn't support XHTML at all. It can deal with it because the parser is forgiving enough of HTML syntax errors so that it doesn't choke on the changes made for XHTML. There is nothing in the IE parser that's specifically tailored towards XHTML. That's also why IE7 still won't accept application/xhtml+xml: they want to get XHTML support right, not do just a retrofit, and they don't have time for that in the IE7 timeframe, hence they won't support it yet.

  • sbc

    In answer to the question:
    Yes, if you want to use XHTML 1.1. Just don't use it at all. Use XHTML 1.0 Strict or XHTML 1.0 Transitional for now (as you can send those as text/html).

    Wait until IE supports XHTML 1.1. Even then it will be many years before it is safe to do so (as there will still be many people using IE6). It will probably be 2010 before it is safe.


    That is why there is a big problem with VS 2005. It shouldn't offer XHTML 1.1 at all, since the content will be sent as text/html regardless of the DOCTYPE. I can see big headaches in the future when it is finally supported by IE.

  • blowdart

    sbc wrote:
    That is why there is a big problem with VS 2005. It shouldn't offer XHTML 1.1 at all, since the content will be sent as text/html regardless of the DOCTYPE. I can see big headaches in the future when it is finally supported by IE.


    You can of course tweak it; as a challenge I've been building a site that is XHTML 1.1 and is sent with the correct MIME type to browsers that say they accept it, and as text/html to IE.

  • Sven Groot

    blowdart wrote:
    sbc wrote:
    That is why there is a big problem with VS 2005. It shouldn't offer XHTML 1.1 at all, since the content will be sent as text/html regardless of the DOCTYPE. I can see big headaches in the future when it is finally supported by IE.


    You can of course tweak it; as a challenge I've been building a site that is XHTML 1.1 and is sent with the correct MIME type to browsers that say they accept it, and as text/html to IE.

    I have held off doing that because of something that I consider a bug in Mozilla: it downloads all content referenced on a page that is application/xhtml+xml, even when it's not visible (for instance inside a <noscript> tag when scripting is enabled). This wreaks havoc on counters that use both a script and noscript image, causing them to count double.

    Also, Mozilla (not sure about Opera) cannot incrementally display a page sent as application/xhtml+xml; it has to wait until the page is fully downloaded, so sending a page using that mime-type actually detracts from the user experience.

  • sbc

    blowdart wrote:
    sbc wrote:
    That is why there is a big problem with VS 2005. It shouldn't offer XHTML 1.1 at all, since the content will be sent as text/html regardless of the DOCTYPE. I can see big headaches in the future when it is finally supported by IE.


    You can of course tweak it; as a challenge I've been building a site that is XHTML 1.1 and is sent with the correct MIME type to browsers that say they accept it, and as text/html to IE.

    I still don't think that is a good idea. XHTML (the non-transitional variations) should not be sent as text/html:
    http://www.w3.org/TR/xhtml-media-types/#summary
    It is far easier to use XHTML transitional - you get the xml syntax, plus backwards compatibility, which is important if you do a CMS and want people to style their content.

    I wonder if (proper, compliant) XHTML will ever take off? The more I think about it, the more I doubt it will (partly web developers' fault, partly browser developers'): the lack of progressive rendering will discourage developers from sending it with the proper media type. That holds at least for XHTML 1.x; perhaps XHTML 2 will do better, as there is no backwards compatibility to worry about.
