XHTML and microformats revisited
June 3rd, 2007
Since my previous post on microformats, I have decided that my opinion in this matter needs more evidence.
While I could collect all following information before writing the post, I didn’t have enough motivation to do the research.
But now, after writing it, I have my self-esteem as a motivation.
Ok, so I proposed using (namespaced) custom tags instead of overloading existing ones.
Now let’s go scientific and see what questions this solution may rise.
- Do modern browsers support CSS styling for unknown tags in HTML documents?
- Can these tags be added to document without breaking standard compliance (validity)?
- What possible problems can arise from using non-standard tags in modern browsers?
- …
For practical purposes, these can be converted into two main questions
- Should custom tags work?
- Do custom tags work in modern browsers?
And the answers are:
- By default, no.
- Not perfectly, but yes.
Now let’s discuss it in detail.
To understand the first answer is to understand what exactly is HTML, what is XML and what is XHTML.
The most important (maybe obvious) point is: HTML is not a subset of XML and HTML is not compatible with XML.
HTML and XML are both a subsets of SGML, and SGML does not provide a way to mix different subsets within a single document.
So custom XML tags are not allowed in a HTML document.
While there are some solutions that allow arbitrary XML to be placed in a HTML document.
For example, Microsoft has XML Data Islands.
But they can be considered grammar hacks due to XML-HTML incompatibility.
Practically, however, HTML documents have to be viewed as “tag soup” by the browsers, so custom tags do not cause document rendering to fail.
So, if I am formally out of luck with HTML, what about XHTML?
For simplicity, one can view XHTML is a rewrite of HTML to follow XML rules.
So any custom tags should be allowed in XHTML if they are properly namespaced.
But there are a lot of problems with authoring XHTML.
While some of them are more like challenges (script/style syntax), one is extremely important.
The only way to tell modern browsers that that the document is XHTML is to serve it as application/xhtml+xml
(See this document for an excellent explanation).
And Internet Explorer doesn’t support XHTML at all — so it refuses to render application/xhtml+xml.
(It doesn’t mean IE can’t open XHTML. When XHTML document is sent as text/html, IE renders it with HTML engine).
So I was out of luck once again.
At that point I understood the reasoning of microformats.
Standard compliance is an important part of better Web, and there is no completely valid way to use custom tags.
But what is with the second question? It seems that actual situation is way better than one could suppose.
Firefox, IE7 and Opera 9 all could render the custom tags style correctly in the document served as text/html.
(To be really pedantic, I set DTD and xmlns to XHTML.
After all, even if text/html documents are never parsed as XHTML, MIME Type is a server setting, not document one.)
But IE7 has a one important characteristic — it does not render custom tag styles unless there is an xmlns for their namespace on html tag.
No other tag is sufficient.
What does it mean? It means that while one can make a document that is styled correctly in these IE7,
document part containing custom tags can not be reused without providing a namespace on the aggregating document.
But it not an extremely important point, since for aggreagation one does not actually control styles as well.
So, practically speaking, one can create a document that uses custom XML tags for the cost of formal document validity.
(The document can still be made formally valid by using custom DTD, but this will put IE and FF into quirks mode).
By the way, the challenge of adding custom tags to HTML was faced by MathML (mathematical markup language) community for years.
If you are interested, you can read these discussions:
- Cannot render MathML — netscape.public.mozilla.mathml (2001)
- MathML-in-HTML5 — mozilla.dev.tech.mathml (2006)
Personally, I still see microformats as a step in wrong direction.
While hCard provides HTML with a way to express the vCard semantics, I would prefer it to be just a HTML-compatible way, not the recommended one.
I see HTML as standard that needs support, but not popularized extensions.