This is the first post in a two-post series on a typeswitch implementation in C#.
This one contains a problem statement and possible solutions in other languages.
The second one will contain a Switch.Type description and benchmarks.

Somewhat often I find myself writing code to do something based on runtime type of a value.
A classic case is to filter a tree with different types of nodes (expression tree, for example).

The code often looks like

if (x is A) {
    DoWithA(x as A);
}
else if (x is B) {
    DoWithB(x as B);
}
else
    // ...

Or, if you are a heavy performance freak like me it is like

A a = x as A;
if (a != null) {
    DoWithA(a);
    return;
}

B b = x as B;
if (b != null) {
    DoWithB(b);
    return;
}

// ...

After second type it really starts to smell.

I could have used a Visitor.
But I really dislike it due to the coupling between the Visitor interface and the underlying class hierarchy.
Also requires me to extend hierarchy with a zero value Accept method.

I can also use some kind of hashtable-based smart resolver, but it would be complex and slow.

Actually, that is not an obscure problem and a lot of other languages have their solutions.
There are two common ones:

  1. OO concept known as multiple dispatch.
    Multiple dispatch is just a bunch of “method overloads” resolved by runtime environment basing on the runtime argument types.
    This is quite different from ordinary method overloading — for example, in C# compiler picks an overloaded method during compilation.

    Actually, .Net has a way to do multiple dispatch through Reflection (Type.InvokeMethod), but it quite slow and not compiler-type-safe.

    There is a brilliant paper “Generalized Interfaces for Java” that gives some insight on useful multiple dispatch in Java/C#-like languages.
    Hopefully we’ll get that functionality in C# and CLR sooner or later.

  2. Functional language concept known as pattern matching.
    This is a kind of powerful switch/case statement (with a simplified syntax).
    I do not actually know much about functional languages, so that is my understanding.

The simplest possible construct (I do not want to dive into multiple dispatch) might look like this:

typeswitch (x) { 
   case (A a): DoWithA(a); break; 
   case (B b): DoWithB(b); break; 
   default: throw new ArgumentException(); 
}

And lcs Research C# Compiler has a similar syntax sample:

typeswitch (o) { 
    case Int32 (x): Console.WriteLine(x); break; 
    case Symbol (s): Symbols.Add(s); break; 
    case Segment: popSegment(); break; 
    default: throw new ArgumentException(); 
}

Cω language also had an actual typeswitch construct.
But I was not able to find out it’s syntax (my old VS.Net is somewhy ruined and web is silent on it).
Anyway, Andrey Titov (who I hope will also blog someday) reminded me that it compiled to zero IL (seems it was too experimental).

Stay tuned, next time we’ll see how it is possible to emulate typeswitch in C# 3.0.

Since my previous post on microformats, I have decided that my opinion in this matter needs more evidence.
While I could collect all following information before writing the post, I didn’t have enough motivation to do the research.
But now, after writing it, I have my self-esteem as a motivation.

Ok, so I proposed using (namespaced) custom tags instead of overloading existing ones.
Now let’s go scientific and see what questions this solution may rise.

  1. Do modern browsers support CSS styling for unknown tags in HTML documents?
  2. Can these tags be added to document without breaking standard compliance (validity)?
  3. What possible problems can arise from using non-standard tags in modern browsers?

For practical purposes, these can be converted into two main questions

  1. Should custom tags work?
  2. Do custom tags work in modern browsers?

And the answers are:

  1. By default, no.
  2. Not perfectly, but yes.

Now let’s discuss it in detail.

To understand the first answer is to understand what exactly is HTML, what is XML and what is XHTML.
The most important (maybe obvious) point is: HTML is not a subset of XML and HTML is not compatible with XML.
HTML and XML are both a subsets of SGML, and SGML does not provide a way to mix different subsets within a single document.
So custom XML tags are not allowed in a HTML document.

While there are some solutions that allow arbitrary XML to be placed in a HTML document.
For example, Microsoft has XML Data Islands.
But they can be considered grammar hacks due to XML-HTML incompatibility.

Practically, however, HTML documents have to be viewed as “tag soup” by the browsers, so custom tags do not cause document rendering to fail.

So, if I am formally out of luck with HTML, what about XHTML?
For simplicity, one can view XHTML is a rewrite of HTML to follow XML rules.
So any custom tags should be allowed in XHTML if they are properly namespaced.

But there are a lot of problems with authoring XHTML.
While some of them are more like challenges (script/style syntax), one is extremely important.
The only way to tell modern browsers that that the document is XHTML is to serve it as application/xhtml+xml
(See this document for an excellent explanation).
And Internet Explorer doesn’t support XHTML at all — so it refuses to render application/xhtml+xml.
(It doesn’t mean IE can’t open XHTML. When XHTML document is sent as text/html, IE renders it with HTML engine).
So I was out of luck once again.

At that point I understood the reasoning of microformats.
Standard compliance is an important part of better Web, and there is no completely valid way to use custom tags.

But what is with the second question? It seems that actual situation is way better than one could suppose.
Firefox, IE7 and Opera 9 all could render the custom tags style correctly in the document served as text/html.
(To be really pedantic, I set DTD and xmlns to XHTML.
After all, even if text/html documents are never parsed as XHTML, MIME Type is a server setting, not document one.)
But IE7 has a one important characteristic — it does not render custom tag styles unless there is an xmlns for their namespace on html tag.
No other tag is sufficient.

What does it mean? It means that while one can make a document that is styled correctly in these IE7,
document part containing custom tags can not be reused without providing a namespace on the aggregating document.
But it not an extremely important point, since for aggreagation one does not actually control styles as well.

So, practically speaking, one can create a document that uses custom XML tags for the cost of formal document validity.
(The document can still be made formally valid by using custom DTD, but this will put IE and FF into quirks mode).

By the way, the challenge of adding custom tags to HTML was faced by MathML (mathematical markup language) community for years.
If you are interested, you can read these discussions:

Personally, I still see microformats as a step in wrong direction.
While hCard provides HTML with a way to express the vCard semantics, I would prefer it to be just a HTML-compatible way, not the recommended one.
I see HTML as standard that needs support, but not popularized extensions.