Sins of .NET API Developers

March 25th, 2008

There are several annoying design flaws I often stumble into in .NET APIs.
I haven’t seen Design Guidelines on this matter, so I think I’ll point to 3 of these myself.

  1. The Generic Sin

    Do provide a non-generic overloads to generic utility methods.
    Ayende already wrote about it, so it does not make sense to repeat the reasons.
    Fortunately, this is a very rare issue.

    Violating framework: Castle.MicroKernel (DefaultKernel.ResolveServices<T>).

  2. The Sin of Shallow Digging

    Do support non-public members when performing type/member discovery.
    Do not ever use Assembly.GetExportedTypes unless this logic can easily be overriden.

    For example, I like to make my Domain Service interfaces public, but keep implementations internal.
    This means that if I use InternalsVisibleTo, I can unit-test implementations, but the clients must use interfaces.
    But if I try to define a common generic test base class, and use it like XTest : TestBase<X>, then if X is internal XTest should also be.
    Now neither MbUnit or TestDriven.Net see my test, regardless of TestFixture attribute.

    So if you use attributes to discover members automatically, the only reason to discard a member should be the absense of an attribute.
    Access modifiers should not be considered, if the member does have an attribute.

    Violating frameworks: MbUnit, Castle.Facilities.BatchRegistration.

  3. The Sin of Tivoization

    For each public interface you create, think how user can inject his own implementation.
    (If this interface is directly consumed by API public methods, you can skip this step).

    For example, let’s look at an interface MbUnit.Core.IFixtureFactory.
    It is public, which seems good enough for people who hate internals.
    But it is not possible to provide your implementation without wrapping the whole test-running engine.

    It’s a pity. The framework I am developing right now would really benefit from it.

    Violating frameworks: MbUnit.

Microsoft’s VSTS lacks RowTest and that’s why I will not use it.
But there is one thing about VSTS that is done right.
It is naming.

We use TestFixture so often, but what is it exactly?
Let’s go to wikipedia:

Test fixture refers to the fixed state used as a baseline for running tests in software testing.
The purpose of a test fixture is to ensure that there is a well known and fixed environment in which tests are run so that results are repeatable.

If I go and ask all developers I know, no one would be able to repeat this definition.

And it is why unit testing was really hard for me to understand.
TestClass and TestMethod are practical names that desribe what we want to do.
TestFixture is an academic name that describes what we are supposed to do.
It conveys the feeling that I should read a tome of obscure knowledge to get it.
While I am fine with reading, this time things just seem to be hard.

Take VB’s Shared versus C# static.
How many people you know understand why static keyword has this name?
But we still have to live without useful features just because of the name.

The moral of the story: naming is very important.
It can make things unnecesarily complex or decrease your agility in changing underlying behavior.

Sometimes, large company is no better than several independent companies.
I always get this thought when looking at ASP.NET AJAX property accessors.

For people not familiar with the matter:
Microsoft ASP.NET AJAX requires developer to use property accessors to encapsulate javascript fields.
This would be a great idea, if only Microsoft IE supported javascript getters and setters.
Mozilla supports them, and Opera will in 9.50, but IE does not (they are not standard).
Also, IE does not have any hacky way to achieve the same effect (except in VBScript, which seems to be limited).

So in ASP.NET AJAX each property requires two functions named get_propertyName and set_propertyName.

Fortunately, javascript is very powerful in class member manipulation.
So I do not have to write things like this:

MyControl.prototype = {
    get_text : function() {
        return this._text;
    },

    set_text : function(value) {
        this._text = value;
    }
}

And I do not have to wait for auto properties as it is the case for C#.

Instead I just made an helper object named Auto and do things this way:

MyControl.prototype = {…}
Auto.properties(MyControl.prototype, [
    'text',
    'value',
    …
]);

Code for this helper can be very simple.
(in actual project I have additional complexity like Auto support for INotifyPropertyChanged).

var Auto = {
    property : function(prototype, name) {
        var getter = function() { return this['_' + name]; };
        prototype['get_' + name] = prototype['get_' + name] || getter;

        var setter = function(value) { this['_' + name] = value; };
        prototype['set_' + name] = prototype['set_' + name] || setter;
    },

    properties : function(prototype, names) {
        names.forEach(function(name) { 
            Auto.property(prototype, name);
        });
    }
}

This also requires forEach method on Array, but that one is pretty obvious.

I like to show this code when I hear about Javascript being assembly language and whatever-to-javascript compilation.

NMock 3.5

September 23rd, 2007

One of the places where I am likely to actively use expression trees is testing/mocking.
(I am not the first one to notice this — I saw some post on expression tree asserts, but I lost it).

In my day-to-day development I am using NMock2.
Ayende‘s Rhino Mocks may be more convenient, but I do not like the syntax.
I just can’t read “expect call return” — my mind wants “expect call returns”.
Also, event support in Rhinos is really cryptic.
(I understand that it may be the only stringless way to express it).

So working on C# 3.0 project, I’ve looked into a stringless experience with NMock2.
What I wanted was to pass a method call expression, to change this:

Expect.Once.On(fs)
    .Method("GetDescendantDirectories")
    .With(RootPath)
    .Will(Return.Value(directories));

into this

Expect.Once
    .That(() => fs.GetDescendantDirectories(RootPath))
    .Will(Return.Value(directories));

Fortunately, it was quite easy to do with extension methods.
I have spent more time figuring better fluent syntax than actually writing it.

But the case where expression trees really shine, I think, is in the possibility to do this:

Expect.Exactly(directory.Files.Count)
    .That(() => access.IsPublicFile(Any.String))
    .Will(Return.Value(true));

I can analyze expression tree and find out that Any.String is not “a string”, but “any string”.
It is also easy to imagine Any.String.Matching(“^whatever”) and so on.
I have not tried to implement it, but I feel that it would be simple as well.

Also, while writing the post, I just got an idea — in the world of extenson methods we can do

Expect.Exactly(directory.Files.Count)
    .That(() => access.IsPublicFile(Any.String))
    .Will.Return(true);

without sacrificing extensibility.
“Will” can also be chained if more than one action is needed (however, a delegate would be better for this).

It would be nice to actually add this to NMock2 (as 3.5 branch), but they are using CVS, not SVN.
Also, am I the only one who thinks that SourceForge is slow and has hardly usable (bloated) design?

So if you want the code, you can ask me in comments, but it is primitive indeed.

Some days ago I wrote an overview of typeswitch problem.
Now it’s time to give some solutions.

Overview

Let’s imagine a document inheritance hierarchy:

XsltDocument : XmlDocument : Document
TextDocument : Document

Let’s assume I want to get strings “Xml”, “Xslt”, “Not Xml and not Xslt” based on the document runtime type.
This is primitive indeed, but it does demonstrate a concept.

I call the most useful soultion fluent switch:

string result = Switch.Type(document).
        Case(
            (XsltDocument d) => "Xslt"
        ).
        Case(
            (XmlDocument d) => "Xml"
        ).
        Otherwise(
            d => "Not Xml and not Xslt"
        ).
        Result;

It does contain a lot of visual clutter, but it scales quite well in comparison to if/return approach.
The code behind is simple — each Case checks type and returns itself.
Case is a generic method whose parameters are inferred from the lambda.

For this task, case bodies do not depend on actual object contents.
So they can be expressed cleaner:

string result = Switch.Type(document).To<string>().
        Case<XsltDocument>("Xslt").
        Case<XmlDocument>("Xml").
        Otherwise("Not Xml and not Xslt").
        Result;

This time I have to specify the result type explicitly.

The fluent switch syntax is quite powerful — you can even add cases dynamically.
This is a nice difference from conventional language constructs.

Alternatives

I have tried a number of alternatives, but no one of them did better.

  1. Many overloads switch
    string result = Switch.Type(
        document, 
            (XsltDocument d) => "Xslt",
            (XmlDocument d) => "Xml",
            d => "Not Xml and not Xslt"
    );

    This syntax is quite concise and understandable.
    But it requires an additional overload for each additional case.
    So it is quite impractical.

  2. Object initializer switch
    string result = new TypeSwitch<Document, string>(document) {
        (XsltDocument d) => "Xslt",
        (XmlDocument d) => "Xml",
        d => "Not Xml and not Xslt"
    };

    Also more concise than my original solution, but much more cryptic.
    Constructor and generic parameters also add a degree of confusion.

    The most interesting thing about this syntax was that it actually worked.
    It seems object initializers have some nice fluent power.

Compilation

After running some benchmarks, I found that fluent switch is about 200 times slower than hardcoded ifs.
It may be perfectly acceptable, of course.
However, I have found a way to precompile the switch using expression trees.

From the usage perspective, precompiled switch is just a Func<T, TResult> (it does not support Actions right now).
So you can cache it in

private static readonly Func<Document, string> CompiledFluentLambdaSwitch = Switch.Type<Document>().To<string>().
    Case(
        (XsltDocument d) => "Xslt"
    ).
    Case(
        (XmlDocument d) => "Xml"
    ).
    Otherwise(
        d => "Not Xml and not Xslt"
    ).
    Compile();

which is extremely similar to the first code sample.
The differences are that you do not specify what you are switching on (it would be a function parameter).
But you do explicitly specify from/to types.

The compilation process was fun to write, since it was the first time I dug into expressions trees.
Statements are not supported in trees, so I had to use embedded ConditionalExpressions for cases.
The resulting tree is something like

d => (d is XsltDocument)
             ? ((cast => "Xslt")(d as XsltDocument)) 
             : ((d is XmlDocument) 
                   ? ((...

I have not found a way to cache cast and null-check it, so I cast/typecheck it two times.

Benchmarks

The best thing about compilation is performance:

Benchmark: 1000000 iterations, two switch calls per iteration.

Benchmark overhead:            40.1ms   30.0ms   30.0ms   30.0ms |    32.5ms
Direct cast:                   80.1ms   60.1ms   80.1ms   60.1ms |    70.1ms
Fluent switch
    on lambdas:              1512.2ms 1502.2ms 1602.3ms 1482.1ms |  1524.7ms
    on lambdas (compiled):     90.1ms  110.2ms   80.1ms   80.1ms |    90.1ms
    on constants:            1281.8ms 1271.8ms 1311.9ms 1271.8ms |  1284.3ms
    on constants (compiled):   80.1ms   90.1ms   90.1ms   70.1ms |    82.6ms
Many overloads switch:        440.6ms  390.6ms  430.6ms  420.6ms |   420.6ms
Object initializer switch:    751.1ms  681.0ms  741.1ms  751.1ms |   731.1ms

As you can see, precompiled switch is nearly as performant as hardcoded one (direct cast).
I am quite impressed by simplicity/power ratio of the expression trees.

Code

I uploaded AshMind.Constructs to Google Code.
I see it as a learning/research project, but you can put it to any practical use.

This is the first post in a two-post series on a typeswitch implementation in C#.
This one contains a problem statement and possible solutions in other languages.
The second one will contain a Switch.Type description and benchmarks.

Somewhat often I find myself writing code to do something based on runtime type of a value.
A classic case is to filter a tree with different types of nodes (expression tree, for example).

The code often looks like

if (x is A) {
    DoWithA(x as A);
}
else if (x is B) {
    DoWithB(x as B);
}
else
    // ...

Or, if you are a heavy performance freak like me it is like

A a = x as A;
if (a != null) {
    DoWithA(a);
    return;
}

B b = x as B;
if (b != null) {
    DoWithB(b);
    return;
}

// ...

After second type it really starts to smell.

I could have used a Visitor.
But I really dislike it due to the coupling between the Visitor interface and the underlying class hierarchy.
Also requires me to extend hierarchy with a zero value Accept method.

I can also use some kind of hashtable-based smart resolver, but it would be complex and slow.

Actually, that is not an obscure problem and a lot of other languages have their solutions.
There are two common ones:

  1. OO concept known as multiple dispatch.
    Multiple dispatch is just a bunch of “method overloads” resolved by runtime environment basing on the runtime argument types.
    This is quite different from ordinary method overloading — for example, in C# compiler picks an overloaded method during compilation.

    Actually, .Net has a way to do multiple dispatch through Reflection (Type.InvokeMethod), but it quite slow and not compiler-type-safe.

    There is a brilliant paper “Generalized Interfaces for Java” that gives some insight on useful multiple dispatch in Java/C#-like languages.
    Hopefully we’ll get that functionality in C# and CLR sooner or later.

  2. Functional language concept known as pattern matching.
    This is a kind of powerful switch/case statement (with a simplified syntax).
    I do not actually know much about functional languages, so that is my understanding.

The simplest possible construct (I do not want to dive into multiple dispatch) might look like this:

typeswitch (x) { 
   case (A a): DoWithA(a); break; 
   case (B b): DoWithB(b); break; 
   default: throw new ArgumentException(); 
}

And lcs Research C# Compiler has a similar syntax sample:

typeswitch (o) { 
    case Int32 (x): Console.WriteLine(x); break; 
    case Symbol (s): Symbols.Add(s); break; 
    case Segment: popSegment(); break; 
    default: throw new ArgumentException(); 
}

Cω language also had an actual typeswitch construct.
But I was not able to find out it’s syntax (my old VS.Net is somewhy ruined and web is silent on it).
Anyway, Andrey Titov (who I hope will also blog someday) reminded me that it compiled to zero IL (seems it was too experimental).

Stay tuned, next time we’ll see how it is possible to emulate typeswitch in C# 3.0.

Everything in this article is tested with Visual Studio 2008 Beta 2 and may become obsolete.

While a lot of people blog about expression trees in new C#, I haven’t seen any post about things you can not do with them. Maybe it is common knowledge, but since I stumbled in it myself, I’ll share.

Basically, C# specification says:

Not all anonymous functions can be represented as expression trees. For instance, anonymous functions with statement bodies, and anonymous functions containing assignment expressions cannot be represented. In these cases, a conversion still exists, but will fail at compile time.

So, that’s what you can not do according to the specification:

Expression<…> y = x => { DoAnything() };
// error CS0834: A lambda expression with a statement body cannot be converted to an expression tree

int z = 3;
Expression<…> y = () => z = z + 5;
// error CS0832: An expression tree may not contain an assignment operator.

To find out other limitations, I’ve looked in Microsoft.NET\Framework\v3.5\1033\cscompui.dll file (that contains all string resources (errors/warnings) for csc compiler) and did a search on “expression tree”.

So there is a summary table of all compiler errors on expression trees with sample code for each case:

Error code Error message Sample code
CS???? Partial methods with only a defining declaration or removed conditional methods cannot be used in expression trees. I see no way to use partial in expression tree.
Partial methods always return void, so they can be used only as a statement and not in a lambda with expression body.
CS0831 An expression tree may not contain a base access.
Expression<Func<string>> y = () => base.ToString();
CS0832 An expression tree may not contain an assignment operator. See above.
CS0834 A lambda expression with a statement body cannot be converted to an expression tree. See above.
CS0838 An expression tree may not contain a multidimensional array initializer.
Expression<Func<string[,]>> y = 
    () => new string[,] { { "A", "A"} };
CS0838 An expression tree may not contain an unsafe pointer operation. No sample, I am not very friendly with unsafe syntax.
CS1945 An expression tree may not contain an anonymous method expression.
Expression<Func<Func<string>>> y = () => delegate { return "hi"; };
// [NB] This woks just fine:
Expression<Func<Func<string>>> y = () => () => "hi";
CS1952 An expression tree lambda may not contain a method with variable arguments. This one was tricky:

public string ReturnHi(__arglist)
{
    return "hi";
}

public void Test()
{
    Expression<Func<string>> y = 
        () => this.ReturnHi(__arglist("stub1", "stub2"));
}

There are several error messages other than the first one in a table that I can not produce at all.

Error message Comment
An expression tree lambda may not contain an out or ref parameter. Lambdas indeed can not capture out and ref parameters, but the error in such case is
CS1628: Cannot use ref or out parameter ‘…’ inside an anonymous method, lambda expression, or query expression.
I could not create any kind of lambda for the delegate type with out or ref parameter, not only expression tree.
An expression tree lambda may not contain a member group. No problems creating lambdas that do any member group resolution. I have no idea how to put unresolved member group anywhere without getting it resolved.

Evaluating Javascript in WatiN

September 5th, 2007

The WatiN framework is quite cool, but it lacks two important things.
First one is searching by CSS selectors, or, at least, classes.
Find.ByCustom(“className”, “X”) is way too ugly. Or am I missing something?

The second (more important) one is a weak access to Javascript.
First thing I wanted to do with WatiN was to change something and then check some script state.
And getting some values from script was not obvious.

I didn’t want to use Ayende’s evil hack (no harm intentended, it gets the work done) — putting javascript state into DOM is not pretty and too string oriented.
I thought that browser COM interfaces should definitely have a way to get Javascript objects outside, that is just the way MS/COM people think.
Thanks to Jeff Brown’s comment for explaining last obstacles.

So here goes the code.
It is quite basic, but it allows you to get value of any Javascript evaluation.
As you can see, I hadn’t included any error handling, I had no time to look into it.

    public static class JS {
        public static object Eval(Document document, string code) {
            IExpando window = JS.GetWindow(document);
            PropertyInfo property = JS.GetOrCreateProperty(window, "__lastEvalResult");

            document.RunScript("window.__lastEvalResult = " + code + ";");

            return property.GetValue(window, null);
        }

        private static PropertyInfo GetOrCreateProperty(IExpando expando, string name) {
            PropertyInfo property = expando.GetProperty(name, BindingFlags.Instance);
            if (property == null)
                property = expando.AddProperty(name);

            return property;
        }

        private static IExpando GetWindow(Document document) {
            return document.HtmlDocument.parentWindow as IExpando;
        }
    }

Nitpicking:
By the way, Ayende, getting permalinks to comments in your blog is not obvious (I used View Source).

Recently I got an optimization problem in ASP.Net.
To be short, I had a Repeater with custom (somewhat complex) template on my Page, and I wanted to reload it asynchronously.

The first solution was XP and didin’t consider performance at all: wrap Repeater inside an UpdatePanel.
The problem was that the entire Page had to be repopulated on server just to get to the Repeater.

That gave me a choice of two headaches:

  1. Put all Page/Controls data into ViewState and bloat bandwidth.
  2. Query all additional data on the reload request and increase load on database to get data that will be thrown away.

To be honest, I could solve (2) with server-side cache, but, in my opinion, caching does not make ugly solutions any better, just faster.

So, naturally, my thought was to query the data-only WebService and then populate the Repeater on client.

And it was interesting to find out that Microsoft already has a client-side data binding solution within ASP.Net AJAX Futures.
I have found an excellent article on this matter by Xianzhong Zhu, “Unveil the Data Binding Architecture inside Microsoft ASP.NET Ajax 1.0″ (Part 1, Part 2).

I will now give a quick summary on the overall client-side binding architecture.
In essence it is quite similar to the smart DataSource controls of ASP.Net 2.0:
There is a DataSource javascript component and a ListView javascript control with html template.
ListView passes data from/to DataSource control, and DataSource talks with a JSON Web Service as a backend.
Controls and their relations are described in text/xml-script (Futures-only feature).

Everything seems quite straightforward and easy to use, I was quite happy to find it.
One thing that bothers me is the performance of text/xml-script (it is parsed on client).
But it is a concern not related to the current story.
The other question is what to do when I want to databind a complex list (consisting of several embedded server user controls) ?
I am going to find it out real soon.

Along the way, I have also noticed Sys.Preview.Data also introduces DataSets/DataTables to javascript.
That is quite funny. Personally, I never really considered DataSets acceptable anywhere above Persistence layer.
But I already thought about Persistence/DataAccess concept in javascript when I saw Gears.
And DataSets seem to fit ‘nicely’ to some GoogleGearsDataSource (it would be quite an experience to actually see one in real code).

Well, javascript O/R Mapper, anyone ?

As a side project, I am creating a new Project Type for Visual Studio using Managed Package Framework from VS 2005 SDK.
I have read an excellent post on the matter, so mostly it was a piece of cake.
But I had one problem: Build was not available as a menu item and Build Selection was greyed out when I selected my custom project.

Adding Target Name=”Build” to the project template and specifying it in DefaultTargets did not help.

After some experiments I have found out a minimal *proj file that has Build menu item available (if you have already inherited the MPF ProjectNode).
It is quite interesting:

<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <PropertyGroup Condition=" '$(Configuration)' == 'Debug' " />
</Project>

It seems that the Configuration comparison gets parsed into the Configuration values for this project.
I am not sure whether Visual Studio or MPF does this.

Of course, if you actually want this menu item to work, you will have to add a default target, but that’s quite easy.