U2U Consult TechDays 2011 CD Available for Download

At TechDays 2011 in Antwerp, U2U Consult distributed a CD-ROM with two free tools. I’m happy to announce that the CD-ROM contents are now also available for download from our web site.

The U2U Consult SQL Database Analyzer is a tool for SQL Server database administrators and developers. It displays diagnostic information about a database and its hosting instance that is hard to collect when you only use the standard SQL Server tools. Just point the tool at a SQL Server database of your choice, and have a look at the reports generated by the tool.

The U2U Consult Code Analysis Rules for Visual Studio 2010 are a series of additional code analysis rules for Visual Studio 2010 Premium or Ultimate, and two rule sets with recommended rules for libraries and applications. The rules include additional general performance and design rules, as well as a series of rules specifically for WCF. All rules are documented on the CD-ROM. Obviously, they are applicable to all .NET languages, including C# and VB.

With this CD, for the first time we make two of our own tools available to you. These are only two small components out of the U2U Consult Framework, but we hope they are as useful to you as they are to us and our clients. Enjoy.

Farewell Visitor

The Visitor design pattern was first documented in 1995 by the Gang of Four. It’s a workaround for the fact that most strongly typed object-oriented languages only support single dispatch, even when double dispatch is sometimes required. With C# 4, we no longer need this workaround. We now have something better; more on that below. Let’s look at an example.

The traditional Visitor pattern

The System.Linq.Expressions namespace contains types that enable code expressions to be represented as objects in the form of expression trees. For example, the following C# statement creates an expression tree:

Expression<Func<double, double>> f = x => Math.Sin(1 + 2 * x);

 

There are many kinds of expressions. The above example creates several objects of different types, including ParameterExpression, ConstantExpression, BinaryExpression and MethodCallExpression, all of which inherit from Expression.

There are many ways to represent expressions as text. For example, we can use an infix notation (as most programming languages do), but we can also use prefix or postfix notation. Anyone who has ever worked with an HP scientific calculator, or a programming language such as Forth, will appreciate postfix notation, also known as Reverse Polish notation.

As a result, the way to translate a particular expression into a text representation depends on two things: the kind of expression and the kind of notation. More precisely, the method to execute to translate an expression object into a string object depends on the type of the expression, and on the type of the translation algorithm. Using a virtual function would allow the system to choose a method based on one of these dimensions, e.g. the type of expression, but not on both. Virtual functions provide single dispatch, but we need double dispatch.
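To make the limitation concrete, here is a minimal sketch using hypothetical types (Node, Leaf and Printer are illustrative names, not part of System.Linq.Expressions). The virtual call picks its target at run time from the actual object, while the overload is picked at compile time from the static type of the argument.

```csharp
using System;

// Hypothetical types, for illustration only.
public abstract class Node
{
    public virtual string Kind() { return "Node"; }
}

public sealed class Leaf : Node
{
    public override string Kind() { return "Leaf"; }
}

public static class Printer
{
    public static string Print(Node node) { return "Print(Node)"; }
    public static string Print(Leaf leaf) { return "Print(Leaf)"; }
}

public static class SingleDispatchDemo
{
    public static void Main()
    {
        Node node = new Leaf();

        // Virtual call: resolved at run time from the actual type (Leaf).
        Console.WriteLine(node.Kind());         // prints "Leaf"

        // Overload: resolved at compile time from the static type (Node).
        Console.WriteLine(Printer.Print(node)); // prints "Print(Node)"
    }
}
```

Only one of the two choices is made at run time, which is exactly the gap the Visitor pattern fills.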

In fact, the Expression class has a virtual ToString() method, inherited from object. Every type inheriting from Expression has its own version, making the algorithm depend on the type of expression. But it’s a hardcoded implementation, using an infix notation. What if we want a postfix ToString? Or prefix? Or a C# or F# syntax? This is where the Visitor pattern can help us, and luckily the Expression class has support for it. The class ExpressionVisitor is an abstract base class for algorithms working on expressions. Now when I say algorithms, you probably think of methods with parameters and return values, but that’s not how a Visitor works. A Visitor is an object of some class, and parameters must be passed in, typically via the constructor. The return value, i.e. the result of the algorithm, must be read back from a property. Let’s create a base class for our ToString visitors:

public abstract class ToStringVisitor : ExpressionVisitor
{
    protected readonly StringBuilder resultAccumulator = new StringBuilder();

    public string Result
    {
        get { return resultAccumulator.ToString(); }
    }
}

This provides us with a base class for Visitors that have a string Result property. There are some issues with it, such as when to Clear() the StringBuilder when the Visitor is reused, but let’s not get into those. We can now create a ToPostFixStringVisitor, and encapsulate it behind a static method:

public static class ExpressionExtensions
{
    public static string ToPostfixString(this Expression<Func<double, double>> function)
    {
        var visitor = new ToPostFixStringVisitor();

        visitor.Visit(function);

        return visitor.Result;
    }

    private class ToPostFixStringVisitor : ToStringVisitor
    {
        protected override Expression VisitLambda<T>(Expression<T> node)
        {
            // enables reusing the visitor – not absolutely required here as the only
            // place where an instance can be created is in the ToPostfixString method.
            this.resultAccumulator.Clear();

            foreach (var parameter in node.Parameters)
            {
                this.Visit(parameter);
            }

            this.resultAccumulator.Append("-> ");

            this.Visit(node.Body);

            return node;
        }

        protected override Expression VisitParameter(ParameterExpression node)
        {
            this.resultAccumulator.Append(node.Name);
            this.resultAccumulator.Append(' ');

            return node;
        }

        protected override Expression VisitBinary(BinaryExpression node)
        {
            this.Visit(node.Left);
            this.Visit(node.Right);

            switch (node.NodeType)
            {
                case ExpressionType.Add:
                case ExpressionType.AddChecked:
                    this.resultAccumulator.Append('+');
                    break;
                case ExpressionType.Multiply:
                case ExpressionType.MultiplyChecked:
                    this.resultAccumulator.Append('*');
                    break;
                case ExpressionType.Subtract:
                case ExpressionType.SubtractChecked:
                    this.resultAccumulator.Append('-');
                    break;
                case ExpressionType.Divide:
                    this.resultAccumulator.Append('/');
                    break;
                case ExpressionType.Modulo:
                    this.resultAccumulator.Append('%');
                    break;
                default:
                    throw new NotSupportedException();
            }

            this.resultAccumulator.Append(' ');

            return node;
        }

        protected override Expression VisitMethodCall(MethodCallExpression node)
        {
            foreach (var arg in node.Arguments)
            {
                this.Visit(arg);
            }

            this.resultAccumulator.Append(node.Method.Name);
            this.resultAccumulator.Append(' ');

            return node;
        }

        protected override Expression VisitConstant(ConstantExpression node)
        {
            this.resultAccumulator.Append(node.Value);

            this.resultAccumulator.Append(' ');

            return node;
        }
    }
}

 

For example, the following line outputs x -> 1 2 x * + Sin:

Console.WriteLine(ExpressionExtensions.ToPostfixString(x => Math.Sin(1 + 2 * x)));

 

It works, even though for the sake of example it only supports a very small subset of all expressions. At least in the case of binary operators, it throws a NotSupportedException for operators that are, well, not supported. I really should add a bunch of other methods, for example:


       
        protected override Expression VisitConditional(ConditionalExpression node)
        {
            throw new NotSupportedException();
        }

        protected override Expression VisitBlock(BlockExpression node)
        {
            throw new NotSupportedException();
        }

 

Anyway, how does it work? The Visit() method calls an internal virtual method on Expression called Accept. Being virtual, this selects the kind of expression to work on, and it calls the appropriate VisitX method in the visitor. This one is virtual as well, and it selects the correct algorithm, our ToPostFixStringVisitor in this case.

So we have double dispatch, via a combination of two single dispatch calls.
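The two hops are easier to see in a hand-rolled visitor. The sketch below uses a hypothetical mini hierarchy (Expr, Number, Sum, IVisitor and PostfixVisitor are invented for illustration and are not the System.Linq.Expressions types): Accept is the first virtual call and selects the node type, and the VisitX call through the interface is the second and selects the algorithm.

```csharp
using System;
using System.Text;

// Hypothetical mini expression hierarchy, for illustration only.
public interface IVisitor
{
    void VisitNumber(Number number);
    void VisitSum(Sum sum);
}

public abstract class Expr
{
    // First dispatch: the virtual call selects the concrete node type.
    public abstract void Accept(IVisitor visitor);
}

public sealed class Number : Expr
{
    public Number(int value) { Value = value; }
    public int Value { get; private set; }

    // Second dispatch: the interface call selects the algorithm.
    public override void Accept(IVisitor visitor) { visitor.VisitNumber(this); }
}

public sealed class Sum : Expr
{
    public Sum(Expr left, Expr right) { Left = left; Right = right; }
    public Expr Left { get; private set; }
    public Expr Right { get; private set; }
    public override void Accept(IVisitor visitor) { visitor.VisitSum(this); }
}

public sealed class PostfixVisitor : IVisitor
{
    private readonly StringBuilder text = new StringBuilder();

    public string Result { get { return text.ToString(); } }

    public void VisitNumber(Number number) { text.Append(number.Value).Append(' '); }

    public void VisitSum(Sum sum)
    {
        sum.Left.Accept(this);
        sum.Right.Accept(this);
        text.Append("+ ");
    }
}

public static class VisitorDemo
{
    public static void Main()
    {
        var visitor = new PostfixVisitor();
        new Sum(new Number(1), new Number(2)).Accept(visitor);
        Console.WriteLine(visitor.Result); // prints "1 2 + "
    }
}
```

Note that every node class has to cooperate by implementing Accept, which is the main cost of the pattern.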

Dynamic dispatch to the rescue

As of C# 4, we are no longer restricted to single dispatch: we now have dynamic dispatch. Let’s see what this example looks like using dynamic:

public static class ExpressionExtensions
{
    public static string ToPostfixString(this Expression<Func<double, double>> function)
    {
        StringBuilder resultAccumulator = new StringBuilder();

        Visit(function, resultAccumulator);

        return resultAccumulator.ToString();
    }

    private static void Visit(Expression expression, StringBuilder resultAccumulator)
    {
        dynamic dynamicExpression = expression;

        VisitCore(dynamicExpression, resultAccumulator);
    }

    private static void VisitCore(LambdaExpression node, StringBuilder resultAccumulator)
    {
        foreach (var parameter in node.Parameters)
        {
            Visit(parameter, resultAccumulator);
        }

        resultAccumulator.Append("-> ");

        Visit(node.Body, resultAccumulator);
    }

    private static void VisitCore(ParameterExpression node, StringBuilder resultAccumulator)
    {
        resultAccumulator.Append(node.Name);
        resultAccumulator.Append(' ');
    }

    private static void VisitCore(BinaryExpression node, StringBuilder resultAccumulator)
    {
        Visit(node.Left, resultAccumulator);
        Visit(node.Right, resultAccumulator);

        switch (node.NodeType)
        {
            case ExpressionType.Add:
            case ExpressionType.AddChecked:
                resultAccumulator.Append('+');
                break;
            case ExpressionType.Multiply:
            case ExpressionType.MultiplyChecked:
                resultAccumulator.Append('*');
                break;
            case ExpressionType.Subtract:
            case ExpressionType.SubtractChecked:
                resultAccumulator.Append('-');
                break;
            case ExpressionType.Divide:
                resultAccumulator.Append('/');
                break;
            case ExpressionType.Modulo:
                resultAccumulator.Append('%');
                break;
            default:
                throw new NotSupportedException();
        }

        resultAccumulator.Append(' ');
    }

    private static void VisitCore(MethodCallExpression node, StringBuilder resultAccumulator)
    {
        foreach (var arg in node.Arguments)
        {
            Visit(arg, resultAccumulator);
        }

        resultAccumulator.Append(node.Method.Name);
        resultAccumulator.Append(' ');
    }

    private static void VisitCore(ConstantExpression node, StringBuilder resultAccumulator)
    {
        resultAccumulator.Append(node.Value);
        resultAccumulator.Append(' ');
    }

    private static void VisitCore(Expression node, StringBuilder resultAccumulator)
    {
        throw new NotSupportedException();
    }
}

 

The dynamic dispatch is achieved by the Visit method. To learn more about how this works, see http://blogs.msdn.com/b/samng/archive/2008/11/06/dynamic-in-c-iii-a-slight-twist.aspx.

So how is this better?

First of all, this approach works with all classes. The Expression class does have visitor support baked in, but most classes don’t. The dynamic approach also works if the target classes don’t support the visitor pattern. That also means that you don’t have to do anything special with your own classes to enable this technique.

The dynamic approach is also simpler. I’ve noticed that most people don’t immediately “see” the visitor pattern, but the dynamic approach is easier to understand.

Visitor implementations typically have methods that return void, and producing a result must be accomplished via fields and properties (the ExpressionVisitor is a notable exception here, it is optimized for rewriting expressions, i.e. calculating a new expression based on an existing one). With dynamic methods, you choose your parameter and return types (the StringBuilder in this example). Not only is that much simpler to code, the entire thing has no state in fields, only in stack local variables. As a result, it’s completely reentrant and thread-safe.

Note also that the last VisitCore method specifies what to do with expressions that aren’t handled by any of the other methods. Much more convenient than with a Visitor, where you always have to specify a method for each concrete type, unless it just so happens that the behavior you want is the default behavior from the base class.
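The same shape works on classes that know nothing about visitors. Below is a minimal sketch with hypothetical types (Animal, Dog, Cat, Fish and Describer are invented for illustration); it assumes C# 4 or later and a reference to the Microsoft.CSharp assembly, which the dynamic keyword requires. The overload taking the base type plays the role of the catch-all.

```csharp
using System;

// Hypothetical classes without any visitor support, for illustration only.
public class Animal { }
public class Dog : Animal { }
public class Cat : Animal { }
public class Fish : Animal { }

public static class Describer
{
    public static string Describe(Animal animal)
    {
        // Casting to dynamic defers overload resolution to run time,
        // so the overload is chosen from the actual type of the argument.
        return DescribeCore((dynamic)animal);
    }

    private static string DescribeCore(Dog dog) { return "dog"; }

    private static string DescribeCore(Cat cat) { return "cat"; }

    // Catch-all: chosen for any Animal without a more specific overload.
    private static string DescribeCore(Animal other)
    {
        return "unsupported: " + other.GetType().Name;
    }
}

public static class DynamicDispatchDemo
{
    public static void Main()
    {
        Console.WriteLine(Describer.Describe(new Dog()));  // prints "dog"
        Console.WriteLine(Describer.Describe(new Cat()));  // prints "cat"
        Console.WriteLine(Describer.Describe(new Fish())); // prints "unsupported: Fish"
    }
}
```

Adding support for a new type is a matter of adding one more overload; neither the hierarchy nor the entry point changes.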

Conclusion

The dynamic keyword was introduced to facilitate interoperability with dynamic languages and systems, including COM. C# remains primarily a strongly typed language. As such, some people suggested that dynamic has no place in plain C# programs that don’t require such interoperability. However, the above example shows that dynamic dispatch can be very useful in the context of strongly typed C# programs, as an alternative to the Visitor pattern.

Lambda Curry in F#

Bart De Smet commented on my post about Lambda Curry in C#, saying (amongst other things) that F# supports currying out of the box.

That’s true, and it’s a nice feature of the language. However, it is a mechanical operation, almost identical to what the following C# extension method does:

public static class FunctionalExtensions
{
    public static Func<T2, TResult> Curry<T1, T2, TResult>(this Func<T1, T2, TResult> func, T1 value)
    {
        return value2 => func(value, value2);
    }
}

The important point to note is that F# does not perform partial evaluation automatically, which is where in my mind most of the benefit comes from.
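A short usage sketch of the Curry extension method may help here. The method is repeated so the example is self-contained; CurryDemo and its Calls counter are hypothetical additions for illustration. The counter shows that the original two-parameter function still runs in full on every call, so nothing is pre-computed.

```csharp
using System;

public static class FunctionalExtensions
{
    public static Func<T2, TResult> Curry<T1, T2, TResult>(this Func<T1, T2, TResult> func, T1 value)
    {
        return value2 => func(value, value2);
    }
}

public static class CurryDemo
{
    // Hypothetical counter: how often the original function body executed.
    public static int Calls;

    public static void Main()
    {
        Func<int, int, int> add = (a, b) => { Calls++; return a + b; };

        // Fixes the first argument; no part of 'add' is evaluated yet.
        Func<int, int> addFive = add.Curry(5);

        Console.WriteLine(addFive(3));  // prints 8
        Console.WriteLine(addFive(10)); // prints 15
        Console.WriteLine(Calls);       // prints 2
    }
}
```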

To illustrate, consider the following function definition in F#:

open System

let compute x y = Math.Sin(float x) * Math.Sin(float y)

This is exactly the same as the following, illustrating the automatic currying:

let compute x = fun y -> Math.Sin(float x) * Math.Sin(float y)

And when I say “exactly the same”, I do mean just that: they compile to the exact same IL.

If you want the partial evaluation, and the performance benefit of it, you’ll have to do it manually, also in F#:

let compute' x =
    let sinx = Math.Sin(float x)
    fun y -> sinx * Math.Sin(float y)

To illustrate, consider the following program, which is more or less analogous to my previous example:

open System
open System.Diagnostics

let compute x y = Math.Sin(float x) * Math.Sin(float y)

let compute' x =
    let sinx = Math.Sin(float x)
    fun y -> sinx * Math.Sin(float y)

let sum f =
    let mutable sum = 0.0
    for x = -1000 to 1000 do
        let f' = f x
        for y = -1000 to 1000 do
            sum <- sum + f' y
    sum

let measureTime f =
    let sw = Stopwatch.StartNew()
    let _ = sum f
    sw.ElapsedMilliseconds

printfn "%d" (measureTime compute)
printfn "%d" (measureTime compute')

On the machine I’m testing this on, it prints 329 milliseconds for the compute function, and 137 for the compute' function.

To be honest, this should not come as a surprise. Even if F# wanted to perform a partial evaluation, how could it? It does not know that Math.Sin is a pure function. So it has no choice but to play safe. It does what the developer tells it to do. So if you want partial evaluation, do it yourself, explicitly, no matter what language you’re using.

Safe disposal of WCF proxies

The issue has been known for a long time: you cannot safely dispose a WCF proxy that inherits from ClientBase<T> with a using statement, because the Dispose method may throw. For more information, see http://www.google.com/search?q=wcf+proxy+dispose.
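Why a throwing Dispose is such a problem can be shown without WCF at all. The sketch below uses a hypothetical FaultyResource class as a stand-in for a faulted proxy (it is not a WCF type): when the body of the using statement fails and Dispose fails as well, the exception from Dispose replaces the one you actually care about.

```csharp
using System;

// Hypothetical stand-in for a faulted proxy; not a WCF type.
public sealed class FaultyResource : IDisposable
{
    public void Dispose()
    {
        throw new InvalidOperationException("Dispose failed");
    }
}

public static class UsingDemo
{
    public static string Run()
    {
        try
        {
            using (new FaultyResource())
            {
                throw new TimeoutException("The real problem");
            }
        }
        catch (Exception ex)
        {
            // The exception thrown by Dispose has replaced the original one.
            return ex.Message;
        }
    }

    public static void Main()
    {
        Console.WriteLine(Run()); // prints "Dispose failed"
    }
}
```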

However, I’ve always found most solutions out there to be too complex, too cumbersome, or both. Some of them even seek to replace the Visual Studio (or svcutil.exe) generated proxy by handwritten code, in which case you lose the convenience of the “Add Service Reference” dialog and the automatic configuration file generation. I wanted a simple but effective solution that does not replace the Visual Studio code generation, and is easy to use at the same time.

I came up with a simple extension method, that you use as follows:

var proxy = new SomeServiceClient();
using (proxy.SafeDisposer())
{
    // use the proxy here
}

Here’s the code:

namespace U2UConsult.ServiceModel
{
    using System;
    using System.Diagnostics;
    using System.ServiceModel;

    public static class ClientBaseExtensions
    {
        private static readonly TraceSource traceSource = new TraceSource("U2UConsult.ServiceModel");

        public static IDisposable SafeDisposer<T>(this ClientBase<T> proxy) where T : class
        {
            return new SafeDisposerProxy<T>(proxy);
        }

        private sealed class SafeDisposerProxy<T> : IDisposable where T : class
        {
            private ClientBase<T> proxy;

            public SafeDisposerProxy(ClientBase<T> proxy)
            {
                this.proxy = proxy;
            }

            public void Dispose()
            {
                if (this.proxy != null)
                {
                    if (this.proxy.State == CommunicationState.Opened)
                    {
                        try
                        {
                            this.proxy.Close();
                        }
                        catch (Exception ex)
                        {
                            ClientBaseExtensions.traceSource.TraceEvent(
                                TraceEventType.Error, 
                                0, 
                                "Could not close client proxy, reason: {0}", 
                                ex.Message);
                            this.proxy.Abort();
                        }
                    }
                    else
                    {
                        this.proxy.Abort();
                    }

                    this.proxy = null;
                }
            }
        }
    }
}

Enjoy!

Lambda Curry

Note: if you’re looking for lamb curry, you came to the wrong place. This post is about C# programming techniques.

Currying a function is a technique named after Haskell Curry, to transform a function with multiple parameters into a series of functions having one parameter each. The technique is important, because it opens the door to an optimization technique called partial evaluation. Let’s look at an example.

Let’s say you need to write a program that sums a two-dimensional function f(x,y) over a two-dimensional range, e.g. –1000 ≤ x ≤ 1000 and –1000 ≤ y ≤ 1000.

Such a two-dimensional function, assuming double parameters and result, can be represented by Func<double, double, double>, and we can sum it using the following method:

        private static double Sum(Func<double, double, double> f)
        {
            double sum = 0;
            for (int x = -1000; x <= 1000; x++)
            {
                for (int y = -1000; y <= 1000; y++)
                {
                    sum += f(x, y);
                }
            }

            return sum;
        }
 

We can apply this to an arbitrary function, for example:

            Func<double, double, double> f = (x, y) => Math.Sin(x) * Math.Sin(y);

            double result = Sum(f);
 

Currying this is now a simple textual transformation. Instead of defining f as Func<double, double, double>, we define it as Func<double, Func<double, double>>.

    using System;

    internal static class Curried
    {
        public static void Main()
        {
            Func<double, Func<double, double>> f = x => y => Math.Sin(x) * Math.Sin(y);

            double result = Sum(f);
        }

        private static double Sum(Func<double, Func<double, double>> f)
        {
            double sum = 0;
            for (int x = -1000; x <= 1000; x++)
            {
                for (int y = -1000; y <= 1000; y++)
                {
                    sum += f(x)(y);
                }
            }

            return sum;
        }
    }
 

Effectively, a function that took two parameters and returned a result is replaced by a function that takes one parameter and returns a function that takes the second parameter and returns the result. It looks a lot simpler than it sounds. Instead of writing f = (x, y) => Math.Sin(x) * Math.Sin(y), we write f = x => y => Math.Sin(x) * Math.Sin(y). And when calling it, instead of writing f(x, y), we write f(x)(y). Simple.

Unfortunately, every call to f(x) now allocates a new Func<double, double> object, and that can become quite expensive. But that can be fixed easily, so here is a smarter solution:

    using System;

    internal static class Smarter
    {
        public static void Main()
        {
            Func<double, Func<double, double>> f = x => y => Math.Sin(x) * Math.Sin(y);

            double result = Sum(f);
        }

        private static double Sum(Func<double, Func<double, double>> f)
        {
            double sum = 0;
            for (int x = -1000; x <= 1000; x++)
            {
                var fx = f(x);
                for (int y = -1000; y <= 1000; y++)
                {
                    sum += fx(y);
                }
            }

            return sum;
        }
    }
 

I ran a little benchmark on this code. The benchmark executes the main function 20 times, and measures the shortest execution time. It also measures the number of generation 0 garbage collections. This is the result:
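The harness itself is not shown in this post. The following is only a sketch of how such a measurement could be set up, assuming the standard Stopwatch and GC.CollectionCount APIs; the Benchmark class, the Measure method and its parameters are hypothetical names, not the code that produced the numbers below.

```csharp
using System;
using System.Diagnostics;

public static class Benchmark
{
    // Runs the action several times, prints the shortest elapsed time and the
    // largest number of generation 0 collections observed during a single run,
    // and returns the shortest elapsed time in milliseconds.
    public static long Measure(string name, Action action, int runs)
    {
        long best = long.MaxValue;
        int collections = 0;

        for (int i = 0; i < runs; i++)
        {
            // Start each run from a clean heap so runs do not influence each other.
            GC.Collect();
            int before = GC.CollectionCount(0);

            var stopwatch = Stopwatch.StartNew();
            action();
            stopwatch.Stop();

            best = Math.Min(best, stopwatch.ElapsedMilliseconds);
            collections = Math.Max(collections, GC.CollectionCount(0) - before);
        }

        Console.WriteLine("{0}: {1} msec, {2} collections", name, best, collections);
        return best;
    }

    public static void Main()
    {
        Measure("Example", () => Math.Sin(1.0), 20);
    }
}
```

Taking the shortest of several runs filters out noise from JIT compilation and other processes on the machine.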

Naive: 261 msec, 0 collections
Curried: 340 msec, 733 collections
Smarter: 254 msec, 0 collections
 

As we can see, the curried version was initially slower due to all the memory allocations, but when we fixed that, the smarter version was as fast as the original. In fact it was just a little bit faster, though nothing to get excited about.

However, this is where partial evaluation kicks in. Currently, we are calculating the sine of x over four million times, not taking advantage of the fact that we could reuse each calculated value about two thousand times! So let’s change our definition of f as follows:

            Func<double, Func<double, double>> f = x => { var sinx = Math.Sin(x); return y => sinx * Math.Sin(y); };
 

Now the benchmark shows a completely different result:

Optimized: 143 msec, 0 collections
 

We went from 261 milliseconds in the original version to 143 milliseconds in this version, in fact almost dividing execution time by two! That’s because, to be precise, in the original version we had two times 2001 * 2001 = 8,008,002 Math.Sin calls, and in the optimized version we have 1 time 2001 plus 1 time 2001 * 2001 = 4,006,002 Math.Sin calls. That is a division by a factor of 1.999, yielding a total execution time reduction by a factor of 1.825 (there is some overhead of course).
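The call counts can be checked with a small counting wrapper. This is a sketch for verification only; SinCallCounter and CountCalls are hypothetical names, and the loop uses the same range as the Sum method above.

```csharp
using System;

public static class SinCallCounter
{
    // Sums f over -1000..1000 in both dimensions and returns how many times
    // Math.Sin was invoked. 'hoist' selects the partially evaluated variant.
    public static long CountCalls(bool hoist)
    {
        long calls = 0;
        Func<double, double> sin = v => { calls++; return Math.Sin(v); };

        Func<double, Func<double, double>> f;
        if (hoist)
        {
            // Sin(x) is evaluated once per x.
            f = x => { var sinx = sin(x); return y => sinx * sin(y); };
        }
        else
        {
            // Sin(x) is evaluated again for every y.
            f = x => y => sin(x) * sin(y);
        }

        double sum = 0;
        for (int x = -1000; x <= 1000; x++)
        {
            var fx = f(x);
            for (int y = -1000; y <= 1000; y++)
            {
                sum += fx(y);
            }
        }

        return calls;
    }

    public static void Main()
    {
        Console.WriteLine(CountCalls(false)); // prints 8008002
        Console.WriteLine(CountCalls(true));  // prints 4006002
    }
}
```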

Of course, the technique is very much related to loop invariant code motion in imperative programming. For example, imagine a Sum function hardcoded for Sin(x) * Sin(y). Would you write it like this?

        private static double SumSinSin()
        {
            double sum = 0;
            for (int x = -1000; x <= 1000; x++)
            {
                for (int y = -1000; y <= 1000; y++)
                {
                    sum += Math.Sin(x) * Math.Sin(y);
                }
            }

            return sum;
        }

Of course not! At least you would move the calculation of Sin(x) out of the loop over y:

        private static double SumSinSin()
        {
            double sum = 0;
            for (int x = -1000; x <= 1000; x++)
            {
                var sinx = Math.Sin(x);
                for (int y = -1000; y <= 1000; y++)
                {
                    sum += sinx * Math.Sin(y);
                }
            }

            return sum;
        }
 

And that is exactly what we did, except of course that the sum function is parameterized and not hardcoded.

So when would you apply this technique? You would apply it when performance matters, and you have a function that you need to call a lot, that takes more than one parameter, where one parameter varies more than another one (in our example, x remained the same for a long time, while y was different on every call), and part of the function can be evaluated knowing only the value of the parameter that varies the least.
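A typical case outside of numerics is pattern matching, where the pattern changes rarely and the input changes on every call. The sketch below is an illustration under that assumption (MatcherDemo and Matches are hypothetical names): constructing the Regex is the part that can be evaluated knowing only the first parameter.

```csharp
using System;
using System.Text.RegularExpressions;

public static class MatcherDemo
{
    // Curried form: the pattern varies rarely, the input varies on every call.
    public static readonly Func<string, Func<string, bool>> Matches = pattern =>
    {
        // Evaluated once per pattern: parsing and compiling the expression.
        var regex = new Regex(pattern, RegexOptions.Compiled);

        // Evaluated once per input: only the actual match.
        return input => regex.IsMatch(input);
    };

    public static void Main()
    {
        Func<string, bool> isNumber = Matches(@"^\d+$");

        Console.WriteLine(isNumber("12345")); // prints True
        Console.WriteLine(isNumber("12a45")); // prints False
    }
}
```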

In our example above, we could go even further. For example, we could eliminate the multiplication and even the Sin(y) calculation completely in case Sin(x) is 0 (which would be the case in our example only for x == 0).

            Func<double, Func<double, double>> f = x => 
            {
                if (x != 0.0)
                {
                    var sinx = Math.Sin(x);

                    return y => sinx * Math.Sin(y);
                }
                else
                {
                    return y => 0.0;
                }
            };
 

That is not worth it in this scenario (because the special case applies to less than 0.05 % of all cases), but in some scenarios runtime algorithm specialization can be very significant.

Static Reflection in .NET, part 2

A few weeks ago, I talked about static reflection and its advantages. You’ll remember that the main advantages, compared to the normal reflection APIs, are the compile time checking of parameters and IntelliSense support.

How does it compare at other levels, performance for example? Before we dive into that question, let me state that performance may or may not be important to you. A program that is fast enough is, well, fast enough. It’s unlikely that a (single) reflection call will have a significant impact on, say, the response time of a graphical user interface, and so performance doesn’t matter. If your algorithm requires millions of reflection operations, I’m sure you can rewrite it somehow to reduce that number significantly, and then performance again probably doesn’t matter anymore. That being said, we still want to know, right?

First of all, let’s compare code.

Take this line (using the Example class from the last post):

PropertyInfo pi = typeof(Example).GetProperty("Description");

This line compiles to the following IL (simplified for readability):

ldtoken Example 
call class Type Type::GetTypeFromHandle(valuetype RuntimeTypeHandle) 
ldstr "Description" 
call instance class PropertyInfo Type::GetProperty(string)

Compare that to the following line:

PropertyInfo pi = StaticReflector.Create<Example>().PropertyInfo(e => e.Description);

Which compiles to:

call class IStaticReflector`1<!!0> StaticReflector::Create<class Example>()
ldtoken Example
call class Type Type::GetTypeFromHandle(valuetype RuntimeTypeHandle)
ldstr "e"
call class ParameterExpression Expression::Parameter(class Type, string)
stloc.0 
ldloc.0 
ldtoken instance string Example::get_Description()
call class MethodBase MethodBase::GetMethodFromHandle(valuetype RuntimeMethodHandle)
castclass MethodInfo
call class MemberExpression Expression::Property(class Expression, class MethodInfo)
ldc.i4.1 
newarr ParameterExpression
stloc.1 
ldloc.1 
ldc.i4.0 
ldloc.0 
stelem.ref 
ldloc.1 
call class Expression`1<!!0> Expression::Lambda<class System.Func`2<class Example, string>>(class Expression, class ParameterExpression[])
call class PropertyInfo StaticReflectorExtensions::PropertyInfo<class Example, string>(class IStaticReflector`1<!!0>, class Expression`1<class System.Func`2<!!0, !!1>>)

As you can see, this code doesn’t load the “Description” string, it uses the ldtoken instruction instead. Some bloggers have suggested that this would make it more efficient. Unfortunately, even if the ldtoken instruction is efficient, it is largely offset by the construction of the lambda expression. I ran a little benchmark, in which I compare execution time (in ticks) and memory usage (in generation 0 garbage collection runs) of both approaches, executing each one a million times. This is the result (on my laptop):

Using Reflection       Time:    1089308 Collections:    45
Using StaticReflection Time:   13513777 Collections:   264

As you can see, the Static Reflection approach is about 12.4 times slower than the good old dynamic reflection (13,513,777 versus 1,089,308 ticks), and it uses a lot more memory. That should be no surprise either: both cases allocate a PropertyInfo object, but the static case also allocates the expression, which is nothing but food for the garbage collector.

So, one approach seems good at compile time, and the other is good at run time. It seems we’re stuck between a rock and a hard place. But the situation isn’t so bad: we have two options to choose from, each with its pros and cons. Which one is best depends on your requirements, and what you value the most: compile time checking (which may result in productivity and maintainability benefits), or performance.

And who knows, maybe there is a third option, giving the best of both worlds. But that’s for next time.

String.Trim() fixed in .NET 4.0

A long time ago, I wrote a blog post about the problems with String.Trim(). I’m happy to see that all three issues have been addressed in the .NET Framework 4.0.

To start with, Trim() will now be consistent with Char.IsWhiteSpace(). Theoretically, this is a breaking change, but I don’t expect many programs to have a problem with this change. Note that the change is very well documented in the online help.

Secondly, the code of Trim() has been cleaned up considerably. A string that consists entirely of whitespace is no longer scanned twice. I haven’t done any benchmarks, but I expect the performance to be at least as good as for the same function in .NET 2.0 – 3.5.

Last but not least, the frequent abuse of the Trim() function to simply validate strings will greatly decrease with the introduction of the static String.IsNullOrWhiteSpace(string value) function, which is much faster than calling Trim().
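A minimal sketch of the two validation styles side by side (ValidationDemo and its method names are hypothetical; String.IsNullOrWhiteSpace requires .NET 4.0 or later):

```csharp
using System;

public static class ValidationDemo
{
    // Old habit: needs a separate null check, and Trim() may allocate
    // a new string just to find out whether anything is left.
    public static bool IsBlankUsingTrim(string value)
    {
        return value == null || value.Trim().Length == 0;
    }

    // .NET 4.0: null-safe, and only inspects the characters.
    public static bool IsBlank(string value)
    {
        return string.IsNullOrWhiteSpace(value);
    }

    public static void Main()
    {
        Console.WriteLine(IsBlankUsingTrim("  \t ")); // prints True
        Console.WriteLine(IsBlank("  \t "));          // prints True
        Console.WriteLine(IsBlank(null));             // prints True
        Console.WriteLine(IsBlank(" x "));            // prints False
    }
}
```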

It’s a small detail, compared to all the other goodies .NET 4.0 brings, but a good addition to the toolbox nonetheless.

Static Reflection in .NET

LINQ expressions have proven to be extremely versatile, popping up in all sorts of areas. “Static Reflection” seems to be the latest hype. But what is static reflection anyway, and why is it good or why is it bad?

Reflection is used to obtain information about the code you are executing, and to use that information to interact with the code dynamically. Sometimes reflection is used to interact dynamically with code that is statically known by a program already. For example, data binding heavily relies on reflection to dynamically read and write properties. The calling program knows about those properties statically, but the data binding libraries do not. In data binding, object properties are often identified by their name, expressed as a string. That string is then used by the libraries to construct a PropertyInfo object.

Time for an example. Given this class:

public class Example
{
    public string Description { get; set; }
}


You can obtain a PropertyInfo object describing the Description property as follows:

PropertyInfo pi = typeof(Example).GetProperty("Description");


We may have an issue here. If I make a typing mistake in the GetProperty call, I don’t get a compiler error. At runtime, the call will return null, probably leading to a NullReferenceException down the road. And of course, Visual Studio IntelliSense will not help me to type it right. Also, if I rename the property, for example to “Summary”, the GetProperty call will be broken, without a compile-time error. Static Reflection is one technique to avoid these issues.
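The failure mode is easy to reproduce. In this sketch the Example class is repeated for completeness and TypoDemo is a hypothetical name; the misspelled property name compiles without complaint and only shows up as a null at run time.

```csharp
using System;
using System.Reflection;

public class Example
{
    public string Description { get; set; }
}

public static class TypoDemo
{
    public static void Main()
    {
        PropertyInfo ok = typeof(Example).GetProperty("Description");

        // Misspelled on purpose: this still compiles.
        PropertyInfo typo = typeof(Example).GetProperty("Descripton");

        Console.WriteLine(ok != null);   // prints True
        Console.WriteLine(typo == null); // prints True
    }
}
```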

Using LINQ expressions, we could create an API that allows us to do something like the following:

PropertyInfo pi = StaticReflector.GetProperty((Example e) => e.Description);


The downside of this approach is that it doesn’t work with anonymous types. So I propose a different mechanism. What we need is something that statically gives us access to a type. Any generic interface will do. I propose the following:

public interface IStaticReflector<T>
{
}


Given this interface, we can define a series of extension methods, for example:

public static class StaticReflectorExtensions
{
    public static PropertyInfo PropertyInfo<T, U>(this IStaticReflector<T> obj, Expression<Func<T, U>> selector)
    {
        var body = selector.Body as MemberExpression;
        return body.Member as PropertyInfo;
    }
}


Notice how the obj parameter is not really used in the PropertyInfo method. It does serve a purpose however: it allows the compiler to infer the type T, and I get full IntelliSense. For example:

IStaticReflector<Example> reflector = null;
PropertyInfo pi = reflector.PropertyInfo(e => e.Description);


Granted, initializing a variable to null and then calling a method on it is a bit weird. We need a more elegant way to create these things:

public static class StaticReflector
{
    public static IStaticReflector<T> Create<T>()
    {
        return null;
    }
}


Now we can write:

PropertyInfo pi = StaticReflector.Create<Example>().PropertyInfo(e => e.Description);


This still doesn’t work on anonymous types though. For those, we could use the following:

public static class ObjectExtensions
{
    public static IStaticReflector<T> GetReflector<T>(this T obj)
    {
        return null;
    }
}


Now we can write things such as:

var anonymous = new { Description = "Example" };

PropertyInfo pi = anonymous.GetReflector().PropertyInfo(e => e.Description);


I do prefer the StaticReflector.Create<T>() method in case the type name is known though.

Are we done? Not really. Let’s go back to dynamic reflection using string names. Lots of things could go wrong there, and we don’t get any warnings. The situation has not gotten worse, but lots of things can still go wrong. So the PropertyInfo method needs some parameter validation. Also, properties certainly aren’t the only thing we can reflect upon. What about fields, methods and constructors? Here’s a full implementation:

using System;
using System.Linq.Expressions;
using System.Reflection;

public static class StaticReflectorExtensions
{
    public static PropertyInfo PropertyInfo<T, U>(this IStaticReflector<T> obj, Expression<Func<T, U>> selector)
    {
        if (selector == null)
        {
            throw new ArgumentNullException(Strings.Selector);
        }

        PropertyInfo pi = obj.MemberInfo(selector) as PropertyInfo;

        if (pi == null)
        {
            throw new ArgumentException(Strings.InvalidPropertySelector, Strings.Selector);
        }

        return pi;
    }

    public static FieldInfo FieldInfo<T, U>(this IStaticReflector<T> obj, Expression<Func<T, U>> selector)
    {
        if (selector == null)
        {
            throw new ArgumentNullException(Strings.Selector);
        }

        FieldInfo fi = obj.MemberInfo(selector) as FieldInfo;

        if (fi == null)
        {
            throw new ArgumentException(Strings.InvalidFieldSelector, Strings.Selector);
        }

        return fi;
    }

    public static MemberInfo MemberInfo<T, U>(this IStaticReflector<T> obj, Expression<Func<T, U>> selector)
    {
        if (selector == null)
        {
            throw new ArgumentNullException(Strings.Selector);
        }

        var body = selector.Body as MemberExpression;

        if (body == null)
        {
            throw new ArgumentException(Strings.InvalidMemberSelector, Strings.Selector);
        }

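        // body.Expression is null for static members, so check it before dereferencing
        if (body.Expression == null || body.Expression.NodeType != ExpressionType.Parameter)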
        {
            throw new ArgumentException(Strings.InvalidMemberSelector, Strings.Selector);
        }

        return body.Member;
    }

    public static MethodInfo MethodInfo<T, U>(this IStaticReflector<T> obj, Expression<Func<T, U>> selector)
    {
        if (selector == null)
        {
            throw new ArgumentNullException(Strings.Selector);
        }

        var body = selector.Body as MethodCallExpression;

        if (body == null)
        {
            throw new ArgumentException(Strings.InvalidMethodSelector, Strings.Selector);
        }

        // instance methods must be called on the parameter
        if (body.Object != null && body.Object.NodeType != ExpressionType.Parameter)
        {
            throw new ArgumentException(Strings.InvalidMethodSelector, Strings.Selector);
        }

        // static methods must be defined in the type of the parameter or a base type
        if (body.Object == null && !body.Method.DeclaringType.IsAssignableFrom(typeof(T)))
        {
            throw new ArgumentException(Strings.InvalidMethodSelector, Strings.Selector);
        }

        return body.Method;
    }

    public static MethodInfo MethodInfo<T>(this IStaticReflector<T> obj, Expression<Action<T>> selector)
    {
        if (selector == null)
        {
            throw new ArgumentNullException(Strings.Selector);
        }

        var body = selector.Body as MethodCallExpression;

        if (body == null)
        {
            throw new ArgumentException(Strings.InvalidMethodSelector, Strings.Selector);
        }

        // instance methods must be called on the parameter
        if (body.Object != null && body.Object.NodeType != ExpressionType.Parameter)
        {
            throw new ArgumentException(Strings.InvalidMethodSelector, Strings.Selector);
        }

        // static methods must be defined in the type of the parameter or a base type
        if (body.Object == null && !body.Method.DeclaringType.IsAssignableFrom(typeof(T)))
        {
            throw new ArgumentException(Strings.InvalidMethodSelector, Strings.Selector);
        }

        return body.Method;
    }

    public static ConstructorInfo ConstructorInfo<T>(this IStaticReflector<T> obj, Expression<Func<T>> selector)
    {
        if (selector == null)
        {
            throw new ArgumentNullException(Strings.Selector);
        }

        var body = selector.Body as NewExpression;

        if (body == null)
        {
            throw new ArgumentException(Strings.InvalidConstructorSelector, Strings.Selector);
        }

        return body.Constructor;
    }

    private static class Strings
    {
        internal const string InvalidFieldSelector = "Invalid field selector";
        internal const string InvalidPropertySelector = "Invalid property selector";
        internal const string InvalidMemberSelector = "Invalid member selector";
        internal const string InvalidMethodSelector = "Invalid method selector";
        internal const string InvalidConstructorSelector = "Invalid constructor selector";
        internal const string Selector = "selector";
    }
}
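
To see why each method checks for a particular expression type, it helps to look at what the C# compiler builds for different selectors. The following is a rough sketch, independent of the code above: the Sample class and the Describe helper are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Linq.Expressions;

public class Sample
{
    public string Text;                       // a field
    public string Description { get; set; }   // a property
}

public static class SelectorShapeDemo
{
    // Reports which kind of node the compiler generated for the lambda body.
    public static string Describe(LambdaExpression selector)
    {
        var member = selector.Body as MemberExpression;
        if (member != null)
        {
            // "Direct" means the member is accessed on the lambda parameter itself.
            bool direct = member.Expression != null
                && member.Expression.NodeType == ExpressionType.Parameter;
            return member.Member.MemberType + (direct ? " (direct)" : " (not direct)");
        }

        if (selector.Body is MethodCallExpression) return "Method";
        if (selector.Body is NewExpression) return "Constructor";
        return "Unsupported";
    }

    public static void Main()
    {
        Expression<Func<Sample, string>> property = s => s.Description;
        Expression<Func<Sample, string>> field = s => s.Text;
        Expression<Func<Sample, string>> method = s => s.ToString();
        Expression<Func<Sample>> constructor = () => new Sample();
        Expression<Func<Sample, int>> nested = s => s.Description.Length;

        Console.WriteLine(Describe(property));     // Property (direct)
        Console.WriteLine(Describe(field));        // Field (direct)
        Console.WriteLine(Describe(method));       // Method
        Console.WriteLine(Describe(constructor));  // Constructor
        Console.WriteLine(Describe(nested));       // Property (not direct)
    }
}
```

The last selector is the kind of input the NodeType check in MemberInfo is meant to reject: its body is a member access, but on another member access rather than on the parameter.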


Next time, we’ll talk about the disadvantages of this approach, and we’ll look at an alternative.

Paint.NET Effects updated

It's been a while since I've been blogging. I've been very busy with several projects, professionally with U2U Consult as well as personally. Anyway, I have decided to spend a bit more time on this again, and here's the first result.

I've updated my Paint.NET effects. Version 3.3 is mainly a performance update. All effects and adjustments have been optimized, with some spectacular results on the popular Drop Shadow effect. Depending on the scenario (i.e. image and parameter values), I have seen this effect run more than a hundred times faster!

As a result, I have been able to increase the range for values of the offset, blur and widening parameters. Several users have requested a shadow opacity parameter, so that has been added as well.

Note that the Drop Shadow effect has moved to the Object effect menu, as requested by several users on the Paint.NET forum. And in case you haven't heard: Paint.NET is free image and photo editing software for Windows computers.

Download the effects from my download page.

Enjoy!


StyleCop for C# released

There have been rumors for years about a tool called StyleCop, used internally within Microsoft. According to the rumors, it was comparable to FxCop (Code Analysis), but would do its job at the source level (instead of the IL level used by FxCop). That way it would be able to check consistency of code style, you know, where to put spaces and comments and line breaks and stuff.

StyleCop has finally been released, and it turns out the rumors were true. It's a Visual Studio Add-In, that sits nicely in the project menu, right below Code Analysis.

Source Analysis

As you can imagine, the rules caused a lot of debate. The thing is, everybody can understand that it's a good idea not to declare protected members in sealed types (for example), but matters of style can't be debated rationally. After all, it's a matter of taste, or is it not?

As I've said before, "I hate it when developers have to make choices like that during routine development. Choosing takes time, and that's not likely to improve productivity. But much worse is the fact that different developers will make different choices. Even a single developer may make different choices from one day to the next. That leads to inconsistencies in the code. Developers will spend more time trying to understand the code they're reading, because it doesn't always follow the same pattern. That's bad for productivity. In the worst case scenario, developers start rewriting each other's code, just so it matches their choice of the day. That kills productivity."

So no, it's not a matter of style, it's all about productivity. What your standard is doesn't matter, what matters is that you have a standard, and that people follow it without wasting time.

So naturally, I took Source Analysis for a test drive on a bunch of code I have written. First impression: lots and lots of warnings! But many of them recur, so I changed just a few settings:

Microsoft Source Analysis Project Settings

Only a handful of warnings remained, and to be honest, they had a point. It wasn't much, but my code improved thanks to this tool. And this was just my own code. The real value of a tool like this lies in the consistency it can bring to team projects, ending all pointless debates and holy wars about personal preferences.

The XML based file headers are clearly a Microsoft internal thing. But hey, if you want a copyright notice in every file, you might just as well do it this way. Remember, having a standard is important, which one it is doesn't matter (much).

Conclusion: very good addition to the toolbox, highly recommended.