Cook Computing

Implementing an XML-RPC Service with ASP.NET MVC

April 1, 2012 Written by Charles Cook

I received a couple of emails recently asking how to implement an XML-RPC service in an ASP.NET MVC application. In case anyone is interested this is how to do it (this is an expanded version of an earlier post).

Define an interface for your XML-RPC service, for example:

using CookComputing.XmlRpc;

public interface IStateName
{
  [XmlRpcMethod("examples.getStateName")]
  string GetStateName(int stateNumber);
}

Implement the service:

using CookComputing.XmlRpc;

public class StateNameService : XmlRpcService, IStateName
{
  public string GetStateName(int stateNumber)
  {
    if (stateNumber < 1 || stateNumber > m_stateNames.Length)
      throw new XmlRpcFaultException(1, "Invalid state number");
    return m_stateNames[stateNumber - 1];
  }

  string[] m_stateNames
    = { "Alabama", "Alaska", "Arizona", "Arkansas",
        "California", "Colorado", "Connecticut", "Delaware", "Florida",
        "Georgia", "Hawaii", "Idaho", "Illinois", "Indiana", "Iowa", 
        "Kansas", "Kentucky", "Lousiana", "Maine", "Maryland", "Massachusetts",
        "Michigan", "Minnesota", "Mississipi", "Missouri", "Montana",
        "Nebraska", "Nevada", "New Hampshire", "New Jersey", "New Mexico", 
        "New York", "North Carolina", "North Dakota", "Ohio", "Oklahoma",
        "Oregon", "Pennsylviania", "Rhose Island", "South Carolina", 
        "South Dakota", "Tennessee", "Texas", "Utah", "Vermont", "Virginia", 
        "Washington", "West Virginia", "Wisconsin", "Wyoming" };
}

Implement a custom route handler:

using System.Web;
using System.Web.Routing;

public class StateNameRouteHandler : IRouteHandler
{
  public IHttpHandler GetHttpHandler(RequestContext requestContext)
  {
    return new StateNameService();
  }
}

Register the custom route in global.asax.cs:

public static void RegisterRoutes(RouteCollection routes)
{
  routes.IgnoreRoute("{resource}.axd/{*pathInfo}");

  routes.Add(new Route("api/statename", new StateNameRouteHandler()));

  // ...

}

Check that everything is working by pointing your browser at the URL for the handler, for example something like http://localhost:33821/api/statename when running from Visual Studio. You should see an automatically generated help page for the service. If that looks right, point your XML-RPC client at the service and start making calls.
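If you're also using XML-RPC.NET on the client side, a proxy generated from the same method mapping is the quickest way to exercise the service. This is a minimal sketch: the proxy interface name is made up, and the URL is the example one from above.

using System;
using CookComputing.XmlRpc;

[XmlRpcUrl("http://localhost:33821/api/statename")]
public interface IStateNameProxy : IXmlRpcProxy
{
  [XmlRpcMethod("examples.getStateName")]
  string GetStateName(int stateNumber);
}

public class Client
{
  public static void Main()
  {
    // XmlRpcProxyGen emits a proxy class implementing the interface
    var proxy = XmlRpcProxyGen.Create<IStateNameProxy>();
    Console.WriteLine(proxy.GetStateName(43)); // "Texas"
  }
}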


Faking In Visual Studio 11

March 10, 2012 Written by Charles Cook

The Problem

Dealing with Now and why I'm almost done with C# and Java — Karl Seguin's post about how the way you code is a by-product of the programming language you use — discusses how difficult it is to unit test code like this in C#:

var audit = new Audit
{   
    UserName = user.Name,    
    Dated = DateTime.Now,    
    //...
};

The following test might or might not pass, depending on how quickly the code runs:

public void ItSetsTheAuditTimeToRightNow()
{
    var audit = CreateAuditItem(new User { Name = "Leto" });
    audit.Dated.ShouldEqual(DateTime.Now);
}

Unfortunately it's difficult to mock DateTime.Now because it's a static property, which leads to more complicated solutions such as injecting an abstract dependency into objects or using a settable delegate, as in Ayende's SystemTime class:

public static class SystemTime
{
    public static Func<DateTime> Now = () => DateTime.Now;
}

This allows you to write test code like this:

SystemTime.Now = () => new DateTime(2000,1,1);
repository.ResetFailures(failedMsgs); 
SystemTime.Now = () => new DateTime(2000,1,2);
var msgs = repository.GetAllReadyMessages(); 
Assert.AreEqual(2, msgs.Length);
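For this to work the code under test has to read the clock through the wrapper rather than calling DateTime.Now directly, for example (a sketch of the audit snippet rewritten against SystemTime):

var audit = new Audit
{
    UserName = user.Name,
    // read the clock through the delegate so tests can substitute it
    Dated = SystemTime.Now(),
    //...
};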

Another solution mentioned in the post's comments is to use NUnit's Within:

Assert.That(audit.Dated, Is.EqualTo(DateTime.Now).Within(new TimeSpan(0, 0, 0, 0, 50)));

This works, sort of, but isn't a general solution to this type of problem.

Visual Studio 11 Fakes

Some of the comments mention TypeMock Isolator and the Moles project from Microsoft, and it so happens that the Visual Studio 11 beta reveals that Moles has been productized into Visual Studio as the Fakes Framework. This can inject two types of dummy implementation into unit tests: stub types for interfaces and overridable methods, and shim types for static and non-overridable methods:

Stub types: Stub types make it easy to test code that consumes interfaces or non-sealed classes with overridable methods. A stub of the type T provides a default implementation of each virtual member of T, that is, any non-sealed virtual or abstract method, property, or event. The default behavior can be dynamically customized for each member by attaching a delegate to a corresponding property of the stub. A stub is realized by a distinct type which is generated by the Fakes Framework. As a result, all stubs are strongly typed.

Although stub types can be generated for interfaces and non-sealed classes with overridable methods, they cannot be used for static or non-overridable methods. To address these cases, the Fakes Framework also generates shim types.

Shim types: Shim types allow detouring of hard-coded dependencies on static or non-overridable methods. A shim of type T can provide an alternative implementation for each non-abstract member of T. The Fakes Framework will redirect method calls to members of T to the alternative shim implementation. The shim types rely on runtime code rewriting that is provided by a custom profiler.

Delegates: Both stub types and shim types allow you to use delegates to dynamically customize the behavior of individual stub members.

Faking DateTime

To test the DateTime code, create a unit test project and right-click on one of the referenced assemblies in Solution Explorer. This displays a context menu with an "Add Fakes Assembly" item. Select this and two more referenced assemblies are automatically added to the project:

  • Microsoft.QualityTools.Testing.Fakes
  • Microsoft.VisualStudio.QualityTools.UnitTestFramework.10.0.0.0.Fakes

Visual Studio will automatically generate a file called Microsoft.VisualStudio.QualityTools.UnitTestFramework.fakes in a directory in the project called Fakes. This XML file is used to configure the assembly for which fakes are generated and the namespaces and types that are included. We want to generate a shim type for DateTime so we can change the file to specify the mscorlib assembly:

<Fakes xmlns="http://schemas.microsoft.com/fakes/2011/">
  <Assembly Name="mscorlib" />
</Fakes>

Building the project results in Visual Studio creating an assembly containing the fake types in the FakesAssemblies directory. We need to then add a reference to this assembly so we can use the fake types in our test.

So, say we have this code under test:

public class Audit
{
    public string UserName { get; set; }
    public DateTime Dated { get; set; }
}

public static class TestClass
{
    public static Audit CreateAuditItem(string userName)
    {
        var audit = new Audit { UserName = userName, Dated = DateTime.Now };
        return audit;
    }
}

We can now write this unit test:

using System;
using System.Fakes;
using Microsoft.QualityTools.Testing.Fakes;
using Microsoft.VisualStudio.TestTools.UnitTesting;

[TestClass]
public class AuditTests
{
    [TestMethod]
    public void ItSetsTheAuditTimeToRightNow()
    {
        // wrap the test code in a ShimsContext to control 
        // the lifetime of your shims
        using (ShimsContext.Create())
        {
            // hook delegate to the shim method to redirect 
            // DateTime.Now to return January 1st of 2000
            ShimDateTime.NowGet = () => new DateTime(2000, 1, 1);

            var audit = TestClass.CreateAuditItem("Leto");
            Assert.AreEqual(new DateTime(2000, 1, 1), audit.Dated);
        }
    }
}

Calling ShimsContext.Create() within a using statement means that the shim will be de-registered before the test method exits. If this is not done the shim will remain active for subsequent tests and might cause them to run in an unexpected way.

Although this doesn't address the limitations of C# that Karl described in his post (assuming you see them as limitations), at least we don't have to modify the original code using DateTime.Now to make it testable. Once the fakes are configured in a test project, writing tests that use them is straightforward.

Testing Time

While Karl's post was not really about the problem of mocking DateTime.Now in itself, it is a good example of a non-deterministic test that can cause problems. Martin Fowler blogged about this type of test in his post Eradicating Non-Determinism in Tests, in particular the issues associated with testing time-related functionality:

Few things are more non-deterministic than a call to the system clock. Each time you call it, you get a new result, and any tests that depend on it can thus change. Ask for all the todos due in the next hour, and you regularly get a different answer.

The most important thing here is to ensure that you always wrap the system clock with routines that can be replaced with a seeded value for testing. A clock stub can be set to a particular time and frozen at that time, allowing your tests to have complete control over its movements. That way you can synchronize your test data to the values in the seeded clock.

Always wrap the system clock, so it can be easily substituted for testing.

One thing to watch with this, is that eventually your test data might start having problems because it's too old, and you get conflicts with other time based factors in your application. In this case you can move the data, and your clock seeds to new values. When you do this, ensure that this is the only thing you do. That way you can be sure that any tests that fail are due to time-movement in the test data.

Another area where time can be a problem is when you rely on other behaviors from the clock. I once saw a system that generated random keys based on clock values. This system started failing when it was moved to a faster machine that could allocate multiple ids within a single clock tick.

I've heard so many problems due to direct calls to the system clock that I'd argue for finding a way to use code analysis to detect any direct calls to the system clock and failing the build right there. Even a simple regex check might save you a frustrating debugging session after a call at an ungodly hour.
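As a crude sketch of Fowler's last suggestion, a unit test can scan the source tree for direct clock calls and fail the build. The source path and the exempted wrapper file name here are assumptions for illustration:

using System.IO;
using System.Linq;
using System.Text.RegularExpressions;
using NUnit.Framework;

[TestFixture]
public class ClockUsageTests
{
  // hypothetical path to the source tree; adjust for your build layout
  const string SourceRoot = @"..\..\..\src";

  [Test]
  public void NoDirectCallsToSystemClock()
  {
    var clockCall = new Regex(@"\bDateTime\.(Now|UtcNow|Today)\b");
    var offenders = Directory
      .EnumerateFiles(SourceRoot, "*.cs", SearchOption.AllDirectories)
      .Where(f => Path.GetFileName(f) != "SystemTime.cs") // the wrapper itself is allowed
      .Where(f => clockCall.IsMatch(File.ReadAllText(f)))
      .ToList();
    Assert.IsEmpty(offenders,
      "Direct system clock calls found in: " + string.Join(", ", offenders));
  }
}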


Visual Studio - Function Return Value In Debugger

February 29, 2012 Written by Charles Cook

While preparing for job interviews after I arrived in the US I rehearsed questions such as — how would you improve your favourite programming language — what new features would you like in your favourite IDE — and so on. As it happened I never got asked any of these questions but browsing through the Visual Studio UserVoice site I noticed that one of my desired Visual Studio enhancements is under consideration but won't make it into Visual Studio 11. This is Function return value in debugger. I've never liked having to modify code to be able to see the value being returned from a function, for example changing code like this:

string Foo()
{
    /// ...

    return Bar();
}

So that a local variable can be used to watch the return value from the call to Bar():

string Foo()
{
    /// ...

    string ret = Bar();
    return ret;
}

One of the site admins added a comment:

For those out there who have experience debugging native C++ or VB6 code, you may have used a feature where function return values are provided for you in the Autos window. Unfortunately, this functionality does not exist for managed code. While you can work around this issue by assigning the return values to a local variable, this is not as convenient because it requires modifying your code.

In managed code, it’s a lot trickier to determine what the return value of a function you’ve stepped over is. We realized that we couldn’t do the right thing consistently here and so we removed the feature rather than give you incorrect results in the debugger. However, we want to bring this back for you and our CLR and Debugger teams are looking at a number of potential solutions to this problem. Unfortunately this will not be part of Visual Studio 11.

Oh well, back to using local variables for the time being, even if they result in code review comments such as "Remove unnecessary variable".


Unit Testing With ExpectedException

January 8, 2012 Written by Charles Cook

@mentalguy must have been having a frustrating day:

Mental Guy tweet

I've come across this problem with .NET code. Often a single exception type will cover several different error conditions and so when writing the corresponding unit tests it's tempting to assert on the exception's Message property. Of course this is bad because it assumes the text of the message won't be changed.

NUnit

NUnit encourages the checking of exception messages when using the ExpectedException attribute, for example:

[ExpectedException(typeof(ArgumentException), ExpectedMessage="expected message" )]

but tacitly acknowledges that exact matching is fragile by also allowing you to match on a substring or a regular expression:

public enum MessageMatch
{
  /// Expect an exact match
  Exact,    
  /// Expect a message containing the parameter string
  Contains,
  /// Match the regular expression provided as a parameter
  Regex
}

For example, this is for a test that passes only if an ArgumentException with a message containing "unspecified" is thrown:

[ExpectedException(typeof(ArgumentException), 
  ExpectedMessage="unspecified", MatchType=MessageMatch.Contains)]
public void TestMethod()
{
    ...
}

It's better to have some way of specifying an error code which can be checked independently of the message. For example, say we have an exception base class supporting an error code, from which we derive custom exception classes:

public class ExceptionBase : Exception
{
  public ExceptionBase(int errorCode, string message)
    : this(errorCode, message, null)
  {
  }

  public ExceptionBase(int errorCode, string message, Exception innerException)
    : base(message, innerException)
  {
    ErrorCode = errorCode;
  }

  public int ErrorCode { get; private set; }
}

public class MessageException : ExceptionBase
{
  public MessageException(int errorCode, string message)
  : base(errorCode, message)
  {
  }

  public MessageException(int errorCode, string message, Exception innerException)
  : base(errorCode, message, innerException)
  {
  }
}

We can test for exceptions with a particular error code like this:

[TestFixture]
public class Tests
{
  [Test]
  public void Test1()
  {
    MessageException ex = Assert.Throws<MessageException>(() => 
      {
        // code which is expected to throw the exception
        // ...
      }
    );
    Assert.AreEqual(ErrorCodes.HEADER_MISSING, ex.ErrorCode); 
  }
}

Visual Studio

Visual Studio does the right thing and doesn't have a built-in way of checking the exception message. You can also derive your own attributes from its ExpectedExceptionBaseAttribute class. This allows us to implement an attribute which checks the error code of exception classes derived from the ExceptionBase class above:

[AttributeUsage(AttributeTargets.Method, AllowMultiple = false, Inherited = true)]
public sealed class ExpectedExceptionWithErrorCode : ExpectedExceptionBaseAttribute
{
  public Type ExpectedException { get; set; }
  public int ExpectedErrorCode { get; set; }

  public ExpectedExceptionWithErrorCode(Type expectedException)
    : this(expectedException, 0, "")
  {
  }

  public ExpectedExceptionWithErrorCode(Type expectedException,
    int errorCode)
    : this(expectedException, errorCode, "")
  {
  }

  public ExpectedExceptionWithErrorCode(Type expectedException,
    int errorCode, string noExceptionMessage)
    : base(noExceptionMessage)
  {
    if (expectedException == null)
      throw new ArgumentNullException("exceptionType");
    if (!typeof(Exception).IsAssignableFrom(expectedException))
      throw new ArgumentException("Expected exception type must be "
          + "System.Exception or derived from System.Exception.", 
        "expectedException");
    ExpectedException = expectedException;
    ExpectedErrorCode = errorCode;
  }

  protected override void Verify(Exception exception)
  {
    if (exception.GetType() != ExpectedException)
    {
      base.RethrowIfAssertException(exception);
      string msg = string.Format("Test method {0}.{1} "
          + "threw exception {2} but {3} was expected.",
        base.TestContext.FullyQualifiedTestClassName, base.TestContext.TestName,
        exception.GetType().FullName, ExpectedException.FullName);
      throw new Exception(msg);
    }
    ExceptionBase ex = exception as ExceptionBase;
    if (ex == null)
      throw new Exception(string.Format(
        "Exception type {0} does not derive from ExceptionBase.",
        exception.GetType().FullName));
    if (ex.ErrorCode != ExpectedErrorCode)
    {
      string msg = string.Format("Test method {0}.{1} threw expected "
          + "exception {2} with error code {4} but error code {5} was expected.",
        base.TestContext.FullyQualifiedTestClassName, base.TestContext.TestName,
        exception.GetType().FullName, ExpectedException.FullName,
        ex.ErrorCode, ExpectedErrorCode);
      throw new Exception(msg);
    }
  }
}

This can be used like this:

[TestClass]
public class Tests
{
  [TestMethod]
  [ExpectedExceptionWithErrorCode(typeof(MessageException), 
    ExpectedErrorCode = ErrorCodes.INVALID_HEADER)]
  public void Test1()
  {
    // code which is expected to throw the exception
    // ...
  }
}

More on Collection Initializers

December 3, 2011 Written by Charles Cook

While looking at collection initializers I came across mucking about with hashes... and more mucking about with hashes by Alex Henderson (via Phil Haack). In these posts he describes ways of creating an instance of a string-keyed Dictionary in a Ruby-like way using lambda expressions, for example:

Dictionary<string, string> items = Hash(Age => "10", Height => "20");

The second post contains the fastest implementation (contributed by Andrey Shchekin):

public Dictionary<string, T> Hash<T>(params Func<string, T>[] args)
where T : class  
{
    var items = new Dictionary<string, T>();
    foreach (var func in args)
    {
        var item = func(null);
        items.Add(func.Method.GetParameters()[0].Name, item);
    }
    return items;
}

[Comment: I don't see why the class constraint is necessary]

This led me to think that I could implement an Add extension method which would allow a collection initializer to be used, for example like this:

var items = new Dictionary<string, string> {  Name => "alex",  Age => "10" };

I came up with this:

static class Extensions
{
  public static void Add<T>(this Dictionary<string, T> dict, Func<string, T> args)
  {
    var item = args(null);
    dict.Add(args.Method.GetParameters()[0].Name, item);
  }
}

Unfortunately, although this works fine when Add is invoked explicitly, it fails to compile when attempting to use it in a collection initializer. The compiler only recognizes member functions called Add, not extension methods. C# PM Alex Turner wrote on Connect:

You're right that the spec is ambiguous here! There was no explicit decision in C# 3.0 to prevent this from working; it was simply an implementation limitation we accepted when we discovered it late in the product cycle. We see no design reason not to add this now, but unfortunately, we are starting to lock down on our feature set and won't be able to get to this feature for this release. We'll definitely keep this in mind when planning for our next release!

...I've added a vote for collection initializers binding to Add extension methods to the OneNote notebook we use internally to track C# compiler and language requests. We can't promise if or when we'll get to this feature, but we'll definitely keep it in mind during planning for the next release!

I tried this with the .NET 4.5 Developer Preview and it still fails to build, but of course this may change before the final release.
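In the meantime the extension method can still be invoked explicitly, which compiles fine; it's only the collection initializer syntax that the compiler rejects:

var items = new Dictionary<string, string>();
items.Add(Name => "alex"); // binds to the Add extension method above
items.Add(Age => "10");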

On the other hand, VB does support the use of extension methods called Add in collection initializers but its syntax for lambda expressions rather misses the point of the exercise:

Imports System.Runtime.CompilerServices

Module Sample
    Sub Main()
        Dim dict As New Dictionary(Of String, Integer) _
            From { Function(Age As String) 10, Function(Height As String) 20 }
    End Sub

    <Extension()>
    Sub Add(Of T)(ByVal dict As Dictionary(Of String, T), 
                  ByVal args As Func(Of String, T))
        Dim item As T
        item = args(Nothing)
        dict.Add(args.Method.GetParameters()(0).Name, item)
    End Sub
End Module

infoof and Collection Initializers

November 27, 2011 Written by Charles Cook

While thinking about which features I would like added to C# I came across an old post by Eric Lippert — In Foof We Trust: A Dialogue — in which he presents an imaginary dialog between himself and a C# user who would like an operator or operators similar to typeof() but which would take the name of a method, field, or property instead of a type name. I mentioned something similar in old posts here and here. This feature would certainly reduce the need to hard-code method/property names or use lambda expressions, for example in implementations of INotifyPropertyChanged.PropertyChanged, and would help refactoring tools. It was interesting to read:

I agree, that would be a lovely sugar. This idea has been coming up during the C# design meeting for almost a decade now. We call the proposed operator “infoof”, since it would return arbitrary metadata info. We whimsically insist that it be pronounced “in-foof”, not “info-of”.

He goes on to describe that there are design issues which make this much more costly to implement and test than it appears at first sight and that there are always budgetary constraints on which new features his team can deliver. He sums up:

It’s an awesome feature that pretty much everyone involved in the design process wishes we could do, but there are good practical reasons why we choose not to. If there comes a day when designing it and implementing it is the best way we could spend our limited budget, we’ll do it. Until then, use Reflection.

Looks like that one will remain on the wishlist but it turns out that one other feature I would like has already been implemented for some time now: collection initializers for dictionaries. For example, you can write this:

Dictionary<int, string> hashtable = new Dictionary<int, string>
      { { 1, "one" }, {2, "two" }, {3, "three" } };

This is a particular case of using a collection initializer, which turns out to be a fairly complicated, not to say contrived, language feature. If your type implements IEnumerable — presumably a sanity check that it represents some sort of collection — and has one or more methods called Add(), the compiler will map the collection initializer expression onto calls to the Add() method(s). Mads Torgersen wrote about this in 2006: What is a collection?
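For example, any type meeting those two requirements gets the initializer syntax for free. A minimal sketch (the type and its Add overloads are made up for illustration):

using System;
using System.Collections;
using System.Collections.Generic;

public class NameList : IEnumerable<string>
{
    readonly List<string> names = new List<string>();

    // the compiler maps each initializer element onto one of these
    public void Add(string name) { names.Add(name); }
    public void Add(string first, string last) { names.Add(first + " " + last); }

    public IEnumerator<string> GetEnumerator() { return names.GetEnumerator(); }
    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}

class Demo
{
    static void Main()
    {
        // "Ada" calls Add(string); { "Charles", "Babbage" } calls Add(string, string)
        var names = new NameList { "Ada", { "Charles", "Babbage" } };
        foreach (var name in names)
            Console.WriteLine(name);
    }
}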


Product Packaging Design Exercise

November 27, 2011 Written by Charles Cook

I just came across an interesting exercise in analysing the design of product packaging. Design consultancy Antrepo took several well-known brands and progressively stripped away the packaging detail on each of them. Along with several of the people commenting on the post I mostly prefer the third versions of each product. These are not as busy as the first and second but still retain something of the brand identity. They might be too simple in the long run though — perhaps the richness of the original versions is required to retain interest when you are exposed to the packaging over and over again. Or maybe I'm simply atypical of the target market.


ArraySegment and SubArray

November 24, 2011 Written by Charles Cook

ArraySegment

I was looking at an algorithm which involves processing segments of an array recursively and I thought the code would be neater if, instead of passing the array plus the offset and length of each segment, I could use an array data type which provides a view onto a segment of an array. The ArraySegment type appeared in search results and its name sounded promising. I'd not heard of it before but it's been around since .NET 2.0 and is used in various types including Socket and LogRecordSequence. Unfortunately it is essentially just a way of specifying a segment and doesn't provide any methods to access the data within the segment:

public struct ArraySegment<T>
{
  public ArraySegment(T[] array);
  public ArraySegment(T[] array, int offset, int count);
  public static bool operator !=(ArraySegment<T> a, ArraySegment<T> b);
  public static bool operator ==(ArraySegment<T> a, ArraySegment<T> b);
  public T[] Array { get; }
  public int Count { get; }
  public int Offset { get; }
  public bool Equals(ArraySegment<T> obj);
  public override bool Equals(object obj);
  public override int GetHashCode();
}

It also doesn't have a constructor which takes an ArraySegment, which makes it awkward to use for recursive algorithms, but it's a ready-made type if you need to specify segments of an array without making copies of the array, as in this example.

SubArray

At first I thought it would be nice if SubArray were an array type but I quickly realized you cannot derive from System.Array. According to MSDN:

The Array class is the base class for language implementations that support arrays. However, only the system and compilers can derive explicitly from the Array class. Users should employ the array constructs provided by the language.

So it's not possible to create an array type which represents a sub-array. The solution was to implement a type which implements IList<T>. This provides methods for indexing and enumerating the sub-array. The Java List interface has a method called subList which provides some prior art for this:

Returns a view of the portion of this list between the specified fromIndex, inclusive, and toIndex, exclusive. (If fromIndex and toIndex are equal, the returned list is empty.) The returned list is backed by this list, so non-structural changes in the returned list are reflected in this list, and vice-versa. The returned list supports all of the optional list operations supported by this list...

...The semantics of the list returned by this method become undefined if the backing list (i.e., this list) is structurally modified in any way other than via the returned list. (Structural modifications are those that change the size of this list, or otherwise perturb it in such a fashion that iterations in progress may yield incorrect results.)

In the case of the new SubArray type we'll assume the underlying collection is an array, i.e. of fixed length, so the methods of IList<T> which change the length of the list will throw NotSupportedException. We will also allow an instance of a SubArray to be created as an offset into an existing SubArray (with the Offset property of the "sub-SubArray" referring to the original array and not to its parent SubArray). Finally, as with ArraySegment, the SubArray type will be a struct. I threw in some extension methods to make it easier to use and came up with this: SubArray.cs
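The linked file has the full implementation; below is a condensed sketch of its shape, not the actual code, showing just the core members plus the extension methods the example in the next section relies on. The real type implements IList<T> in full, with the length-changing members throwing NotSupportedException.

using System;
using System.Collections;
using System.Collections.Generic;

public struct SubArray<T> : IEnumerable<T>
{
    public T[] Array { get; private set; }
    public int Offset { get; private set; }
    public int Count { get; private set; }

    public SubArray(T[] array, int offset, int count) : this()
    {
        if (array == null)
            throw new ArgumentNullException("array");
        if (offset < 0 || count < 0 || offset + count > array.Length)
            throw new ArgumentOutOfRangeException();
        Array = array;
        Offset = offset;
        Count = count;
    }

    public T this[int index]
    {
        get { return Array[Offset + index]; }
        set { Array[Offset + index] = value; }
    }

    public IEnumerator<T> GetEnumerator()
    {
        for (int i = 0; i < Count; i++)
            yield return Array[Offset + i];
    }

    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}

public static class SubArrayExtensions
{
    public static SubArray<T> SubArray<T>(this T[] array)
    {
        return new SubArray<T>(array, 0, array.Length);
    }

    // the Offset of the new SubArray refers to the original array,
    // not to the parent SubArray
    public static SubArray<T> SubArray<T>(this SubArray<T> sub, int offset, int count)
    {
        return new SubArray<T>(sub.Array, sub.Offset + offset, count);
    }
}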

Example Usage

Reconstruct a binary tree from arrays containing the inorder and preorder traversal of the tree (assuming no duplicates).

public void Sample()
{
  int[] preorder = { 7, 10, 4, 3, 1, 2, 8, 11 };
  int[] inorder = { 4, 10, 3, 1, 7, 11, 8, 2 };
  Node head = ReconstructTree(preorder, inorder);
}

Node ReconstructTree(int[] preorder, int[] inorder)
{
  // use a hashtable to retrieve position of sub-root node from inorder array 
  var inorderMapTable = new Dictionary<int, int>(preorder.Length);
  for (int i = 0; i < inorder.Length; i++)
    inorderMapTable[inorder[i]] = i;
  return ReconstructTreeHelper(preorder.SubArray(), inorder.SubArray(), inorderMapTable);
}

Node ReconstructTreeHelper(SubArray<int> preorder, SubArray<int> inorder,
  Dictionary<int, int> inorderMapTable)
{
  int rootVal = preorder[0];
  int rootPos = inorderMapTable[rootVal] - inorder.Offset;
  Node node = new Node { Id = rootVal };
  int nodesToLeft = rootPos;
  if (nodesToLeft > 0)
    node.Left = ReconstructTreeHelper(preorder.SubArray(1, nodesToLeft),
      inorder.SubArray(0, nodesToLeft), inorderMapTable);
  int nodesToRight = inorder.Count - rootPos - 1;
  if (nodesToRight > 0)
    node.Right = ReconstructTreeHelper(preorder.SubArray(rootPos + 1, nodesToRight),
      inorder.SubArray(rootPos + 1, nodesToRight), inorderMapTable);
  return node;
}

The expected result is:

        _______7______
       /              \
    __10__          ___2
   /      \        /
   4       3      _8
            \    /
             1  11

Postscript

It turns out that ArraySegment has richer functionality in .NET Framework 4.5:

[SerializableAttribute]
public struct ArraySegment<T> : IList<T>,   ICollection<T>, IReadOnlyList<T>, 
    IEnumerable<T>, IEnumerable

It still doesn't provide a way of creating an ArraySegment as a sub-array of an existing ArraySegment but it would be trivial to supply an extension method to add this functionality.
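For example, something like this (a sketch; Slice is a made-up name for the hypothetical extension method):

using System;

static class ArraySegmentExtensions
{
    // create a segment relative to an existing segment; the resulting
    // Offset refers to the underlying array, not to the parent segment
    public static ArraySegment<T> Slice<T>(this ArraySegment<T> segment,
        int offset, int count)
    {
        if (offset < 0 || count < 0 || offset + count > segment.Count)
            throw new ArgumentOutOfRangeException();
        return new ArraySegment<T>(segment.Array, segment.Offset + offset, count);
    }
}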


F# Type Inference

October 25, 2011 Written by Charles Cook

F# type inference might just be syntax sugar, as Brian McNamara puts it in his post Overview of type inference in F#, but its sweetness is very appealing. I thought this when I saw slide 12 in Tomas Petricek's and Phil Trelford's presentation Turning to the Functional Side With F#. Two C# functions:

public static IEnumerable<R> Map<T, R>(this IEnumerable<T> xs, Func<T, R> f)
{
    foreach (var x in xs)
        yield return f(x);
}

public static R Reduce<T, R>(this IEnumerable<T> xs, R init, Func<R, T, R> f)
{
    var current = init;
    foreach (var x in xs)
        current = f(current, x);
    return current;
}

and their F# equivalents:

let map f xs = seq {
    for x in xs do 
        yield f x 
    }

let reduce f init items =
    let mutable current = init
    for item in items do
        current <- f current item
    current

Brian's post is a good overview of how F# type inference works. He illustrates how type inference in F# gives the language an edge over C#:

Though type inference is “just” syntax sugar, it really can matter; there are cases where you’d never write cool functional programming code in C# because you get completely swamped under by the type annotations. As an example see here; one of the C# functions in that example is this monstrosity:

static Tree<KeyValuePair<A, bool>> DiffTree<A>(Tree<A> tree, Tree<A> tree2)
{
    return XFoldTree((A x, Func<Tree<A>, Tree<KeyValuePair<A, bool>>> l, Func<Tree<A>,
      Tree<KeyValuePair<A, bool>>> r,  Tree<A> t) => (Tree<A> t2) =>
        Node(new KeyValuePair<A, bool>(t2.Data, object.ReferenceEquals(t, t2)),
             l(t2.Left), r(t2.Right)),
        x => y => null, tree)(tree2);
}

and yes, every single one of those type annotations is required by C# 4.0. Here’s the corresponding F# code that uses the same identifier names:

let DiffTree(tree,tree2) =
    XFoldTree (fun x l r t t2 ->
        let (Node(x2,l2,r2)) = t2
        Node((x2,t===t2), l l2, r r2)) (fun _ _ -> Leaf) tree tree2

I've had discussions relating to the use of var in C# where people say that you need the type annotations to be able to understand code when reading it, but I'm not convinced. With type inference it seems like you're seeing the code at a higher level, making it easier to understand without so much clutter from implementation details. But I haven't worked on a large-scale project using a language with type inference so I can't really say for sure one way or the other.


FlexDump

September 23, 2011 Written by Charles Cook

As I sort out things for the move to the US I'm discovering various artefacts from my career as a software developer. These have been stored unseen in boxes in the loft for many years so it is quite a nostalgia trip to see them again. For example, I came across a printed README file for FlexDump, a hexadecimal record viewer/editor I wrote as my first non-trivial Windows application:

FlexDump is a programmer's tool for working on files containing fixed length records. Unlike a typical hex editor where all offsets are from the beginning of the file, FlexDump displays a single record at a time with offsets calculated from the beginning of each record. Thus no more does the programmer have to keep converting field offsets and record number into an offset from the start of the file.

This was a good example of writing some software to scratch a personal itch[1] — at work at the time we were using C-ISAM, which stores its data in fixed length records, and I needed a way of checking the raw data. At some point we moved to an open-source implementation of C-ISAM called D-ISAM[2] and I spent some time hacking on it to clear up some of its bugs, a useful experience in learning the value of open source software. Implementing FlexDump was also a good way of learning an up-and-coming technology — this was the early 1990s when Windows 3 was just beginning to take off.

The README contains a screen snapshot (though I now notice that the file being viewed didn't contain fixed length records, so it wasn't a good example; I should have spotted that at the time).

I developed FlexDump on a 386SX[3] machine which cost me around £2000, not much performance for a lot of money in those days. The README mentions the performance:

FlexDump is written in C++ using Borland Turbo C++ version 3.0. However, it does not use Object Windows and could be ported to Microsoft C++ 7.0 without any problems. Turbo C++ has been used because of the speed of use of its development environment, necessary because of the rather slow hardware used — a 16 MHz 386SX with 2M memory.

It's difficult to imagine using a machine with only 2Mbyte of memory. I think it also only had a 20Mbyte hard disk.

The README also mentions that FlexDump was going to be released as shareware, but I demonstrated it in an interview at Uniplex, got a job working on the onGo Office project, and so moved on to better things as a Windows developer.


[1] As in the first of Eric Raymond's guidelines for creating good open source software, described in his essay The Cathedral and the Bazaar:

Every good work of software starts by scratching a developer's personal itch.

[2] D-ISAM seems to have survived in this product, via at least one rewrite.

[3] The 386SX had a 32-bit internal architecture but used a 16-bit data bus to reduce the cost of the circuit board.


Anders Hejlsberg Session on C#/VB Future Directions

September 18, 2011 Written by Charles Cook

I'm working my way through some of the more interesting Build conference sessions and this morning I watched the Future directions for C# and Visual Basic talk by Anders Hejlsberg. He's a very good presenter and the demos all worked unlike those in some of the other sessions I've watched. The section on the Roslyn "compiler as a service" project had some very cool demos involving smart refactoring, writing C# code via an interactive window, complete with intellisense, and pasting VB code into a C# file and seeing the code automatically transformed to C#. Well worth watching as a glimpse of the possibilities that will be opened up for IDE tools in the future.

The section on asynchronous programming was a recap of previous talks over the last year but it inspired me to write my first Metro app to investigate how the new async support in C# 5 will make it easier to chain multiple asynchronous calls. The scenario I came across in a recent Silverlight project was having to display a busy indicator (now provided by the Windows Runtime ProgressRing class) while asynchronous calls are being made, and to ensure the busy indicator was always switched off when the calls finished, regardless of whether one of them failed. The pattern used to implement this resulted in code with something like the following (simplified) structure:

private void Button_Click(object sender, RoutedEventArgs e)
{
    BusyIndicator.IsActive = true;
    try
    {
        RemoteCall1((result1) =>
        {
            try
            {
                // ... process result of first call
                RemoteCall2((result2) =>
                {
                    try
                    {
                        // ... process result of second call
                    }
                    catch (Exception ex)
                    {
                        StatusText.Text = ex.Message;
                    }
                    finally
                    {
                        BusyIndicator.IsActive = false;
                    }
                });
            }
            catch (Exception ex)
            {
                StatusText.Text = ex.Message;
                BusyIndicator.IsActive = false;
            }
        });
    }
    catch (Exception ex)
    {
        StatusText.Text = ex.Message;
        BusyIndicator.IsActive = false;
    }
}

The code is ugly and error-prone: a bug which occurred more than once was omitting to switch off the busy indicator in the event of an exception being thrown. The async support in C# 5 transforms this into something much simpler (assuming C# 5-style async versions of the remote calls are available):

async void Button_Click(object sender, RoutedEventArgs e)
{
      try
      {
            BusyIndicator.IsActive = true;
            var result1 = await RemoteCall1Async();
            // ... process result of first call
            var result2 = await RemoteCall2Async();
            // ... process result of second call
            StatusText.Text = "completed successfully";
      }
      catch (Exception ex)
      {
            StatusText.Text = ex.Message;
      }
      finally
      {
            this.BusyIndicator.IsActive = false;
      }
}

Enumerable.SequenceEqual

September 11, 2011 Written by Charles Cook

Enumerable.SequenceEqual() is another useful Linq extension method when working with arrays, for example arrays of type byte[]:

byte[] passwordHash = GetPasswordHash(password, salt);
byte[] streamPasswordHash = stream.ReadBytes(32);
if (!passwordHash.SequenceEqual(streamPasswordHash))
  throw new Exception("Wrong password");

Though it's not going to be particularly efficient if the implementation isn't optimized for the array case, as with the Mono implementation:

public static bool SequenceEqual<TSource> (this IEnumerable<TSource> first, 
  IEnumerable<TSource> second, IEqualityComparer<TSource> comparer)
{
    Check.FirstAndSecond (first, second);

    if (comparer == null)
        comparer = EqualityComparer<TSource>.Default;

    using (IEnumerator<TSource> first_enumerator = first.GetEnumerator (),
        second_enumerator = second.GetEnumerator ()) {

        while (first_enumerator.MoveNext ()) {
            if (!second_enumerator.MoveNext ())
                return false;

            if (!comparer.Equals (first_enumerator.Current, second_enumerator.Current))
                return false;
        }

        return !second_enumerator.MoveNext ();
    }
}

So if performance is important you'll have to write your own code.
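For the byte[] case that might be as simple as a length check plus an indexed loop, which avoids the enumerator overhead entirely (a minimal sketch):

static bool BytesEqual(byte[] first, byte[] second)
{
    if (ReferenceEquals(first, second))
        return true;
    if (first == null || second == null || first.Length != second.Length)
        return false;
    // lengths match, so a single indexed pass is enough
    for (int i = 0; i < first.Length; i++)
        if (first[i] != second[i])
            return false;
    return true;
}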


The Activity of Blogging

September 1, 2011 Written by Charles Cook

I noticed a couple of posts recently which discussed the activity of blogging — by Gabriel Weinberg on his blog, and Rick on Flip Chart Fairy Tales. They both mention a key benefit of blogging — that it forces you to understand what you're writing about.

Gabriel:

Blogging forces you to write down your arguments and assumptions. This is the single biggest reason to do it, and I think it alone makes it worth it.

You have a lot of opinions. I'm sure some of them you hold strongly. Pick one and write it up in a post -- I'm sure your opinion will change somewhat, or at least become more nuanced.

When you move from your head to "paper," a lot of the hand-waveyness goes away and you are left to really defend your position to yourself.

Rick:

Blogging forces you to put some effort into understanding your material and constructing a reasoned argument. Most bloggers, even the ones who irritate the hell out of me, usually have something interesting and thought-provoking to say, some of the time. The fact that we have to put some thought into our posts acts as a brake on our more idiotic tendencies

I find that even just thinking about writing a post engages me more with the subject matter. In fact I think about writing considerably more posts than I actually write. I know… but I still gain a lot from it. I particularly like it when I start researching something and it turns out to be more interesting than I expected. For example, my last post on Arrays and Enumerable.Last() seemed to be about a pretty trivial topic, almost not worth writing about, but when doing the research I discovered an interesting post from 2004 about the design of the .NET Framework which had a direct bearing on what I was writing about.

Also, I find that working in an Agile development environment without a huge amount of design documentation, my technical writing skills get a little rusty, and some blogging every now and then helps to sharpen them up.

Finally, from a practical point of view, I recently discovered something else about blogging — it helps to have a blogging environment with as little friction as possible (witness the burst of posts here recently). I rewrote my blogging engine as an ASP.NET MVC3 application and in the process added two features which make it considerably easier to write posts. First, I added support for Markdown[1], which is so much better than hand-crafting HTML, which I'd been doing ever since I started in 2001; and second, I implemented accurate preview, which makes proof-reading easier and reduces the risk of typos and other mistakes slipping through to publication (the preview can also be easily copied and pasted into an email in its exact intended final format if I want to give it to someone for review).


[1] I'm using MarkdownSharp to which I've added support for the fenced code blocks of GitHub flavored Markdown.


Arrays and Enumerable.Last()

August 28, 2011 Written by Charles Cook

Code which determines the index of the last item in an array like this has always irritated me a little, particularly when doing complex array manipulation:

object lastItem = myArray[myArray.Length - 1];

It's obvious enough but it would be nice to have a cleaner way of getting the index of the last item or the last item itself. The latter has a good solution, using the Linq Enumerable.Last() extension method:

object lastItem = myArray.Last();

I initially thought this could be sub-optimal because the implementation might traverse the whole array but according to Ed Maurer at Microsoft, in a reply on a Connect thread, Last() is optimized for source sequences which implement IList<T>:

Thanks for your investigation of the performance of Enumerable.Last(). I've inspected the implementation, and it has an optimization to deal with cases in which the source sequence implements IList<T> like your array case - cast to IList<T> and use the indexer method. I believe the implementation employs the most practical optimization available to us, and we won't invest further to improve the performance of this method. Thanks again for your comments.

The Mono implementation of Last() applies this optimization (but also illustrates the level of performance hit using Last()):

public static TSource Last<TSource> (this IEnumerable<TSource> source)
{
    Check.Source (source);

    var collection = source as ICollection<TSource>;
    if (collection != null && collection.Count == 0)
        throw new InvalidOperationException ();

    var list = source as IList<TSource>;
    if (list != null)
        return list [list.Count - 1];

    return source.Last (PredicateOf<TSource>.Always, Fallback.Throw);
}

Back in 2004 Brian Grunkemeyer blogged about what makes this possible:

When we were designing our generic collections classes, one of the things that bothered me was how to write a generic algorithm that would work on both arrays and collections. To drive generic programming, of course we must make arrays and generic collections as seamless as possible. It felt that there should be a simple solution to this problem that meant you shouldn't have to write the same code twice, once taking an IList<T> and again taking a T[]. The solution that dawned on me was that arrays needed to implement our generic IList. We made arrays in V1 implement the non-generic IList, which was rather simple due to the lack of strong typing with IList and our base class for all arrays (System.Array). What we needed was to do the same thing in a strongly typed way for IList<T>.

There were some restrictions here though - we didn't want to support multidimensional arrays since IList<T> only provides single dimensional accesses. Also, arrays with non-zero lower bounds are rather strange, and probably wouldn't mesh well with IList<T>, where most people may iterate from 0 to the return from the Count property on that IList. So, instead of making System.Array implement IList<T>, we made T[] implement IList<T>. Here, T[] means a single dimensional array with 0 as its lower bound (often called an SZArray internally, but I think Brad wanted to promote the term "vector" publically at one point in time), and the element type is T. So Int32[] implements IList<Int32>, and String[] implements IList<String>.

He goes on to describe how implementing this was decidedly non-trivial.

Finally, going back to the first issue: to determine the index of the last item in any array, it's possible to use Array.GetUpperBound():

int idx = myArray.GetUpperBound(0);

But this is a bit ugly because it relies on the fact that an array type T[] is a special case of System.Array. Perhaps it's better to use an extension method:

static class Extensions
{
  public static int LastIndex<T>(this T[] array)
  {
    if (array == null) throw new ArgumentNullException("array");
    if (array.Length == 0) throw new InvalidOperationException("zero-length array");
    return array.Length - 1;
  }
}

So we end up with:

int idx = myArray.LastIndex();

It's slower than using Length - 1 but potentially safer because calling it on a zero-length array will result in an InvalidOperationException being thrown. And it's not quite as succinct as Perl's $#array syntax but that's another story.


Private Access Modifier in C#

August 26, 2011 Written by Charles Cook

Earlier today I noticed Miguel de Icaza (@migueldeicaza) was tweeting about the private access modifier in C#, including:

I really should make Mono's C# compiler warn every time someone puts "private" in members of a class. One way of teaching the language.

and

If you can't memorize the trivial c# visibility rules, you can't be trusted to write c# with lambdas, oop, generics, iterators and async

and the knockout blow, the WWJD argument:

Steve would not have approved that redundant atrocity that is 'private' had he designed c#

It started me thinking about the use of private. What does it actually give you? One way of looking at this is to consider what harmful effects there are if you don't use it, and I can't think of any. It's the default, so nobody is going to be able to inadvertently access your private-by-default class members from outside the class. If, unaware of the default, they try to do this, the compiler will complain.
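To be concrete, these two declarations have exactly the same accessibility; the keyword changes nothing:

class Widget
{
    int count;          // private by default
    private int total;  // the same accessibility, stated explicitly
}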

Sean Fao (@senfo) came up with some reasons for using private:

I prefer people use access modifiers because it's more clear and shows the author thought about it.

and Jon Skeet (@jonskeet) tweeted:

You say ugly, I say explicit :) I understand your POV, but I also see the benefits of showing, "I really made this choice."

I disagree about it being clearer, and I don't see why you should have to demonstrate you thought about the choice of making the member private. Having private as the default means you don't have to think about it — private is the safe default, as I mentioned above.

My conclusion is that using private just gives us a warm fuzzy feeling that we're writing better code but in fact doesn't make the slightest difference, and so, applying Occam's Razor, we shouldn't use private.