Saturday, September 13, 2008

Linq Query Operators

Many of Linq's query operators are extension methods that take a typed iterator as input and return a modified iterator of the same type as output.

List<Client> clients = new List<Client>();
IEnumerable<Client> it1 = clients as IEnumerable<Client>;
IEnumerable<Client> it2 = it1.Where(c => c.Rating > 5);
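
To make the idea concrete, here is a minimal sketch of how such an operator could be written as an extension method over an iterator. The name MyWhere and the classes SketchExtensions and SketchDemo are hypothetical, chosen so as not to be confused with the real Where operator. This is not the library's actual implementation.

```csharp
using System;
using System.Collections.Generic;

public static class SketchExtensions
{
    // Hypothetical operator (not the real Enumerable.Where):
    // takes a typed iterator and returns an iterator of the same type.
    public static IEnumerable<T> MyWhere<T>(
        this IEnumerable<T> source,
        Func<T, bool> predicate)
    {
        foreach (T item in source)
        {
            // Only items that satisfy the predicate are passed along.
            if (predicate(item))
            {
                yield return item;
            }
        }
    }
}

public static class SketchDemo
{
    public static void Main()
    {
        var ratings = new List<int> { 3, 6, 8 };

        foreach (int rating in ratings.MyWhere(r => r > 5))
        {
            Console.WriteLine(rating); // prints 6, then 8
        }
    }
}
```

Because the method is declared with "this" on its first parameter, it can be called as if it were a member of any IEnumerable<T>.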


This pattern allows multiple query operators to be piped together, each one's output providing the next one's input:

List<Client> clients = new List<Client>();
IEnumerable<Client> it1 = clients as IEnumerable<Client>;
IEnumerable<Client> it2 = it1.Where(c => c.Rating > 5);
IEnumerable<Client> it3 = it2.Where(c => c.Rating < 10);
IEnumerable<Client> it4 = it3.Where(c => c.Rating != 7);
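
The same pipeline is usually written without the intermediate variables, by calling each operator directly on the result of the previous one. The sketch below assumes a hypothetical Client class with only the Rating property used above. ChainingExample and Filter are illustrative names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical Client type; only Rating is taken from the text above.
public class Client
{
    public int Rating { get; set; }
}

public static class ChainingExample
{
    // Each Where call wraps the iterator returned by the previous call.
    public static IEnumerable<Client> Filter(IEnumerable<Client> clients)
    {
        return clients
            .Where(c => c.Rating > 5)
            .Where(c => c.Rating < 10)
            .Where(c => c.Rating != 7);
    }

    public static void Main()
    {
        var clients = new List<Client>
        {
            new Client { Rating = 3 },
            new Client { Rating = 6 },
            new Client { Rating = 7 },
            new Client { Rating = 9 },
            new Client { Rating = 12 }
        };

        foreach (Client client in Filter(clients))
        {
            Console.WriteLine(client.Rating); // prints 6, then 9
        }
    }
}
```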


The fact that Linq's query operators are built from iterators has an important implication: it enables the deferred evaluation of a query. That is because C# iterators only produce their elements when they are asked for. For example, because the loop below breaks after the first item, the following iterator will only perform its first time-consuming operation:

public IEnumerable<int> GetItems()
{
    yield return this.TimeConsumingOperation(1);
    yield return this.TimeConsumingOperation(2);
    yield return this.TimeConsumingOperation(3);
}

foreach (int item in iterator.GetItems())
{
    int firstItem = item;
    break;
}


This means that a query operation will only be evaluated when its items are iterated over, and for only as many items as the iteration asks for.
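
A small experiment can make this visible. The sketch below counts how many times a Where predicate runs at three points: after the query is built, after one item is read, and after the whole sequence is read. DeferredExample and CountPredicateCalls are illustrative names, not part of Linq.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExample
{
    // Returns the number of predicate calls observed after building the query,
    // after reading the first match, and after reading the full sequence.
    public static int[] CountPredicateCalls()
    {
        int calls = 0;
        var numbers = new List<int> { 1, 2, 3, 4, 5 };

        // Building the query does not run the predicate.
        IEnumerable<int> evens = numbers.Where(n => { calls++; return n % 2 == 0; });
        int afterBuild = calls;

        // First() pulls items until one matches: 1 is rejected, 2 is accepted.
        evens.First();
        int afterFirst = calls;

        // A full enumeration restarts the query and tests all five numbers.
        evens.ToList();
        int afterAll = calls;

        return new[] { afterBuild, afterFirst, afterAll };
    }

    public static void Main()
    {
        int[] counts = CountPredicateCalls();
        Console.WriteLine("{0}, {1}, {2}", counts[0], counts[1], counts[2]); // prints 0, 2, 7
    }
}
```

Note that the query is evaluated again each time it is enumerated, which is why the last count adds five calls to the two already made.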

Linq Type Inference

Given the following extension method:

public static IOrderedEnumerable<TSource>
    OrderBy<TSource, TKey>(
        this IEnumerable<TSource> source,
        Func<TSource, TKey> keySelector)


Here is how C# infers the types of the following Linq expression:

List<Customer> customers = new List<Customer>
{
    new Customer { FirstName = "Alice" },
    new Customer { FirstName = "David" },
    new Customer { FirstName = "Bob" },
};

customers.OrderBy(customer => customer.FirstName);


The two generic type parameters of the extension method need to be inferred, namely TSource and TKey. First, TSource is inferred as type Customer from the fact that the extension method is applied to an IEnumerable of Customers. Then, with TSource known, TKey is inferred as type String from the fact that the provided lambda expression returns a String.

We can then express the fully typed extension method that will be used:

public static IOrderedEnumerable<Customer>
    OrderBy<Customer, String>(
        this IEnumerable<Customer> source,
        Func<Customer, String> keySelector)
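
One way to check this reasoning is to supply the type arguments by hand and compare with the inferred call. The sketch below assumes a hypothetical Customer class with only the FirstName property used above. InferenceExample and SortedNames are illustrative names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical Customer type matching the snippet above.
public class Customer
{
    public string FirstName { get; set; }
}

public static class InferenceExample
{
    // Sorts the same customers with either explicit or inferred type arguments.
    public static List<string> SortedNames(bool explicitTypes)
    {
        var customers = new List<Customer>
        {
            new Customer { FirstName = "Alice" },
            new Customer { FirstName = "David" },
            new Customer { FirstName = "Bob" }
        };

        IOrderedEnumerable<Customer> ordered = explicitTypes
            // TSource and TKey written out by hand.
            ? customers.OrderBy<Customer, string>(customer => customer.FirstName)
            // TSource and TKey left for the compiler to infer.
            : customers.OrderBy(customer => customer.FirstName);

        return ordered.Select(customer => customer.FirstName).ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", SortedNames(true).ToArray()));  // prints Alice, Bob, David
        Console.WriteLine(string.Join(", ", SortedNames(false).ToArray())); // prints Alice, Bob, David
    }
}
```

Both calls compile to the same method, so the explicit form is rarely needed in practice.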

Toward Functional Programming

Ever since the first release of C#, functions have been considered first-class citizens of the language. Even then, we could declare a delegate, that is a typed function signature, to hold a reference to any matching method:

public delegate int MyDelegate(int x, int y);

public static int MyMethod(int x, int y)
{
    return x + y;
}

public static void Main(string[] args)
{
    MyDelegate myDelegate = new MyDelegate(MyMethod);
    int result = myDelegate(1, 2);
}


The second release of C# gave greater importance to functions. Methods could now be created inline in code, like any other object. Delegates were no longer limited to pointing at class members, as they could now reference anonymous methods:

public delegate int MyDelegate(int x, int y);

public static void Main(string[] args)
{
    MyDelegate myDelegate = delegate(int x, int y)
    {
        return x + y;
    };
    int result = myDelegate(1, 2);
}


The third release of the language arrived together with the built-in generic Func delegate types, which eliminated the need to declare a delegate for every possible function signature. Furthermore, a new syntax was provided to create methods from code, namely lambda expressions:

public static void Main(string[] args)
{
    Func<int, int, int> myDelegate = (x, y) => x + y;
    int result = myDelegate(1, 2);
}
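
The three styles are interchangeable: a named method, an anonymous method and a lambda expression can all be stored in the same Func variable and passed to another method as an argument. The sketch below illustrates this. FunctionExample, Add and Apply are hypothetical names introduced for the example.

```csharp
using System;

public static class FunctionExample
{
    public static int Add(int x, int y)
    {
        return x + y;
    }

    // Receives the operation to perform as a parameter.
    public static int Apply(Func<int, int, int> operation, int x, int y)
    {
        return operation(x, y);
    }

    public static void Main()
    {
        // C# 1 style: a reference to a named method.
        Func<int, int, int> fromMethod = Add;

        // C# 2 style: an anonymous method.
        Func<int, int, int> fromAnonymous = delegate(int x, int y) { return x + y; };

        // C# 3 style: a lambda expression.
        Func<int, int, int> fromLambda = (x, y) => x + y;

        Console.WriteLine(Apply(fromMethod, 1, 2));    // prints 3
        Console.WriteLine(Apply(fromAnonymous, 1, 2)); // prints 3
        Console.WriteLine(Apply(fromLambda, 1, 2));    // prints 3
    }
}
```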