TIL Tuesday: Lambda Expressions in Query Operators, Extension Methods, and the 'params' Keyword

While working through a tutorial on Entity Framework, I came across a block of code that uses a couple of features of the C# language that new programmers might find confusing.

context.Contacts.AddOrUpdate(p => p.Name,
    new Contact
    {
        Name = "Debra Garcia",
        Address = "1234 Main St",
        City = "Redmond",
        State = "WA",
        Zip = "10999",
        Email = "debra@example.com",
    },
    new Contact
    {
        Name = "Thorsten Weinrich",
        Address = "5678 1st Ave W",
        City = "Redmond",
        State = "WA",
        Zip = "10999",
        Email = "thorsten@example.com",
    },
    new Contact
    {
        Name = "Yuhong Li",
        Address = "9012 State st",
        City = "Redmond",
        State = "WA",
        Zip = "10999",
        Email = "yuhong@example.com",
    },
    new Contact
    {
        Name = "Jon Orton",
        Address = "3456 Maple St",
        City = "Redmond",
        State = "WA",
        Zip = "10999",
        Email = "jon@example.com",
    },
    new Contact
    {
        Name = "Diliana Alexieva-Bosseva",
        Address = "7890 2nd Ave E",
        City = "Redmond",
        State = "WA",
        Zip = "10999",
        Email = "diliana@example.com",
    }
    );

This statement is from the Seed method of a migration configuration file (ContactManager.Migrations) in an MVC project. The code is taken from the tutorial listed last under Additional Reading.

In this snippet, context.Contacts is an IDbSet<Contact>, which is a collection of Contact objects. We call AddOrUpdate() on this collection to populate it with seed data for the tutorial.

Despite the introductory nature of the tutorial, a neophyte programmer may be initially confused by the call, which is nearly 50 lines long but consists of a single statement. New programmers might ask themselves:

  • What is the purpose of the p => p.Name argument?
  • How can there be so many (or an arbitrary quantity of) arguments?

Searching for answers, one might check on the definition of the AddOrUpdate() function:

public static void AddOrUpdate<TEntity>(
    this IDbSet<TEntity> set,
    Expression<Func<TEntity, object>> identifierExpression,
    params TEntity[] entities
)
where TEntity : class

There's a lot going on here. Reasonable questions might include:

  • There are three parameters in the definition, but our call appears to skip the first one. Why?
  • What's the purpose of the this keyword in the first parameter?
  • What does the params keyword do?

What is the purpose of the p => p.Name argument?

The p => p.Name argument is a lambda expression. In this context it's safe to interpret the lambda as a very concise function that takes p as a parameter and returns p.Name.
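To make the "very concise function" reading concrete, here is a minimal sketch that compares the lambda with an equivalent named method. The one-property Contact class and the LambdaDemo and GetName names are stand-ins invented for this example, not code from the tutorial. The last few lines also show the lambda assigned to an Expression<Func<Contact, object>>, the parameter type that AddOrUpdate() declares; in that form the lambda is stored as data describing the code, so the sketch compiles it before invoking it.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical stand-in for the tutorial's Contact class (one property only).
public class Contact
{
    public string Name { get; set; }
}

public static class LambdaDemo
{
    // A named method that does the same job as the lambda p => p.Name.
    public static object GetName(Contact p)
    {
        return p.Name;
    }

    public static void Main()
    {
        var debra = new Contact { Name = "Debra Garcia" };

        // The lambda assigned to a delegate: callable like any function.
        Func<Contact, object> byName = p => p.Name;
        Console.WriteLine(byName(debra));   // Debra Garcia
        Console.WriteLine(GetName(debra));  // Debra Garcia

        // The same lambda as an expression tree. It describes the code
        // rather than being the code, so it is compiled before it is called.
        Expression<Func<Contact, object>> byNameExpr = p => p.Name;
        Func<Contact, object> compiled = byNameExpr.Compile();
        Console.WriteLine(compiled(debra)); // Debra Garcia
    }
}
```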

AddOrUpdate() uses that lambda to identify which property to compare when it decides whether each of the Contact arguments that follow already exists.

Although the lambda is the first argument in our call, it maps to the second parameter of the method definition. We'll explore that anomaly next.

Why doesn't our call include the first parameter of the method definition? Is it related to the this keyword in the definition?

It is indeed related to the this keyword in the method definition. The AddOrUpdate() method is an extension method that appears, from our client code, to be a member of the IDbSet<Contact> collection named context.Contacts.

It is not a member of the collection, though: it is a static method, defined elsewhere, that extends the IDbSet<TEntity> type with AddOrUpdate() without altering the source of that type.

When an extension method is defined, its first parameter must be marked with the this keyword, which signifies that the method extends the type named after this. The method itself must be a static method in a static class.

That is why the first parameter in the definition is this IDbSet<TEntity> set, and why we never pass it explicitly when we call the method: the compiler supplies context.Contacts, the object being extended, as that first argument.
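A small sketch may help here. It defines an extension method on List<T> and calls it both ways. AddTwice, ListExtensions, and ExtensionDemo are hypothetical names made up for this example; nothing here comes from Entity Framework.

```csharp
using System;
using System.Collections.Generic;

public static class ListExtensions
{
    // 'this' on the first parameter makes AddTwice an extension method
    // for List<T>. AddTwice is a hypothetical name used for illustration.
    public static void AddTwice<T>(this List<T> list, T item)
    {
        list.Add(item);
        list.Add(item);
    }
}

public static class ExtensionDemo
{
    public static void Main()
    {
        var names = new List<string>();

        // Extension syntax: 'names' is supplied as the 'list' parameter.
        names.AddTwice("Debra");

        // The equivalent ordinary static call, with the first parameter
        // passed explicitly.
        ListExtensions.AddTwice(names, "Jon");

        Console.WriteLine(string.Join(", ", names)); // Debra, Debra, Jon, Jon
    }
}
```

Both calls run the same static method; the extension syntax only changes how the first argument is written.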

How can there be so many (or an arbitrary quantity of) arguments?

The first parameter of the AddOrUpdate() method definition is the object that we're extending with the method. In practice, we don't pass it as an argument; it is the object on which we call the method.

The next parameter is the lambda expression that represents the key that the AddOrUpdate() method will match against the data provided in the remaining parameter. But in our code example above, there isn't just a third and final argument. There are five of them!

This is because the method uses the params keyword for the final parameter in the method definition:

params TEntity[] entities

The params keyword means that the third parameter accepts either an array of TEntity objects or a comma-separated list of objects of the TEntity type. TEntity is generic; in our case the entities are Contact objects.

In our example code, the method has been called with a comma-separated list of new Contact objects, instantiated in situ.
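The two calling styles can be seen side by side in a short sketch. CountNames and ParamsDemo are hypothetical names invented for this illustration.

```csharp
using System;

public static class ParamsDemo
{
    // Hypothetical helper: 'params' lets callers pass zero or more strings
    // after the label, and the method receives them as a single array.
    public static int CountNames(string label, params string[] names)
    {
        Console.WriteLine(label + ": " + names.Length);
        return names.Length;
    }

    public static void Main()
    {
        // Comma-separated list, as in the AddOrUpdate() call above.
        CountNames("list", "Debra", "Thorsten", "Yuhong"); // list: 3

        // An explicit array is accepted too.
        CountNames("array", new[] { "Jon", "Diliana" });   // array: 2

        // Omitting the arguments yields an empty array.
        CountNames("none");                                // none: 0
    }
}
```

A params parameter has to be the last parameter in the method's signature, which is why entities comes last in AddOrUpdate().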

What does it do, actually?

AddOrUpdate() populates the Contacts member of the application's database context with the seed data. For each Contact passed in, it looks for an existing entry whose Name matches (the property selected by p => p.Name). If a match exists, the existing entry is updated with the values from the new object; if not, the new object is added to the collection.
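The add-or-update behavior can be imitated in memory to see all three features working together. This is only a rough sketch and not how Entity Framework implements AddOrUpdate(): it works on a plain List<T>, takes a Func rather than an Expression, and replaces the matching object instead of updating a database row. AddOrReplace, UpsertSketch, and the two-property Contact class are hypothetical names for this example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the tutorial's Contact class.
public class Contact
{
    public string Name { get; set; }
    public string City { get; set; }
}

public static class UpsertSketch
{
    // In-memory analogue only; NOT the Entity Framework implementation.
    // 'this' makes it an extension method, 'key' is the lambda that picks
    // the property to match on, and 'params' accepts any number of entities.
    public static void AddOrReplace<T>(
        this List<T> set, Func<T, object> key, params T[] entities)
    {
        foreach (var entity in entities)
        {
            int index = set.FindIndex(
                existing => object.Equals(key(existing), key(entity)));

            if (index >= 0)
                set[index] = entity; // key matched: replace the old entry
            else
                set.Add(entity);     // no match: add a new entry
        }
    }

    public static void Main()
    {
        var contacts = new List<Contact>
        {
            new Contact { Name = "Debra Garcia", City = "Seattle" }
        };

        contacts.AddOrReplace(p => p.Name,
            new Contact { Name = "Debra Garcia", City = "Redmond" },
            new Contact { Name = "Jon Orton", City = "Redmond" });

        foreach (var c in contacts)
            Console.WriteLine(c.Name + " - " + c.City);
        // Debra Garcia - Redmond
        // Jon Orton - Redmond
    }
}
```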

Additional Reading

Lambda Expressions (C# Programming Guide)

params (C# Reference)

DbSetMigrationsExtensions.AddOrUpdate Method (IDbSet<TEntity>, Expression<Func<TEntity, Object>>, TEntity[])

Extension Methods (C# Programming Guide)

Enable Migrations, create the database, add sample data and a data initializer