Saturday, January 30, 2010

Linq2DataSet replaces the DataView

Remember the ADO.NET DataView? It is the old-fashioned way to filter an existing DataSet from the days of ADO.NET 1.0. They are useful for filtering and sorting DataSets that you already have on hand for other reasons.

Let’s say you want a list of all the Northwind employees who live in the United States sorted by LastName:

// If you use DataSets you can probably write your own EmployeeDS & SetupData():
EmployeeDS db = SetupData();

// Filter by Country = “USA”: Sort by LastName
DataView dv = new DataView(
    db.Employees,
    "Country = 'USA'",  // The Filter
    "LastName",  // Sort Order
    DataViewRowState.CurrentRows);

// the DataView consists of a collection of DataRowView, so you have to cast back
// to get to the strongly typed DataRows EmployeeDS.EmployeesRow
foreach (DataRowView dvRow in dv)
{
    EmployeeDS.EmployeesRow item = (EmployeeDS.EmployeesRow)dvRow.Row;
    Console.WriteLine(item.LastName);
}

In the code above, the filter and the sort are in strings, so the compiler can’t find errors; if the Employee table doesn’t have a “Country” field, the above code will compile but throw an EvaluateException at runtime.

In the foreach loop you need to cast the DataRowView into the strongly typed DataRow before you can access the properties. If you really don’t want to do the cast, you could use “dvRow["LastName"]” to get the last name, but that would involve another “magic string”.

Now, let’s get the same list in Linq2DataSet:

// If you use DataSets you can probably write your own EmployeeDS & SetupData():
EmployeeDS db = SetupData();

OrderedEnumerableRowCollection<EmployeeDS.EmployeesRow> linqRows = 
    from e in db.Employees
    where e.Country == "USA"
    orderby e.LastName
    select e;

// Linq2DataSet gives you a collection of strongly typed DataRows.
// You don’t have to cast to get to the LastName element here:
foreach (EmployeeDS.EmployeesRow item in linqRows)
    Console.WriteLine(item.LastName);

In this code, the only thing in a string is “USA”; the compiler will freak out if you don’t have a “Country” field. You don’t have to rely on angry user to find these bugs.

In the foreach loop you don’t have to cast, the output of the Linq query is a collection of strongly typed DataRows (however it isn’t a DataTable, but a DataRow can’t belong to more than one DataTable anyway).

If you really only want LastNames you can get only LastNames:

// If you use DataSets you can probably write your own EmployeeDS & SetupData():
EmployeeDS db = SetupData();

// Get me only LastNames as a collection of the field’s data type (in this case, string):
EnumerableRowCollection<string> LastNames = from e in db.Employees
        where e.Country == "USA"
        orderby e.LastName
        select e.LastName;

// Loop through the strings
foreach (string LastName in LastNames)
    Console.WriteLine(LastName);

Here Linq2DataSet gives us back a collection of strings because in the DataSet LastName is a string; if you ask for HireDate, you will get a collection of DateTimes.

Suppose you want a both LastName and FirstName you could do this (with Anonymous Types):

// If you use DataSets you can probably write your own EmployeeDS & SetupData():
EmployeeDS db = SetupData();

// Here we need to use “var” because the type doesn’t exist until the
// compiler creates it:
var Names = from e in db.Employees
            where e.Country == "USA"
            orderby e.LastName
            select new { e.LastName, e.FirstName };

// Here we are using the same type that the compiler created above:
foreach (var Name in Names)
    Console.WriteLine(string.Format("{0}, {1}",Name.LastName, Name.FirstName));

To make this work, C# created an anonymous type to put the results of the Linq query. Anonymous Types are strong types that are created at compile time; they just don’t have names and there are no class or struct declaration needed in the code.

(On my machine, according to the debugger, Names is a System.Data.EnumerableRowCollection<<>f__AnonymousType0<string,string>>)

More about using Anonymous Types in later posts. . .

Thursday, January 14, 2010

Linq2DataSet Introduction

Lately I’ve been working for a client that recently moved an existing application to VS 2008 / .NET 3.5. The app uses some DataSets, so I’ve been playing to what I call Linq2DataSet (or Linq to DataSet).

Getting Started

To use Linq2DataSet will need to use the .NET 3.5 runtime and, in addition to the usual ADO.NET assemblies you will need to add a reference to System.Data.DataSetExtensions.dll.

Quick comparison to Linq2Sql

// Linq To Sql
int Linq2SqlGetCount()
{
     BigTableSQLDataContext data = new BigTableSQLDataContext();
     return (from x in data.Big_Tables select x).Count();
}
// Linq to DataSet
int Linq2DataSetGetCount()
{
     BigTableDS data = new BigTableDS();
     Big_TableTableAdapter da = new Big_TableTableAdapter();
     da.Fill(data.Big_Table);

     return (from x in data.Big_Table select x).Count();
}

The code isn’t that much different. There is a little more code to setup the DataSet than the SQL DataContext. No big deal.

There is a big gotcha. I ran both functions against a table with 20,000 records. The DataSet is fully loaded before anything can happen. In the code above, the Linq2Sql function runs in < 1 second and the Linq2DataSet version takes about a minute and a half.

In Linq2DataSet, all 20,000 are loaded into the DataSet before the Linq code is executed. In Linq2Sql, the engine executes the Linq and generates the SQL to get the data (probably something like SELECT COUNT(*) FROM Big_Table).

Yes, this looks bad, however if you already have the DataSet hanging around, Linq2DataSet may be a Mega-Cool way to solve your problems. For the record, I would not recommend Linq2DataSet for new “Green Field” projects, however, if you already have the DataSet lying around, Linq can make it easy to get that little bit of extra info you need. Linq2DataSets is also a way to get into Linq quickly then you can learn about Linq2Sql or the Entity Framework.

More to come…