LINQ(-to-SQL), DateTimeOffset, and Time Zones

Filed in Software Developement Leave a comment

Today I was requested to produce a report on some timestamped data, but with the timestamps in the time zone where the data was collected.  Now long ago I learned the valuable lesson of using DateTimeOffset to store data and there is no change with this scenario, but something I had never actually tried before was getting LINQ-to-SQL (LINQPad) to emit timestamp data in a time zone other than my current system time zone.  This is really more of a general .NET solution to the same problem as i don’t believe LINQ-to-SQL actually converts any timestamp data on the data side (nor should it, we only want the DateTimeOffset values).

The crucial type in this case is TimeZoneInfo. First you need to figure out which time zone you want as your target (destination) time zone.  Use TimeZoneInfo.GetSystemTimeZones() to figure that one out, and make note of the Id property.  Now you need to create the proper TimeZoneInfo instance, you could use the results of that list, but more simply you can create the instance from the Id property using TimeZoneInfo.FindSystemTimeZoneById("Id") or for a real world example: TimeZoneInfo.FindSystemTimeZoneById("Newfoundland Standard Time"). The last part is to use TimeZoneInfo.ConvertTime(DateTimeOffset, TimeZoneInfo) to offset your DateTimeOffset to the proper amount.

Here is a full example:

TimeStampedData.Select(x => new { x.Name, Timestamp = TimeZoneInfo.ConvertTime(x.Timestamp, TimeZoneInfo.FindSystemTimeZoneById("Newfoundland Standard Time")) })

Nice and simple, but a little bit of a pain to actually get the right time zone.  Ideally, .NET would have static accessors to all the common standard time zones (as TimeZoneInfo static instances), but otherwise this is a reasonably easy solution to achieve.

, , , ,

Simplifying Complicated SQL queries in LINQ

Filed in Software Developement Leave a comment

I have been working pretty heavily in LINQ the past few days and ran into a few problems when dealing with some more complicated queries. While LINQ is fairly simple and intuitive when working with objects in memory, the LINQ-SQL bridge (or SQLite in my case) is a different machine. The reason it is different is that your queries must be translated into the appropriate SQL, and this SQL can sometimes turn into brutally inefficient queries.

In my case i was working on a query that inner joined three tables together, then grouped by one of the columns in the joined tables. Now LINQ can do this no problem with Join/Join/GroupBy. The difficulty lies in how you project your grouped columns. By default LINQ will just include the whole collection of grouped columns per group row. While this works great in memory, the SQL bridge butchers your call. The reason is that SQL (SQLite at least) can’t do what you asked so the translation is to group by, then do a select all for each group row. As you can imagine, the computational power required grows exponentially with the data size. This is quite clearly very bad.

When working in pure SQL, we can only include columns that are in the group key and columns with an aggregation function applied to them. The aggregation functions are not super obvious on how to use them, so this is what i will be simplifying in this post. Usually what you want is the Sum, Count, First Row, Last Row, etc… from your grouped rows. The main problem is that you can’t just call the LINQ aggregate functions normally, because it doesn’t make sense in the context of your group by projection. I’ll start off with some sample code and move on from there explaining as i go.

First we join the tables together, and project the join into an anonymous type with both values
var q1 = Table1.Join(Table2, x => x.t2key, x => x.key, (t1, t2) => new { t1, t2 })

Now we can group by (say by t1) so that we end up with t1, and an enumerable of t2′s.
var q2 = q1.GroupBy(x => x.t1.key, x => x.t2, (k, x) => new { k.Name, Timestamp = x.Max(y => y.Timestamp), Value = x.Max(y => y.Value) })
so the first lambda tells us what to group on (our parent key), the second lambda describes how to project the grouped data (we will just project the whole t2 table), and the third lambda describes how to project the result set. The third lambda is where things get interesting, we can include members of the key as is (k.Name), but for members of the grouped data, we must apply the aggregate functions or risk a horribly inefficient query. In the above example, i am using the Max aggregate function, which (in SQLite land at least, but others as well im sure) will select the last item in the group as determined by the order by clause (which i didn’t include). Because i didn’t include an order by then its just the last item in the grouped data however the grouped data was built. The order by must appear before the group by in order for it to affect the aggregation functions, so above i would do something like q1 = q1.OrderBy(x => x.t2.Timestamp) in order for the Max function to return the most recent value in the grouped data. If you specify an order by after the group by, it will order the projected group results, so you could do something like q2 = q2.OrderBy(x => x.Value) in order to show the values in ascending order, without affecting the group by aggregation. This is executed by a wrapper query, that uses the projected grouped data as input, so you execute a second query, but that’s it (and this one is generally quite fast)

So the important bits to take a way from this post is that any aggregated columns you want to project must be individually included in your group by result projection, and if you include any enumerations in your projection then they will be converted into subqueries. While my example may not be all that complicated, it is certainly more complicated than simple a select/join/where/order type of LINQ query, which are far more common. By understanding the group by projection core concepts, it should be relatively easy to construct much more complicated (AND efficient) LINQ queries. And remember, when in doubt play around in LINQPad, that program is the best for testing out LINQ queries.

, ,

A Quick Lesson On Simple Yet Confusing Depedency Property Issue

Filed in Software Developement Leave a comment

This is an issue that I just ran into and was a little confused as to why it was happening, but upon thinking about it for a moment it makes perfect sense.

When you are creating your DP’s, you have the option of specifying a default (initial) value. The problem with this value is that it is created and stored statically. So if you are creating a DP for, say, a collection, and you initialize it statically in the DP definition then your collection reference now exists statically. So regardless of whether or not you create ten instances of your DO, they will all contain the same initial value because they will all point to the same static reference.

So the conclusion to this quick lesson is that you should never initialize your DP’s with reference types, always perform this step at instantiation to guarantee unique DP values per object instance (unless of course you want to share the reference). As for value types it doesn’t matter nearly as much because the value is copied anyways.

, ,