Monday, 16 November 2009

SelectMany; combining IDisposable and LINQ

Something that often trips me up in LINQ is that you frequently have to break out of the query syntax simply to add a few “using” blocks. It is important to keep the “using” to ensure that resources are released ASAP, but the regular LINQ query syntax doesn’t let you do this. For example (and I realise there are better ways of doing this – it is for illustration only):

[image: code listing]
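The original listing was posted as an image and is not reproduced in this text. As a rough sketch of the kind of code being described, the nested "using" version might look something like this. The helper name `ReadAllLines` and the file-reading scenario are assumptions for illustration, not the original code:

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class NestedUsingExample
{
    // Hypothetical helper: yields every line of every file in 'paths'.
    // The two "using" blocks ensure the stream and the reader are disposed
    // as soon as each file has been read, or if the caller stops early.
    public static IEnumerable<string> ReadAllLines(IEnumerable<string> paths)
    {
        foreach (string path in paths)
        {
            using (Stream file = File.OpenRead(path))
            using (StreamReader reader = new StreamReader(file))
            {
                string line;
                while ((line = reader.ReadLine()) != null)
                {
                    yield return line;
                }
            }
        }
    }
}
```

The point is that the disposal logic forces an explicit iterator block with loops, rather than a single query expression.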

Trying to write this (or something like this) in LINQ query syntax (or fluent syntax) just gets ugly.

If only there were a way to convince it to run the “Dispose()” for us as part of the query! A “let” looks promising, but this won’t do anything special in the error case. But there is a pattern (other than just “using”) that lets us conveniently run code at the end of something… “foreach”. And both query-syntax and fluent-syntax already play nicely with extra “foreach” blocks – it falls (loosely) into “SelectMany”.

One thought: we could write an extension method that turns “IDisposable” into “IEnumerable”:

[image: code listing]

[image: code listing]
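The two listings here were images and are not reproduced in this text. A minimal sketch of the idea, assuming an extension method named `Using` (the name the post refers to) plus a hypothetical `ReadLines` helper on `TextReader`, could look like this:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class DisposableExtensions
{
    // Assumed shape: wraps one disposable in a single-item sequence.
    // When the consumer finishes (or abandons) the enumeration, the
    // enumerator is disposed, which leaves the "using" block and so
    // calls Dispose() on the wrapped object.
    public static IEnumerable<T> Using<T>(this T obj) where T : IDisposable
    {
        using (obj)
        {
            yield return obj;
        }
    }

    // Hypothetical helper (not a framework method): lines of a reader as a sequence.
    public static IEnumerable<string> ReadLines(this TextReader reader)
    {
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            yield return line;
        }
    }
}

public static class UsingExample
{
    // Each extra "from" becomes a call to the standard Enumerable.SelectMany.
    public static IEnumerable<string> ReadAllLines(IEnumerable<string> paths)
    {
        return from path in paths
               from file in File.OpenRead(path).Using()
               from reader in new StreamReader(file).Using()
               from line in reader.ReadLines()
               select line;
    }
}
```

One corner case is visible in this sketch: `File.OpenRead(path)` runs before the `Using` iterator body starts, so if the wrapped sequence were created but never enumerated, `Dispose()` would never be called.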

This is a step in the right direction – but those hanging “Using” blocks bother me - and the disposable object isn't quite inside the iterator - there are probably some corner-cases that would lead to items not getting disposed. But LINQ syntax is purely a query comprehension syntax; we aren’t limited to the current conventions – we can write an alternative version of “SelectMany”:

[image: code listing]

(hey, I never said SelectMany was pretty!)

This allows a much more interesting use:

[image: code listing]
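Since the listings were images, here is a hedged sketch of what such a "SelectMany" overload and a query using it might look like. The namespace `DisposableLinq`, the `ReadLines` helper, and the file-reading scenario are all assumptions for illustration; the constraint follows the "class, IDisposable" suggestion from the first comment below:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using DisposableLinq; // hypothetical namespace, imported alongside System.Linq

namespace DisposableLinq
{
    public static class DisposableQueryExtensions
    {
        // Assumed overload: the "collection" is a single disposable resource.
        // The resource is disposed when the consumer moves past the item,
        // or when the enumeration is abandoned and the enumerator disposed.
        public static IEnumerable<TResult> SelectMany<TSource, TDisposable, TResult>(
            this IEnumerable<TSource> source,
            Func<TSource, TDisposable> selector,
            Func<TSource, TDisposable, TResult> resultSelector)
            where TDisposable : class, IDisposable
        {
            foreach (TSource item in source)
            {
                using (TDisposable resource = selector(item))
                {
                    yield return resultSelector(item, resource);
                }
            }
        }

        // Hypothetical helper (not a framework method): lines of a reader as a sequence.
        public static IEnumerable<string> ReadLines(this TextReader reader)
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                yield return line;
            }
        }
    }
}

namespace Demo
{
    public static class Example
    {
        public static IEnumerable<string> ReadAllLines(IEnumerable<string> paths)
        {
            // The compiler rewrites each extra "from" as a SelectMany call.
            // For "file" and "reader" only the overload above applies; for
            // "line" (an IEnumerable<string>) the standard LINQ one is chosen.
            return from path in paths
                   from file in File.OpenRead(path)
                   from reader in new StreamReader(file)
                   from line in reader.ReadLines()
                   select line;
        }
    }
}
```

In this sketch the extension class lives in its own namespace and is imported next to `System.Linq`, so both "SelectMany" families are considered together during overload resolution.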

This now does everything we want, without any ugliness: our disposable items get disposed, and we can write complex chained queries.

I still haven’t decided whether this is too “out there” – i.e. whether it adds more benefit than it does confusion. All thoughts appreciated!

13 comments:

Marc Gravell said...

After posting, I realised that the SelectMany should probably also be "where TDisposable : class, IDisposable" - I'll let you inject that mentally ;-p

wcoenen said...

I'm confused. How do you make a LINQ statement with multiple "from" keywords use your own version of SelectMany, rather than the default one?

wcoenen said...

Never mind, I just tried it and apparently the compiler just looks for an extension method with the right name, not necessarily the LINQ one. I had no idea.

Joel Coehoorn said...

I think that's pretty slick, but I like the first ."Using()" extension method snippet better, if you can work out where the corner cases are.

It makes it more obvious to the reader that you have addressed the disposal issue. Maybe just name it something different that would flow better when reading the query. Something like "DisposeAfterQuery()", "Used()", or "AsSafeEnumerable()", but I don't really like those either.

Marc Gravell said...

@wcoenen - LINQ's translation is done *entirely* on patterns; a second "from" becomes "SelectMany" (with a few lambdas etc). The exact details are covered by "C# in Depth" (Skeet).

Thomas Levesque said...

Cool stuff!

However it seems a little too "magic" for my taste. Since the syntax is exactly the same as with the standard SelectMany, there's no way to be sure that the compiler picked your extension method rather than the standard SelectMany. If you forget to import your namespace, there's no way to detect it...

Like Joel, I think I actually prefer the first version with a Using method, because it's more explicit.

Barry Kelly said...

Note that a foreach in C# already includes an implicit 'using'. The compiler will query for IDisposable on the IEnumerator and call it if it is implemented. All you should have to do is implement IDisposable on your root enumerator.

wcoenen said...

@Barry Kelly: that disposes the enumerator, not the items being enumerated. Obviously you don't *always* want to dispose the items being enumerated.

Marc Gravell said...

@Barry - that is exactly the trick that we are exploiting here; using the iterator disposal to trigger the disposal of the items.

@Thomas - I see your point, and indeed this is (in part) behind the last line in the blog; but in *general* a type would be unlikely to implement both `IEnumerable<T>` and `IDisposable`, so there shouldn't often be any ambiguity (at the compiler level, at least).

casperOne said...

Thanks for the Twitter follow, btw =)

I wouldn't say that this is too out there, but in this case, you have a one-to-one relationship between the elements in the source, and each subsequent element you are transforming.

One is just as likely to use "let" here to specify the Stream and the StreamReader (it's what I would do, since there is a one-to-one relationship), and I'm not sure where that would leave you.

I do like the idea though if you are going to use that particular brand of query syntax.

However, if you use "let", I don't think you will have a SelectMany call anymore, rather, you will have a call to Select, along with an anonymous type.

To make that work in this case, you would need a Select method which would use reflection while iterating through the items, disposing of any members which implement IDisposable.

Messy, I know. =(

Marc Gravell said...

Indeed - and the reflection would get a *lot* worse because there is typically a level of abstraction between one iterator and the next. The SelectMany avoids that ;-p

Craig said...

Hi, I have a LINQ question that's closely related to this article. I've recently begun using the Coverity code analyzer to find problems in my company's C# code. Coverity complains that pretty much all of our LINQ queries open a disposable resource that is not disposed. While trying to make sense of this, I ran the following code as an experiment:

string[] lst = { "A", "B" };
var x = from s in lst select s;

This is obviously a pretty silly LINQ query, but it is sufficient to examine the question, what is the object returned from Select()?

As we all know, LINQ likes to defer queries. So x is not actually the result of our query, but an object that can perform the query and generate individual result elements as needed. The actual type is System.Linq.Enumerablec_Iterator10`2[System.String, System.String]. An interesting fact about this class is that it not only implements IEnumerable, but also IDisposable! This can be verified by checking the value of "x is IDisposable".

I have not seen this fact documented anywhere. MSDN doesn't seem to mention it; neither does Calvert and Kulkarni's book Essential LINQ.

What I would really like to know is, what are the risks of not calling Dispose() on the value returned by Select()? Does the object itself require disposal regardless of the type of the object being queried, or does it implement IDisposable only to be sure that the object being queried is disposed? Any ideas?

gchernis said...

Nice read!

These pesky SelectMany overloads are monadic.

http://blogs.msdn.com/b/wesdyer/archive/2008/01/11/the-marvels-of-monads.aspx