Thursday, 12 January 2012

Playing with your member

(and: introducing FastMember)

Toying with members. We all do it. Some do it slow, some do it fast.

I am of course talking about the type of flexible member access that you need regularly in data-binding, materialization, and serialization code – and various other utility code.

Background

Here’s standard member access:

Foo obj = GetStaticTypedFoo();
obj.Bar = "abc";

Not very exciting, is it? Traditional static-typed C# is very efficient here when everything is known at compile-time. With C# 4.0, we also get nice support for when the target is not known at compile time:

dynamic obj = GetDynamicFoo();
obj.Bar = "abc";

Looks much the same, eh? But what about when the member is not known? What we can’t do is:

dynamic obj = GetStaticTypedFoo();
string propName = "Bar";
obj.propName = "abc"; // does not do what we intended!

So, we find ourselves in the realm of reflection. And as everyone knows, reflection is slooooooooow. Or at least, it is normally; if you don’t object to talking with Cthulhu you can get into the exciting realms of meta-programming with tools like Expression or ILGenerator – but most people like keeping hold of their sanity, so… what to do?
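For reference, here is a minimal sketch of both routes, using only the BCL. It assumes a hypothetical Foo class with a string Bar property; MemberByName and its method names are made up for illustration and belong to no library. The first method is plain reflection. The second builds a setter once with Expression and hands back a reusable delegate.

```csharp
using System;
using System.Linq.Expressions;
using System.Reflection;

// Hypothetical type, for illustration only.
public class Foo
{
    public string Bar { get; set; }
}

// Hypothetical helper; the names here are assumptions, not an existing API.
public static class MemberByName
{
    // Plain reflection: simple, but pays the lookup and invocation cost on every call.
    public static void SetViaReflection(object obj, string propName, object value)
    {
        PropertyInfo prop = obj.GetType().GetProperty(propName);
        if (prop == null) throw new ArgumentOutOfRangeException("propName");
        prop.SetValue(obj, value, null);
    }

    // Meta-programming: build the setter once, then reuse the delegate.
    // This sketch only handles reference types with a writable public property.
    public static Action<object, object> CompileSetter(Type type, string propName)
    {
        PropertyInfo prop = type.GetProperty(propName);
        if (prop == null) throw new ArgumentOutOfRangeException("propName");

        ParameterExpression target = Expression.Parameter(typeof(object), "target");
        ParameterExpression value = Expression.Parameter(typeof(object), "value");

        // Equivalent to: ((T)target).Prop = (TProp)value
        BinaryExpression assign = Expression.Assign(
            Expression.Property(Expression.Convert(target, type), prop),
            Expression.Convert(value, prop.PropertyType));

        return Expression.Lambda<Action<object, object>>(assign, target, value).Compile();
    }
}
```

The compiled delegate is only worth it if you hold on to it: compiling per call would cost far more than the reflection it replaces.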

Middle-ground

A few years ago, I threw together HyperDescriptor; this is a custom implementation of the System.ComponentModel representation of properties, but using some IL instead of reflection – significantly faster. It is a good tool – a worthy tool; but… I just can’t get excited about it now, for various reasons, but perhaps most importantly:

  • the weirdness that is System.ComponentModel is slowly fading away into obscurity
  • it does not really address the DLR

Additionally, I’ve seen a few bug reports since 4.0, and frankly I’m not sure it is quite the right tool now. Fixing it is sometimes a bad thing.

Having written tools like dapper-dot-net and protobuf-net, my joy of meta-programming has grown. Time to start afresh!

FastMember

So with gleaming eyes and a bottle of Chilean to keep the evil out, I whacked together a fresh library; FastMember – available on google-code and nuget. It isn’t very big, or very complex – it simply aims to solve two scenarios:

  • reading and writing properties and fields (known by name at runtime) on a set of homogeneous (i.e. groups of the same type) objects
  • reading and writing properties and fields (known by name at runtime) on an individual object, which might be a DLR object

Here’s some typical usage (EDITED - API changes):

var accessor = TypeAccessor.Create(type);
string propName = /* something known only at runtime */;
while( /* some loop of data */ ) {
    accessor[obj, propName] = rowValue;
}

or:

// could be static or DLR
var wrapped = ObjectAccessor.Create(obj);
string propName = /* something known only at runtime */;
Console.WriteLine(wrapped[propName]);

Nothing hugely exciting, but it comes up often enough (especially with the DLR aspect) to be worth putting somewhere reusable. It might also serve as a small but complete example for either meta-programming (ILGenerator etc), or manual DLR programming (CallSite etc).

Mary Mary quite contrary, how does your member perform?

So let’s roll some numbers; I’m bundling read and write together here for brevity, but - based on 1M reads and 1M writes of a class with an auto-implemented string property:

Static C#: 14ms
Dynamic C#: 268ms
PropertyInfo: 8879ms
PropertyDescriptor: 12847ms
TypeAccessor.Create: 73ms
ObjectAccessor.Create: 92ms

As you can see, it somewhat stomps on both reflection (PropertyInfo) and System.ComponentModel (PropertyDescriptor), and isn't very far from static-typed C#. Furthermore, both APIs work (as mentioned) with DLR types, which is cute - because frankly they are a pain to talk to manually. It also supports fields (vs. properties) and structs (vs. classes, although only for read operations).

That's all; I had some fun writing it; I hope some folks get some use out of it.

35 comments:

tomlev said...

Hi Marc,

Nice work! I don't know how many times I wished I had something like that... The use of indexers is a bit strange IMO (I would prefer methods like SetValue/GetValue), but still, it's a very useful library.

Marc Gravell said...

@tomlev your point is well-taken; it indeed may well be a good idea to add a parallel GetValue/SetValue API.

Matt Warren said...

Nice library, it was interesting looking through the code.

BTW, how can the Wrap() functionality of your library be faster than C# dynamic? Are you doing the same as dynamic under the hood?

James Tryand said...

Wow, Cheers Marc, was faffing around with Expression trees only a couple of days ago, for just this purpose.
This looks brilliant. Thanks :D

Marc Gravell said...

@Matt the Wrap/ObjectAccessor in this case is doing meta-programming per-type to provide a static implementation - this completely bypasses any DLR considerations.

If the type turned out to be a DLR type, then it would use the DLR (very similar code to the C# compiler), and the timings to "dynamic" would be very similar.

Anonymous said...

You, Sir, are a genius.

MikeWoodhouse said...

This might just be the answer I'd given up trying to find for myself. If it is, then I may just have avoided a lengthy side-project involving code generation and other sundry (interesting, admittedly) time-sinks. And I may owe you a beer. Or many.

Samuel Langlois said...

Hi Marc,

This is great! However it doesn't work with anonymous types. Do you plan to support anonymous types at some point, or is it too complicated to support them? My code is in VB by the way.

Also, what is that curious Fetch method in the TypeAccessor class? ;-)

Thanks.

Marc Gravell said...

@Samuel Anon types have an "accessibility" issue (they are internal, and in a different assembly) - it can be done though - will add. I don't remember a "Fetch" method... Will look.

Marc Gravell said...

@Samuel that should be fixed in 1.0.0.4 (deployed to nuget)

Matt Zinkevicius said...

Wow! This fills a niche that so many developers run into.

Looking at the code, I wonder why you chose to use the non generic collections (Hashtable) in some places? I had read that Dictionary was preferable and measurably faster in all cases.

Thanks for sharing this!

Marc Gravell said...

Re Hashtable - that was documented in a comment in at least one of the usages. Basically, Hashtable has preferable locking semantics - you can read without locking, and only need to lock to write. This makes it ideal for a double-checked scenario. Synchronization is necessary since the data is "static" and could be used by different threads.

With a generic Dictionary you need to lock for both reads and writes.
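A minimal sketch of that double-checked pattern, for readers who haven't seen it. PerTypeCache and GetOrAdd are hypothetical names, not FastMember's actual code; the sketch relies on Hashtable's documented guarantee that it is safe for multiple readers alongside a single writer.

```csharp
using System;
using System.Collections;

// Hypothetical cache, for illustration only.
public static class PerTypeCache
{
    private static readonly Hashtable cache = new Hashtable();

    public static object GetOrAdd(Type type, Func<Type, object> factory)
    {
        // First check: reads need no lock with Hashtable.
        object found = cache[type];
        if (found != null) return found;

        lock (cache)
        {
            // Second check: another thread may have added it while we waited.
            found = cache[type];
            if (found == null)
            {
                // Assumes the factory never returns null.
                found = factory(type);
                cache[type] = found;
            }
            return found;
        }
    }
}
```

In the mostly-hit case the lock is never taken, which is the whole point: the expensive work happens once per type, and every later lookup is a plain read.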

Adam Ralph said...

How about using a ConcurrentDictionary instead of a Hashtable? I was a bit dubious about the performance of this beast but after reading http://geekswithblogs.net/BlackRabbitCoder/archive/2010/06/09/c-4-the-curious-concurrentdictionary.aspx I decided to use it for something I'm working on at the moment.

Marc Gravell said...

@Adam tell you what: I'll measure it. Hashtable wasn't considered in those tests. The scenario I'm looking at is mostly-hit, append-only - which may influence things too. I've done similar comparisons before (choosing between concurrent, Hashtable and Dictionary) - I think Hashtable won, iirc.

Marc Gravell said...

@Adam here you go; perf measured over 25M reads (single thread) assuming hits - not much in it:

https://docs.google.com/spreadsheet/oimg?key=0Akv6EOgOa_qSdDk5ZC1zdzZ2ZWM1aFR0YkJlVE9GdkE&oid=2&zx=sukl7m2wm6am

Marc Gravell said...

x axis is number of elements in the hash; y axis is time in milliseconds (small is better)

Adam Ralph said...

@Marc - OK, so not much in it really. Thanks for the stats, very useful to know.

One more thing - when writing such meta-code do you always verify that it works in assemblies built using both Debug and Release configs?

One of the projects I'm currently working on has some IL generating code in it (written years ago) and it simply doesn't work when built under Release config (I guess something to do with the /optimize switch). Until now, no one on the team has had the time to find out exactly why this is happening.

Perhaps x86/x64 is also a consideration?

I guess that all of the above potentially applies to *any* code but I'm interested to know if you've seen problems related to meta-programming in particular.

Marc Gravell said...

@Adam nothing specific leaps to mind - I do tend to validate both, but that is just because I often have extra tracing in debug. x64 vs x86 does tend to emphasise errors loading values (most notably the different forms of null). For complex IL work, I use peverify for checking.

If you want me to take a look at your IL code, I'd be more than happy (depending on how big it is...).

Matt Zinkevicius said...

Any technical reason for not supporting setting value type properties?

As-is, I'll have to leave a reflection path in my code to handle structs.

Cheers.

Marc Gravell said...

@Matt yes; to get/set the value, I need to unbox the value - but at that point I am no longer editing the same value. To do that I would need an API that either took the object as "ref" (which is not very convenient), or returned a freshly boxed object each time. In *either* case, I would be unboxing and boxing the object each time.
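The unboxing problem is easy to reproduce without any library. A small sketch, using a hypothetical Point struct and BoxingDemo helper (both invented for illustration): the first method edits a copy and changes nothing the caller can see; the second shows the "return a freshly boxed object" workaround.

```csharp
using System;

// Hypothetical value type, for illustration only.
public struct Point
{
    public int X;
}

// Hypothetical helper; not part of any library.
public static class BoxingDemo
{
    // Unboxing copies the value, so this edits a local copy and the box is untouched.
    public static void SetXOnCopy(object boxed, int x)
    {
        Point copy = (Point)boxed;
        copy.X = x;
    }

    // Workaround: return a freshly boxed value that the caller must keep.
    public static object SetXAndRebox(object boxed, int x)
    {
        Point copy = (Point)boxed;
        copy.X = x;
        return copy; // boxed again on return
    }
}
```

With the second form the caller has to write boxed = BoxingDemo.SetXAndRebox(boxed, 5), which is exactly the inconvenience described above.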

Adam Ralph said...

@Marc - thanks very much for the offer. IIRC, the code is rather involved and tightly coupled, but if/when I get round to looking at it and if I can isolate an example of the specific problem then I may take you up on that.

Anonymous said...

What's the "with tools like Expression or ILGenerator – but most people like keeping hold of their sanity" supposed to mean? That it's hard or complex to use or something..? or..?

Marc Gravell said...

@Anonymous they are a little out of most app-developers' experience, and both of them demand a very different approach to most code, and a bit more knowledge of CLI internals than you normally need (C# hides a lot of grungy details).

Marks said...

Thank you for the FastMember.

Do you plan to add an API for calling methods?

Marc Gravell said...

@Marks that isn't usually as common - did you have a specific use-case in mind?

alex Degrand said...

Nice work!
I have noticed that:

rowValue = DBNull.Value;
accessor[obj, propName] = rowValue;
=> cast exception, even if the property is of a nullable type.

Marc Gravell said...

@Alex that is entirely expected; it is not legal to assign DBNull to something like int?, either in the language or runtime. Frankly I don't know why they even included DBNull, when regular null works just fine.

Use null!
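If the values come straight from ADO.NET, one way to follow that advice is a tiny conversion step before assigning. This is a sketch only; DbValue and NullIfDbNull are hypothetical names, not something the library provides.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class DbValue
{
    // Map the ADO.NET DBNull sentinel to a regular null; pass everything else through.
    public static object NullIfDbNull(object value)
    {
        return value is DBNull ? null : value;
    }
}
```

The assignment from the comment above would then read accessor[obj, propName] = DbValue.NullIfDbNull(rowValue); which a nullable property should accept, since it receives a plain null.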

Sean Kearon said...

This is super useful stuff - thanks!

In case you're interested, I've added field and property name caching to TypeAccessor. If you want to take a look, it's in my clone here.

theMan said...

The good thing about HyperDescriptor is that you only had to attach the types to it and the TypeDescriptor would work a lot faster, with FastMember you have to change your existing code in order to use it

Marc Gravell said...

@theMan indeed; similar but different usage scenario is intended.

Maciej said...

This is GREAT code.

However I think there is one word of comment missing in the post. As long as you have lots of objects and need to access their properties only a few times, you are better off with reflection. Generating IL code is very expensive. But you're far more of an expert on that, so I won't argue.

Marc Gravell said...

@Maciej if you have lots of *objects*, but limited numbers of *types*, then there is no contest: IL / FastMember will stomp all over reflection. Even with lots of types, it really doesn't take much reflection for it to exceed the IL generation cost. The `Emit` code here is *not* very expensive at all. Frankly, unless you have a very specific edge-case scenario in mind, I don't think that is a valid concern.

Unknown said...

Hi! Great library indeed. I was just wondering, how can I check if a Property/Method exists?

Dim oAcc As ObjectAccessor = ObjectAccessor.Create(obj)

oAcc("NotExistingProperty") = "dummy"

above code throws me an ArgumentOutOfRangeException

Shailesh said...

Hi Marc, Great work!!! Thanks. I came across one minor glitch for Virtual Properties when the inherited type overrides just the setter and not the getter. The logic can't detect the base class's getter.