I Work On Software: 2010

Wednesday, June 23, 2010

Cascading Deletes in EF4

Something I was curious about in Entity Framework was how to delete a whole graph of entities by deleting its root. In the DB world, doing this automatically is known as "cascading deletes", and in SQL Server it's something you can specify on a foreign key relationship. When you specify the cascade delete action, deleting the "parent" row from the database silently deletes its dependent rows as well. This avoids foreign key constraint violations when you are deleting rows, but it can also result in deleting more data than you thought you were, so cascading deletes should be used carefully.

Anyways, I was curious as to how this idea was represented in EF, so I fired up a little demo app to try a few things out. After some experimentation, and consulting this excellent blog post, I think I've got it figured out:

Specifying cascading deletes on the database is entirely separate from specifying them on the client. By far the safest way to do things if you want cascading deletes is to specify them on the database. If you don't, some odd things can happen (keep reading).
To specify cascading deletes on the client, select the relationship in your model. In the properties window, for the end that represents the "1" multiplicity, set OnDelete to Cascade. This tells EF to issue delete requests for dependent entities that have been loaded into memory before deleting the parent.
Here's where things get interesting: Say I have a Student entity in memory, and the FK_StudentGrade_Student relationship is set up for cascading deletes on the client, but not on the DB. If I mark the entity for deletion and call SaveChanges(), what happens? EF will issue a delete command for the Student after issuing delete requests for all StudentGrades that are loaded. EF does not query the database for the remaining dependent entities before deleting, so it only issues delete requests for the ones it already knows about. If you don't have all the dependent entities loaded, you'll get an FK violation when EF tries to delete the parent.
What if you have cascade set up on both the DB and the client? EF will still issue delete requests for the dependents it knows about, which is redundant but harmless. If some dependents weren't loaded, the cascade rule on the DB will take care of them.
What if you don't have cascade set up on the client? It depends on whether the parent entity you are trying to delete has any dependents loaded. If it does, EF will stop you with an exception before it even issues the delete request to the DB, because it sees that you are violating a rule of the conceptual model. If the parent doesn't have any dependents loaded, EF will go ahead and issue the delete request. From there, the behavior is determined by whether or not any dependent entities exist in the DB and whether or not you have cascading deletes specified on the server. If the answers are "yes" and "no," respectively, you'll get an FK violation.
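The database half of this is easy to reproduce outside of EF. Here's a minimal sketch, assuming SQLite (through Python's sqlite3 module) as a stand-in for SQL Server and a made-up two-table Student/StudentGrade schema. The delete_student helper is hypothetical: it only imitates what the client-side cascade does (delete the dependents that happen to be loaded, then the parent) and is not EF code.

```python
import sqlite3

def make_db(cascade: bool) -> sqlite3.Connection:
    """Create an in-memory Student/StudentGrade schema, with or without ON DELETE CASCADE."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when this is on
    on_delete = "ON DELETE CASCADE" if cascade else ""
    conn.execute("CREATE TABLE Student (StudentID INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE StudentGrade ("
        " GradeID INTEGER PRIMARY KEY,"
        " StudentID INTEGER NOT NULL REFERENCES Student(StudentID) " + on_delete + ")"
    )
    conn.execute("INSERT INTO Student VALUES (1)")
    conn.executemany("INSERT INTO StudentGrade VALUES (?, 1)", [(10,), (11,), (12,)])
    conn.commit()
    return conn

def delete_student(conn, loaded_grade_ids):
    """Imitate the client-side cascade: delete only the 'loaded' grades, then the student."""
    try:
        with conn:  # commits on success, rolls back if the block raises
            for grade_id in loaded_grade_ids:
                conn.execute("DELETE FROM StudentGrade WHERE GradeID = ?", (grade_id,))
            conn.execute("DELETE FROM Student WHERE StudentID = 1")
        return "deleted"
    except sqlite3.IntegrityError:
        return "FK violation"

def remaining_grades(conn):
    """Count the dependent rows still in the database."""
    return conn.execute("SELECT COUNT(*) FROM StudentGrade").fetchone()[0]

no_cascade = make_db(cascade=False)
print(delete_student(no_cascade, loaded_grade_ids=[10]))          # FK violation
print(delete_student(no_cascade, loaded_grade_ids=[10, 11, 12]))  # deleted

with_cascade = make_db(cascade=True)
print(delete_student(with_cascade, loaded_grade_ids=[]))          # deleted
```

With no cascade rule on the database, the delete only succeeds when every dependent row was "loaded" and deleted first. With the rule in place, the database cleans up whatever the client didn't know about.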

Highlights of the Entity Framework's "Working With Objects" Documentation

I'm trying to get my head around Entity Framework 4 and the official MSDN documentation is pretty helpful in explaining the high level concepts. In particular, the "Querying a Conceptual Model" and "Working With Objects" sections are very useful when it comes to learning about how to actually get code that uses EF4 up and running.

I read through "Working With Objects" last night and this is a quick bullet-list of the highlights.

Defining and Managing Relationships

If you include foreign key columns in the model, you can create/change relationships by manipulating the foreign key, or by manipulating the navigation properties. This is called a foreign key association.
If you don't include foreign key columns, relationships are managed as independent objects in memory. This is called an independent association. With these, the only way to change/create relationships is by manipulating the navigation properties.
When a primary key of an entity is part of the primary key of a dependent entity, the relationship is an identifying relationship - the dependent entity cannot exist without the principal entity. In this case, deleting the principal entity also deletes the dependent entity, as does removing the relationship.
In a non-identifying relationship, if the model is based on foreign key associations, deleting the principal objects will set foreign keys of dependents to null if they are nullable. If they are not, you must manually delete the dependent objects or assign a new parent, or you can specify in the model to automatically delete the dependent entities.

Creating, Adding, Modifying, and Deleting Objects

The static CreateObjectName method on an entity type (where ObjectName is the name of the type) is generated by the EDM tools along with the types themselves. Its parameters are the columns that cannot be null, and nothing else.
If the primary key column is exposed and not nullable (and thus exposed in the signature to CreateObjectName), you can just assign the default value, since the ID used prior to SaveChanges is always temporary.
When using POCOs, use CreateObject instead of new.

Attaching and Detaching Objects

Detaching objects can help keep the memory requirements of the object context in check. If you execute a query with MergeOption.NoTracking, the returned entities are never attached.
Consider creating a new ObjectContext if the scope of data has changed (like if you are displaying a new form) instead of detaching objects from ObjectContext.

Identity Resolution, State Management, and Change Tracking

If you are working with POCOs without change-tracking proxies, you must call DetectChanges so the EF can figure out what changes have been made.
To find out if the value of a property has changed between calls to SaveChanges, query the collection of changed property names returned by the GetModifiedProperties method.

Saving Changes and Managing Concurrency

By default, EF uses "optimistic concurrency," meaning that locks are not held while your app has data from the data source, and object changes are saved to the database without checking for concurrent modifications. If an entity type has a high degree of concurrency, you can set ConcurrencyMode to Fixed, which will cause EF to check for changes in the DB before saving. Conflicting changes will throw an OptimisticConcurrencyException.
You can handle OptimisticConcurrencyExceptions by calling Refresh with your RefreshMode of choice.
In high concurrency scenarios, call Refresh frequently. The RefreshMode controls how changes are propagated: StoreWins causes EF to overwrite all data in the cache with DB values. ClientWins replaces only the original values in the cache with DB values, leaving your pending changes in place.
Call Refresh after calling SaveChanges if your updates may modify data that belongs to other objects. For example, if you have a trigger that fires when you update a row in a table, calling Refresh after saving changes to that table is a good idea.
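To make the concurrency check concrete, here's a minimal sketch of one common way to do it: put the original value of a version column in the UPDATE's WHERE clause and treat "zero rows affected" as a conflict. As far as I understand, this is roughly what happens for a property with ConcurrencyMode set to Fixed, but everything in the sketch is an assumption for illustration: SQLite stands in for the real database, and the Grade table, its Version column, ConcurrencyConflict, save_grade and load_grade are hypothetical names, not EF APIs.

```python
import sqlite3

class ConcurrencyConflict(Exception):
    """Hypothetical stand-in for EF's OptimisticConcurrencyException."""

def save_grade(conn, grade_id, new_score, original_version):
    """Update the row only if its version still matches the value we read earlier."""
    cur = conn.execute(
        "UPDATE Grade SET Score = ?, Version = Version + 1 "
        "WHERE GradeID = ? AND Version = ?",
        (new_score, grade_id, original_version),
    )
    if cur.rowcount == 0:  # no row matched: someone else changed it first
        conn.rollback()
        raise ConcurrencyConflict(grade_id)
    conn.commit()

def load_grade(conn, grade_id):
    """Re-read the row; the retry below uses this like a StoreWins-style refresh."""
    return conn.execute(
        "SELECT Score, Version FROM Grade WHERE GradeID = ?", (grade_id,)
    ).fetchone()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Grade (GradeID INTEGER PRIMARY KEY, Score REAL, Version INTEGER)")
conn.execute("INSERT INTO Grade VALUES (1, 3.0, 1)")
conn.commit()

score, version = load_grade(conn, 1)      # our app reads version 1
save_grade(conn, 1, 3.5, version)         # another user saves first; row is now version 2
try:
    save_grade(conn, 1, 4.0, version)     # our stale version 1 no longer matches
except ConcurrencyConflict:
    score, version = load_grade(conn, 1)  # refresh from the store, then retry
    save_grade(conn, 1, 4.0, version)
```

No locks are held between the read and the write. The conflict is only detected at save time, which is why the handler has to refresh and decide what to do next.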

Binding Objects to Controls

(From http://social.msdn.microsoft.com/Forums/en/adonetefx/thread/c76d72c0-951c-4033-b75f-fc84f735826e): EntityCollection represents a collection of entities related to another entity (a many relationship). ObjectSet is not actually a collection type: it provides access to entities that belong to an entityset in the DB. LINQ on an EntityCollection is LINQ to Objects. LINQ on an ObjectSet uses LINQ to Entities (results in DB query).
To bind entities directly to a control, set the DataSource property of a control to an EntityCollection or to the ObjectResult you get when you call Execute on an ObjectSet or ObjectQuery. In WPF, you can set a DataContext to an EntityCollection or ObjectResult.
Don't bind directly to an ObjectQuery, bind to the result of Execute.
When working with LINQ, cast the result of your query to an ObjectQuery and you can call Execute on it.
To refresh the bound control, call Execute on the query again. This will bind the control to a new ObjectResult.
Calling OfType on an EntityCollection returns an IEnumerable, which cannot be bound to a control. Instead of calling OfType on the collection itself, use CreateSourceQuery to get the ObjectQuery that defines the base EntityCollection, and call OfType on that. You can then bind a control to the result of executing the ObjectQuery that OfType returns.

Serializing Objects

When serializing entities, if lazy loading is not disabled, it will be triggered, possibly causing your object graph to become very large.
Binary and WCF datacontract serialization serialize related objects. XML serialization does not.

Friday, May 28, 2010

Silverlight, WCF, Deserialization and Public Setters

I've been working recently on doing "WCF by hand," meaning putting ServiceContracts and DataContracts in separate assemblies, sharing those assemblies between server and client projects, and using ChannelFactory or manually writing ClientBase classes to access the service instead of dealing with all the cruft that Add Service Reference pushes on you (post forthcoming!).

I've been working on doing this in Silverlight by adding all of my DataContract class files as links into a Silverlight project, essentially recompiling the DataContracts in Silverlight so I can share them with a Silverlight app. The first thing I learned is that it may be worthwhile for you to do this with DataContracts, but don't bother with your service interface - you really need to use slsvcutil or Visual Studio's Add Service Reference to generate Silverlight proxy clients, because Silverlight mandates the use of specific asynchronous patterns implemented in a specific way. However, with the default settings, Visual Studio will reuse DataContracts in referenced assemblies, meaning that your business objects won't end up as generated code, and you can keep all your logic and calculated properties.

The next and most interesting thing I learned was regarding the presence of public setters on DataMembers. One of my data contracts has a private setter - this contract and its service are working fine with a standard console application client, but as soon as I access the service in Silverlight, things blow up. Check this out:

System.Security.SecurityException: The data contract type 'MyContract' cannot be deserialized because the property 'MyPrivateSetterProperty' does not have a public setter. Adding a public setter will fix this error. Alternatively, you can make it internal, and use the InternalsVisibleToAttribute attribute on your assembly in order to enable serialization of internal members - see documentation for more details. Be aware that doing so has certain security implications.

Whoa. What's the deal?

Quite simply, it's important to never forget that Silverlight code is only partially trusted. Without full trust, System.Runtime.Serialization can't instantiate a raw object and populate its properties; it needs to use the same method that developers do from user code, namely a public setter.

As the exception text states, the alternative is to make System.Runtime.Serialization a friend assembly of your data contract assembly via InternalsVisibleTo, and make the setter internal instead of private. Apparently, if you are using the JSON serializer, you will also need to friend System.ServiceModel.Web and System.Runtime.Serialization.Json.

As an aside, kudos to the teams that take the time to add exception text like this. Seriously, there's not much better than getting an error with text like, "Try approaches X, Y or Z to fix this, and check out the documentation too."

Sources:
MSDN Forums
Silverlight Serialization - avoiding having public setters in properties

Sunday, May 23, 2010

A Rant on Reuse

From a design and development perspective, code reuse is beautiful. Really beautiful. Code that's reusable is, by definition, highly decoupled, well-encapsulated and well organized, with a bevy of options that are available via simple configuration. It checks the "Don't Repeat Yourself" checkbox (and the "Three Tenets of Object-Oriented Programming" checkboxes, if you swing that way) so hard that the pen pierces the paper. Reusing that code is like the fulfillment of a prophecy. It feels good. The elegance of it all is so alluring that I think every developer sits down once in a while to try to solve their current problem with a glorious, reusable library. It's like writing the Great American Novel for geeks.

From a management point of view, reuse is obviously a no-brainer. When you view development as manufacturing (fast forward to 20:30), reuse looks a lot like replacing a human assembly workflow in the production of your widget with a mechanized one - you gain speed and reduce variability and error rate. The return on investment is clearly fantastic.

I have a vision in my head of a meeting taking place between IT professionals and upper-middle management at a technology-related company sometime in the 70's, maybe the 80's, as they are about to break ground on a new application.

Exec: "... Alright then, let's design and build it. Get to work."
IT Pro: "You know, if we are careful to keep these functions separate and generic, I think we could reuse them for other purposes later."
Exec: "You can... reuse them? Like, take code from one application and plug it into another? For free?"
IT Pro: "Absolutely. Like your golf clubs - you don't have a separate set of clubs for the 5 private courses you belong to, you take the same set of clubs to each one. A set of clubs is designed to be flexible and accommodating."
Exec: "Well this is a great idea. We need more of this. This is the norm from now on - I expect to see you start reusing everything."

Management was excited because their scorecard numbers were going to go up. The developers were excited because they got a mandate to build beautiful things. And thus was born the fallacy that internal code reuse is free, easy, and something we should be doing all the time.

As a learning exercise, I took one chunk of functionality from one of the applications I'm working on and set about making a crisp, clean, fully reusable WCF service out of it. I wanted to see the result from a purely technical standpoint - I didn't really care about the potential return on investment, I just wanted to see if I could work through the little fiddly things that make a service ugly and tightly coupled unless you clean them up.

From a technology point of view, I'm happy with the result and learned a lot about making a great service. It took a while, even though it only provides a very discrete bit of functionality, but it's shiny and beautiful. It's even got documentation. I can use what I learned in the future, even on services that I don't intend to be reusable, to make them clean and easy to use.

The real learning, though, took place after I finished, while I was reflecting on my sparkling work and started to ask myself what I now realize are the most important questions.

Who is ever going to need this functionality?
If they need it, how likely is it that they're going to find out about this code I wrote, even if they're in my same organization?
If they find out about it, how likely is it that they're going to want to spend the time reading the documentation to figure out how to use it? (From a more general perspective, it's a stretch to assume that there's documentation in the first place).
If they read the docs, how likely is it that they're still going to want to use it when they realize it's going to need additional features to support their needs?
If they still want to use it, how likely is it that they're actually going to jump through the organizational hoops to get access to the code, do development on my code to add features (or get me to do it) in a way that keeps it reusable (difficult and time-consuming), and ensure that the new version can be deployed without messing up the application that currently uses it?

All that for one small piece of functionality. That funnel is pretty narrow at the top and tapers to the width of a hair at the end. Reuse is supposed to be a best practice; it's supposed to make your arsenal of applications clean and organized, and reduce the amount of work you need to do. Why does reuse all of a sudden look so difficult and expensive in light of these questions?

The answer: Internal reuse trades one kind of technical work for another that's just as difficult if not more, and adds extra organizational work on top. The new work that you've bought yourself is the most nefarious kind: the kind that looks like it's free. The kind that ends up as estimate line-items with .5 hours because they have to be in the project plan, but they won't result in new deliverables, so they must require no effort. Even worse, this cost isn't only paid the first time a component is reused - it's paid every time it's reused. At least when it comes to reuse of internal code (as opposed to third party controls and frameworks), I believe that Not Invented Here syndrome is less of an unwillingness to adopt work from another culture and more of a subconscious rejection of this new work that we know we are generating, but can't quite put our fingers on.

The technical work being traded out is new development. Everyone knows that the best code is no code, or perhaps more appropriately in this case, code written by someone else who must be smarter than the herd of cats that is your current development team, so wiping new development off of the project plan looks great.

There's no such thing as a free lunch, though. First, let's look at the code we're thinking about reusing. To be worth reusing, code has to solve just about every aspect of a very distinct and common problem from just about every angle, be well encapsulated, be distributed in a way that supports reuse, and above all, it must be hard to create. Really hard. If you want to benefit from its reuse, solving the new technical and organizational problems that reuse introduces must be easier than initially creating it.

The only way code can hit all those criteria is if it's developed away from a project that's going to be using it. It has to be its own effort, and that effort has to involve looking at (and testing) lots of different scenarios and use cases, not just the one that your new app needs and that you think some other apps are going to need later. It doesn't have to start off that way, but it has to end up that way before it's truly reusable. Reusable components aren't parts of projects, they are projects. Unfortunately, developing a component in isolation that doesn't solve a full business problem, but might solve a part of multiple common problems, almost never looks like a good short-term investment, and thus almost never happens.

Now, without knowing yet if this old code you're looking at can really solve your problem, your team, Team ABC, now has the job of understanding and using someone else's software, arguably the second-most reviled task in development. To misquote jwz, "Some people, when confronted with a problem, think 'I know, I'll reuse some code we already have.' Now they have two problems." Unlike a lot of other industries, in software development, understanding and leveraging someone else's work is often actually harder than producing your own.

So, flush some design hours down the drain, and what's the result? The ABC designers come back and say, "This almost solves our problem. It needs features X, Y and Z." Now they've got to add features onto existing code someone else wrote, arguably the first-most reviled task in development, and certainly one of the hardest. This is where the organizational problems, which began well before your project was set in motion, start to clearly manifest themselves.

The code that ABC wants to add features to and reuse was written by Team 123. Team 123 isn't a permanent team; it was assembled for a project two years ago. Three of its members have moved on to other organizations. No one knows where the documentation is, how good it is, or if there was any to begin with. There are three code bases scattered throughout source control and no one's sure which one's the golden one. Furthermore, no one is sure which applications are using the component in question, and which quirks they rely on, so the potential for making breaking changes is high.

And this all assumes that Team ABC had heard of Team 123's code in the first place. In an organization of any size, unless you have a full-time "code reuse librarian" (you don't), the chances of this are virtually nil.

In the end, what it comes down to is that your problem probably isn't common enough or hard enough to warrant development of a reusable component, and you probably don't have the budget or time to put in enough work on something to make it reusable when it doesn't fully solve a business problem by itself. By all means, make your code beautiful. Design it in a loosely-coupled, encapsulated fashion, because it will be easier to maintain. Just remember that while reusable code is loosely coupled and well-encapsulated, loosely-coupled and well-encapsulated code isn't necessarily reusable. Reusability is a meta-feature, and if you want your code to be reusable, you have to design and build all of it around that. Don't try too hard to reuse code or create reusable code - focus on solving business problems.

-----

Side note on SOA: SOA aims to reduce the organizational problems caused by reuse. It does not address the technical ones, nor does it preclude the need to spend extra time developing truly reusable components. In fact, it mandates it - you can't have SOA without solid, reusable code.

One more side note: Interestingly, a Google search on "code reuse" without any adjectives or qualifiers pulls the following three articles on the first results page:

Obviously, most topics searched for anywhere on the internet will result in a mix of positive and negative opinions, but "code reuse" is often billed as a best practice, when clearly it causes a lot of problems. This is true for a lot of other practices billed as "best practices" as well, like unit tests and big-design-up-front, but the nature of code reuse can cause people to put it in a bucket with practices that are pretty tough to argue with, like loose coupling or encapsulation, when it really shouldn't be.

Thursday, May 20, 2010

Fiddler and WCF

If you've never heard of Fiddler, I strongly encourage you to give it a try. The interface is a bit overwhelming at first, but after you spend a couple minutes getting used to it, you'll realize how powerful it is. With a minimum of configuration (in most cases, just install it and run it), you can start capturing all the HTTP traffic that goes in and out of your machine so you can pull it open and look at it.

To get started, all you really need to know is that in the default view, individual requests fill up the list on the left. If you click on a request, the "statistics" and "inspectors" tabs on the right will fill up with interesting information. The most interesting tab tends to be "inspectors," which will show you the request on top and the response on the bottom. The different tabs, like TextView, HexView, Raw, XML, etc. will change the visualizer used to show you the HTTP stream.

If you're playing around with a WCF application, Fiddler is the fastest and easiest way to see exactly what's going across the wire. However, there are a few things you might need to know about configuring WCF so that Fiddler captures the traffic.

I was going to write a post that gathered up all the tips I had found, but fortunately, someone has already done that for me - unlike a lot of other articles I've read that mention WCF only in passing and stick to getting Fiddler up and running with browsers, Rick Strahl has an excellent post about all the finicky things you might run into when you're trying to get your WCF application traffic logged. Some of the information seems like maybe it's a bit out of date (for example, on Win7/VS2010/.NET4/Fiddler2, trying to use the "extra dot" trick with 127.0.0.1 to log traffic going to my Cassini debugging server wouldn't work, but "localhost" with an extra dot does, contrary to his post). In any case, the information in his post will at the very least point you in the right direction. Make sure to read the comments too, as there are some insights there as well.

My problem the other day that was driving me bonkers was that I had everything set up correctly, and by forcing use of a proxy I could essentially prove that my traffic was going through Fiddler (my client app would fail when Fiddler wasn't running but would succeed when it was), but the traffic wasn't showing up. Rick's post pointed out the easy-to-miss process filter at the bottom of the Fiddler window - mine had gotten flipped to Web Browsers and so it wasn't logging any traffic that wasn't from IE or Firefox.

UPDATE: If you want to do the "localhost-dot" trick with Silverlight when debugging on Cassini: first, change ServiceReferences.ClientConfig (or whatever code or configuration you need to) to point the client proxy at "localhost." instead of "localhost". Next, press F5 to load up the debugger. When your site loads, change "localhost" in the address to "localhost.". This will re-download and re-run the XAP from "localhost.", so you are free to contact the service and it will be logged through Fiddler. The other way to do it, without changing the browser address, is to add a clientaccesspolicy.xml to your Web project that grants the appropriate permission - Cassini will host this too, so when you run your XAP from "localhost" and it tries to contact services on "localhost.", Silverlight will find it.

Thursday, April 29, 2010

Fusion Logs

I wanted to test out a neat little library I whipped up over the past week on a fresh image of Windows 7: no Visual Studio and no applications that contain dependencies for my library (which I am able to provide individually and independently of the application, if necessary). I wanted to find out exactly which assembly references were needed and where they needed to be placed.

Fusion Logging to the rescue! Fusion is the .NET runtime's "assembly finder", and is responsible for finding the assemblies that your application needs to run, whether they are in the application executable's folder, an appropriately-named subfolder, the GAC, or wherever. Fusion is essentially "silent" by default but with a few tools and registry tweaks you can force it to be very vocal when it's looking for assemblies.

The first stop for most developers should be the Fusion Log Viewer, fuslogvw.exe. This is installed with the .NET SDK and you can find it easily on Win7 by typing "fusion" into the Start menu. All the log viewer does is provide a nice friendly interface over a few registry switches and the folder locations where the logs are dumped.

If you don't have the SDK installed, like I didn't on my fresh Win7 image, you can twiddle some bits in the registry to manually enable logging, redirect the logs to a location that's easier to find, and then manually investigate the logs yourself. Junfeng Zhang's blog post here has a great overview of the different registry values you can set to control logging.

One setting Junfeng does not mention is the HKLM/SOFTWARE/Microsoft/Fusion!EnableLog DWORD registry value. Junfeng says: "By default, the log is kept in memory, and is included in FileNotFoundException (If you read FileNotFoundException's document, you will find it has a member called “FusionLog“, surprise!)". However, if the EnableLog registry value isn't present and an assembly load fails, the FusionLog property will only contain a message that says you need to set EnableLog to 1 to see the load failure information. If you set EnableLog in the registry to 1, no log information will be written to disk, but the FusionLog property will show you what you want to see. A handy feature of FileNotFoundException is that if it is thrown due to an assembly loading failure, the message in the FusionLog property is included in the exception message.

Flipping some or all of the aforementioned registry bits might be a good idea on a test or developer machine to help debug loading problems.

Thursday, April 22, 2010

BizTalk Assembly Reflection In a Child Appdomain

I'm just wrapping up some work I'm doing on a robust way to perform reflection on BizTalk assemblies (we need to be able to inspect BizTalk assemblies directly for a big efficiency-boosting project going on here) and wanted to share a few things I learned about reflection.

First, specific to BizTalk: investigating BizTalk assemblies with Reflector and then writing code to get information based on the types you can find with standard reflection will only get you so far. About as far as orchestrations and anything contained in them, as a matter of fact. While schemas, maps and pipelines get compiled to fairly simple objects that subclass from common BizTalk types, orchestration types and subtypes (port types, message types, etc.) have many more interesting properties and are generated by the BizTalk compiler in much more convoluted ways.

BizTalk ships with two assemblies, Microsoft.BizTalk.Reflection.dll and Microsoft.BizTalk.TypeSystem.dll, that can help you. They are largely undocumented and searches for them will only give two good hits, both on Jesus Rodriguez' weblog: here and here. According to Mr. Rodriguez, Reflection.dll is fairly robust and gives you a friendly interface through the Reflector class, and TypeSystem.dll blows the doors off and should give you access to pretty much every speck of metadata you can squeeze from a BizTalk assembly. I'm using Reflection.dll and I'm finding that it does everything I need, although it requires some experimentation: Mr. Rodriguez's posts are enough to get you started, but plan on spending some time playing in the debugger with some simple BizTalk assemblies figuring out how information is organized, particularly in the case of orchestrations. If I get a chance I'll make a post detailing a few things I found - I spent a good chunk of time discovering that Orchestrations are referred to as Services, and the direction of a port is indicated by the Polarity property on PortInfo (which is redundantly represented in a number of forms in the Implements, Uses, Polarity and PolarityValue properties).

The other thing I wanted to talk about is what happens when you load assemblies. Now, I'm not an expert regarding reflection, loading and handling assemblies, or Fusion, but the one basic thing you should know about loading assemblies is that you can't unload them. However, the logical container that you load an assembly into is an AppDomain, which can be unloaded. Your application runs inside an AppDomain and you can use the AppDomain class to spawn child AppDomains that you can use for reflection or whatever other nefarious purposes you like. If you load an assembly into your main application's AppDomain by simply doing something like Assembly.Load, that assembly will be loaded into memory until the application is terminated. This also locks the assembly, unless you use something called shadow copying, which I won't get into here, but Junfeng Zhang has a great blog post about it here. Even the ReflectionOnly load methods will lock the assembly and load some data into memory that you can't get rid of until the AppDomain gets trashed. For my purposes, this was bad news, because we are doing the reflection in an IIS-hosted service that can live for quite a long time, and the requirements for the application include the ability for users to modify their BizTalk assemblies at will.

The answer, of course, is to perform reflection inside a child AppDomain, pull the data out in a form that doesn't require the reflected assembly itself, and then trash the child AppDomain when you're done. Creating and unloading AppDomains and injecting code into them is fairly simple and is covered pretty well in a couple of blog posts by Steve Holstad and Jon Shemitz, here and here respectively. Mr. Holstad's post has sample code that you can download and use to get started. If you look at Microsoft.BizTalk.Reflection.Reflector inside of Reflector (that's a lot of reflecting), you'll see that it simply does a LoadFrom on the file path you give it, so as long as you get it running inside a child AppDomain, you'll be good to go.

Clean Paths and Names in VSTF

Something I've always seen teams struggle with is creating folder structures for solutions and projects in VSTF that aren't redundant and don't hit path length limits. When you're supposed to name your assemblies with an Organization.Group.Team.Project.Assembly convention, folder path lengths can reach into the stratosphere unless you know how to appropriately massage Visual Studio into giving you the folders you want.

My recommendations for clean, short paths are as follows:

Pick a folder that's going to contain your solution folder, like "Source." After following this step, you will end up with a folder in Source called Org.Grp.Team.Project. Get the full path to Source, do a File > New > Project, and create an empty solution (under Other Project Types > Visual Studio Solutions) named Org.Grp.Team.Project, with Source as its location. Check both the Create Directory For Solution and Add to Source Control boxes and click OK. This will open the solution in Solution Explorer.
To add a project to the solution and to source control, right click on the solution and do an Add > New Project. Here's where some knowledge about how Visual Studio behaves is helpful: the value you enter for Name will be the name of the folder that's created. We don't want to name it Org.Grp.Team.Project.Assembly, because that introduces a lot of redundancy and makes our paths too long. The name you want here is just the "Assembly" part. Click OK and you'll get the new project in the solution. The name next to the project entry in the Solution Explorer tree only represents the name of the project file, not the name of the assembly. To properly name your assembly, go to the project properties and change the value for Assembly Name (I also recommend changing Default Namespace) to Org.Grp.Team.Project.Assembly or whatever you like.

Now you'll have nice short folder paths and short, easy to read project names in Solution Explorer, but your assemblies will come out with the names you want.

If you need to change a folder name later, you don't need to worry about the project file in that folder, but you will need to fix project references that point to that project, as well as solution files for solutions that contain that project.

My last tip is that it's sometimes a good idea to copy the directory mapped to the project root to a temporary backup, wipe out the original, and do a full force get from TFS. This will eliminate any garbage folders and files that have accumulated on your machine when doing things like renames or removing projects or files you decided you didn't want from a solution - the backup is just in case some of those folders or files aren't garbage, like important stuff that didn't get checked in for some reason.

Wednesday, April 21, 2010

The IDisposable Pattern

I've been working on a little project involving creating a class that implements IDisposable, and I was curious about the pattern I had frequently seen by Reflectoring BCL classes and looking at other people's code that involved a Dispose(bool disposing) method, calls to GC.SuppressFinalize(this) and a few other things. I started digging and hit upon a huge amount of detail regarding the optimal IDisposable pattern that the framework designers intended, destructors/finalizers, how the memory of managed resources is recycled vs. how unmanaged resources are handled, where Dispose(bool disposing) enters the picture, and a lot of other fascinating stuff.

I gathered up a bunch of links with a lot of technical detail and I was gearing up to write a big post about implementing the pattern properly, but then I found Shawn Farkas' excellent MSDN Magazine article "CLR Inside Out: Digging into IDisposable" and realized that the work had already been done. In just a couple of pages, this article explains in extremely lucid detail how the pattern works and why it should be implemented just so.

If you're looking for an extreme amount of detail, Joe Duffy posted an update to the "Dispose, Finalization, and Resource Management" chapter of the Framework Design Guidelines on his blog in 2005.

Here were my key takeaways after researching all of this:
• C# doesn't really have "destructors." It only has "destructor syntax," which is the required shortcut for implementing Finalize() on a class (the compiler will warn you if you implement a method called Finalize()). It basically ensures that Finalize() does all the things it really should do in every single case: marking it as protected, carrying out all actions inside a try block, and calling base.Finalize() inside a finally block.
• If your class is sealed and you only use managed resources (i.e. stuff that needs Dispose called on it), just implement Dispose(). For each resource, check for != null and then call Dispose() on it. That's it.
• If your class isn't sealed, implement the full pattern (including GC.SuppressFinalize(this)), but leave out the finalizer unless you are using unmanaged resources. Adding a finalizer incurs a cost, even if you Dispose of the object and call GC.SuppressFinalize(this) in Dispose(). By calling GC.SuppressFinalize(this), even if you don't have a finalizer, you ensure that you suppress finalization for any subclass that has a finalizer.
• Any class with a finalizer should implement IDisposable. By the same token, if you inherit from an IDisposable that has a finalizer on it, do not define your own finalizer - the ancestor's finalizer will call your Dispose(bool disposing) method, where the cleanup is done.
• It often makes sense to contain some kind of state in your object so it can tell if it has been Disposed of. You can have instance methods check this state and throw ObjectDisposedExceptions as appropriate.
• A subclass of an IDisposable that properly follows the pattern only needs to override Dispose(bool disposing). Its implementation of it should dispose of resources using the same basic pattern and then call base.Dispose(disposing).
• Dispose() and Dispose(bool disposing) should never fail, but you shouldn't suppress exceptions that may be thrown from them. Often what makes sense is a try/finally where the resource cleanup is done inside the try block and a call to base.Dispose(disposing) is done inside the finally block.
• Finally, after all this talk about finalizers: you should basically almost never be implementing one. I don't know a whole lot about unmanaged resources but what I've read is that after .NET 2.0, pretty much all unmanaged resources can be controlled through SafeHandles, which take care of finalization for you in a robust way.
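Here is a minimal sketch of how I read those takeaways, for an unsealed class that only holds managed resources. The class names (ResourceHolder, DerivedHolder) and fields are made up for illustration, and this is one reasonable shape for the pattern rather than the definitive one. Note that there is no finalizer anywhere, per the last bullet.

```csharp
using System;
using System.IO;

// Unsealed base class: full pattern, but no finalizer because there are
// no unmanaged resources.
public class ResourceHolder : IDisposable
{
    private MemoryStream m_stream = new MemoryStream();
    private bool m_disposed;

    public void Dispose()
    {
        Dispose(true);
        // Harmless here, but it suppresses finalization for any subclass
        // that does add a finalizer.
        GC.SuppressFinalize(this);
    }

    protected virtual void Dispose(bool disposing)
    {
        if (m_disposed)
        {
            return; // Dispose can safely be called more than once.
        }

        if (disposing && m_stream != null)
        {
            // Only touch other managed objects when called from Dispose().
            m_stream.Dispose();
            m_stream = null;
        }

        m_disposed = true;
    }

    public long Length
    {
        get
        {
            // Instance members check the disposed state before doing work.
            if (m_disposed)
            {
                throw new ObjectDisposedException(GetType().Name);
            }
            return m_stream.Length;
        }
    }
}

// A subclass only overrides Dispose(bool) and always chains to the base.
public class DerivedHolder : ResourceHolder
{
    private MemoryStream m_extra = new MemoryStream();
    private bool m_disposed;

    protected override void Dispose(bool disposing)
    {
        try
        {
            if (!m_disposed && disposing && m_extra != null)
            {
                m_extra.Dispose();
                m_extra = null;
            }
            m_disposed = true;
        }
        finally
        {
            // Runs even if the cleanup above throws.
            base.Dispose(disposing);
        }
    }
}
```

With this in place, wrapping a DerivedHolder in a using block disposes both streams, a second call to Dispose() is a no-op, and reading Length afterwards throws ObjectDisposedException.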

A few other good resources:

• Bryan Grunkemeyer's post on the BCL Team Blog about Disposing, Finalization and Resurrection.
• A good StackOverflow conversation about implementing the pattern.
• The MSDN documentation about implementing a dispose method.
• A little bit of info about the "freachable" queue and what happens when you put a finalizer on a class.
• A bit of clarification on "destructor" vs. "finalizer" as pertaining to C#: here and here.

Tuesday, April 20, 2010

Disabling Moles Pre-Compilation

One thing about Moles that I've found a bit annoying is that the Mole types always get compiled into an assembly. This is fine when you mole something like mscorlib because it's not going to be changing, but when you mole your own code, the result is that you have a dependency on a binary file that keeps changing over and over. If you check your code into source control, this is a nuisance.

It turns out that if you open a .moles file and add <Compilation Disable="true"/>, Moles won't generate an assembly and will instead put the source directly into your test project. In my opinion, this is much cleaner. Now you don't have binary files to keep track of and fewer assembly references in your test project.

Saturday, April 17, 2010

Browsing the GAC

I learned a long time ago that there was a way to disable the Windows shell extension that puts a pretty but often annoying face over C:\windows\assembly\. It's nice to be able to drag assemblies to it or do a right-click/delete, but it's irritating not to be able to treat it like a regular folder when you want to.

The standard way to disable the facade is to flip a bit in the registry: in HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Fusion, set "DisableCacheViewer" to dword:00000001. You can flip it back to all zeroes if you want to turn it back on.

I happened to be doing a bit of digging to find this information again for the 20th time and found the first comment on this blog post here, which says that if you type C:\windows\assembly\gac into the Run box, it will disable the shell extension for that Explorer window and let you move around the GAC folders. Very nice. It works with other sub-folders of assembly as well, so you can browse straight to gac_msil if you like. It also works from the Win7 start menu smart-search text box, but it won't work if you type it into an Explorer path bar.

The standard warning applies: don't move stuff around in the GAC or add or delete things manually. The standard explorer view is best for being able to quickly copy assemblies out.

How I Came Back to OneNote

Around the time I was in high school I started paying more attention to the ways I try to organize information and "present" it to myself. Since then, I've been constantly experimenting and revising the way I take notes and keep myself organized. I just went through a flurry of activity in the last few months and wanted to share about it.

I have found my way to OneNote a few times since its release, and I have always ended up letting it go. I spend a month or two sorting my thoughts into neat hierarchies of notebooks, tabs and pages, diligently typing in meeting notes and other thoughts, and at first the organization seems to make so much sense. Unlike some people, though, my problem isn't that my will to organize everything falls apart. My problem has always been that my notebooks become big black holes of information: I put lots of great things there and then never look at them again. I would always play along with the multiple levels of nesting and hierarchically organize everything, and the information would get buried and a lot of links between important clusters of notes would get lost. I'm not talking about hyperlinks, I'm talking about the importance of mentally associating multiple items with each other.

When my brain finally turned out the preceding sentence a month or two ago while I was trying to figure out how to organize myself, I thought, "clearly I learn and store information visually/spatially, so why aren't I taking notes that way?" I started using Post-Its, but quickly came to hate their size and unneeded stickiness. That weekend, while I was at the store, I picked up some colored notecards and erasers and put together a Hipster PDA. I loved it and my new ability to spread out and rearrange notecards, for about a month: My handwriting is decent, but the keyboard is a more elegant weapon for a more civilized age. I've gotten so used to almost being able to keep up with my brain dumps by typing that handwriting feels glacial by comparison. Plus, by moving back to pencil and paper, I was losing out on so many useful tools that are so obvious on a computer.

My next stop was Win7's sticky notes. They're simple but they're not reliable, they're limited to your desktop and they offer very little useful functionality. This lasted about three days, but it got me wondering what kind of spatial organization tools were available in OneNote. I did a little searching and bumped into OneNote Canvas, which immediately looked familiar because I've seen demonstrations done with pptPlex before. I got as far as installing it and getting to the screen where it recommended not using it if you share your notebooks via fileshares (sorry Canvas, but you lost me there). Going back to look at the features, I realized it probably wasn't that great of a fit anyways - the focus was still on pages, as if one was writing in a Word doc and just wanted to clump some pages together. I wanted to get away from that - I wanted to clump ideas together and make things easy to see. I looked at the remainder of the digital sticky notes on my desktop and thought, "why can't I just use OneNote this way? What if I just shut off the obsessive-compulsive instinct to separate everything into notebooks, tabs, pages, paragraphs and bullet lists and just started with a single blank page?"

I fired up OneNote, and I don't think I've closed it since. I think my "note cloud" (this blog post serves as dated proof of prior art of that term and concept if I ever get to patent it!) is here to stay.

The main attraction is my note cloud, a single page that has note containers and a few other types of information scattered all over it. All notes "start" here - if I go to a meeting, have an idea, or someone comes in my office and I need to write something down, it goes in the note cloud. Over time, if I end up having a few scattered blocks that are related, I can just drag them together to associate them. If a block starts to take on content and structure, I'll move it to a new page and replace it in the note cloud with a hyperlink to that page (in OneNote, you can right-click any notebook, tab, page or even individual paragraph and generate a link to it that gets sent to the clipboard). I frequently zoom with Ctrl+MouseWheel to get a better view, so I'll change the font size, weight, highlight color or text color of important things or tag them with a star or something else appropriate. This is actually where I thought of the "note cloud" name - the different font sizes and weights interspersed look a lot like a tag cloud. I don't have a tablet PC, but I can use the drawing tools with the mouse to scribble a broad highlight or circle a few things. The familiar Ctrl+F is "find on this page" in OneNote and I make use of it frequently.

I look at my note cloud constantly and I'm always writing stuff there - it's almost like a desk blotter that I can scribble notes on. I don't have to worry about losing or forgetting about important notes because I'm looking at them all the time, and in the case of a big cluster of notes that I don't need all upfront, I instead have a single link to another page that will take me straight to the content. I've also got another page right next to the cloud, "note cloud trash", where I'll paste old notes cut from the cloud that I might want to have around for a bit to remind me if I did or didn't do something. As this page grows, I'll probably burn the old brush out every once in a while.

So far, I've only mentioned the most basic features in OneNote. I doubt there's anyone that uses all of OneNote's features (there are zillions of really important features that aren't ribbon buttons or context menu items, but things that "just work," a lot of them having to do with Office suite integration and drag/paste support), but there are a few I've come to really love that have basically turned OneNote into a "smart desktop" for me.

The first is that OneNote is starting to replace my web favorites/bookmarks for research-related tasks. Bookmark titles and folder organization just don't cut it for me for anything except general favorites. With OneNote, I can paste in URLs or drag favicons in from the browser and clump a bunch of links together with other text to associate all of them. Never again will I put a web bookmark on my desktop, lose a bookmark for a specific topic, or wonder if I bookmarked something or not - I have developed a new reflex that automatically pastes URLs of interesting pages into OneNote. This is the new school of research and gathering sources: I remember thinking that gathering and citing sources was such a chore in school, but with OneNote I've got a list of seven or eight interesting links just for this blog post (and let's be honest, this post isn't even that interesting). Amassing links for later perusal and comprehension (or printing for take-home reading, which I've become a big fan of) is incredibly easy.

Another organizational trouble spot I've had since I embarked on my career and was responsible for "action items" is how to keep track of requests sent to me via email, and how to remember to stay on top of people that haven't yet completed requests I have made of them. For the latter, I used to have a set of "waiting on" pages in OneNote, but like everything else they got subdivided into oblivion and buried. I learned firsthand that mail folders are meant for archive organization when I tried to work with a "Later" folder for stuff I didn't want in my Inbox. I summarily renamed that folder "Never" right before trashing it. I never used it because I knew I'd never look in it. Why did I have so many places (mail folders, notes pages, bookmark lists) to look for stuff I wanted right in front of me all the time?

My favorite new trick is to drag emails from Outlook into OneNote and select "insert a copy of the file onto the page". This gives you a bite-sized email icon with the subject line right on the page, and you can double-click it to open it. This has allowed me to do something I have wanted to do for ages, which is to stop using my Outlook folders, including my (now empty!) inbox, for information that I want to have in front of me. I can drag multiple emails into a little group with some text notes to associate them all. If I have an email associated with a task and I complete that task, I can just double-click the email and Reply to it. If I am waiting on someone else to complete something, I can put my request email next to the note and double click it if I want to send a reminder. If a lot of emails come through for a given task I'll replace the one in my OneNote with the newest, so I can see the full thread. The object in OneNote is a copy of the email, so I can sort the original however I like, and if I really need to find it in my mail folder I can use information from the copy to search for it. I've also started using this feature to keep a little collection of emails that might benefit me during the next performance review - I hate keeping copies of emails in my mail folders but I want to sort emails appropriately based on project or task, so I keep the copies in OneNote, along with any other notes or information I can use at review time.

If I ever get around to figuring out how to file feature requests for Office products, I'd love to see first-class support for spatial note taking like this. Even if that doesn't happen, though, I have a few ideas that would help out my workflow and make it even more effective:
• Make zooming and panning a little easier. Zooming is really flaky around the edges of a page and doesn't like to stay centered on the cursor, and zoom settings are used application-wide, not per page. Panning by holding middle-click seems to amplify the mouse sensitivity. Mine is already very high and when I try to pan this way the viewport jumps all over.
• Provide a per-note-container option to make an outline around the note cluster. This would help to automatically visually separate notes that are close together but aren't really associated.
• Some way to link to (as opposed to copy) a whole conversation-threaded Outlook 2010 email thread
• OneNote has a bug where if you drag a favicon from a page with a long URL into it, the text of the entire URL will get pasted, but only about 100 characters of it will become an active link, and the link target will only include those characters (i.e. you have a broken link and you need to recopy the URL to fix it). The workaround is to copy the URL from the address bar and paste it instead of dragging the favicon. Edit: This is incorrect; copying and pasting the URL runs into the same problem, and it's very annoying. I'm using OneNote 2010 Beta but I'll have to see if it's fixed in RTM.

We'll see how long this strategy lives for, but I think it's going to stick around for quite a while. The nice thing about it is that you can scale it: if you have tons of discrete things to keep track of, you can still make pages or sections or whole notebooks for them and use your cloud as an index/table of contents with links to stuff so it's still visually in front of you at all times.

My favorite part out of all of this is that I've used the verb "associate" a couple times in this post, but I'm not referring to a feature of OneNote or another piece of software - I'm referring to a feature of the human brain. With my note cloud, everything is right in front of me. By using the simple features of the tool to organize information spatially, I can rely on my brain's cognitive map to make connections and remember things. I finally feel comfortable doing a bunch of work and then dropping it for a while because I know I'm going to be able to pick it up again later. I think this is about as close to a mind/machine merge as I'm going to get until someone releases some new hardware.

Wednesday, April 14, 2010

Motivation

I'm not going to add a big preface to this video, except to say that it's one of my favorite TED talks (if you aren't familiar with TED, go get a week's worth of food and water and click that link, and don't forget to remind your loved ones you're still alive). This is the author Dan Pink talking about what motivates people, particularly in the workplace.

I haven't read Mr. Pink's book related to this video, Drive, but as someone who is pretty dialed into his career and loves learning, I really identify with the message he delivers in this talk. One of my uber-managers, about four levels above me, gives a talk occasionally about performance. He puts a big emphasis on thinking about and understanding what you really want out of your career, whether it's money, respect, recognition, whatever. Especially for someone like me, "autonomy, mastery and purpose" is a tough list to beat. "The freedom to become better at doing things that make a difference."

I really like finding the right words to convey an idea, and those three in particular really shine on their own. What's interesting to me is that they work so well together, but the way one achieves each of those items in the workplace is different. With some exceptions, autonomy is something granted to you based on where you work and who you work for (not to say you don't have any control over that). Mastery is something you have to earn through study, practice and introspection. Purpose is something you have to find - you can't ever really have purpose given to you, only suggested. It doesn't become your purpose until you internalize and embrace it.

If I was in charge of a large corporation, I would not only be interested in granting my employees these three things, but I would be looking to hire people that truly desired them. Mr. Pink says that most everybody wants these things, but I have a feeling that the people who want them the most are the ones you want working for you.

Thanks to Lessons of Failure for reminding me about this video. Great post too!

TFS Labels

I've been spending some time reading up on TFS and how branching, labeling and some of the other more advanced features work in an effort to concoct a new code management process for my team. Our current process is nice and simple, but it's a little bit too simple when we're trying to maintain a clean build for test, introduce fixes that require multiple developers or breaking changes, and think about/work on the next version all at the same time.

In a few future posts I'll talk about some of my findings regarding branching (virtually all of them come from the TFS Branching Guidance, which I cannot recommend enough), but I wanted to devote a couple paragraphs to what I have learned about labels in TFS.

Prior to today, whenever I heard "source control label", I thought of it in the most simplistic sense, i.e. the way it was implemented in Visual SourceSafe. Point at a folder, create a label name, and poof - you have a snapshot-in-time of the codebase. If you load up the history for a file, you can clearly see which revisions were labeled and what the label name was.

Labels in TFS do not work this way. The biggest tip-off to this I had was this blog post, but there's no analogy or walkthrough/instruction in there that shows you what he means. I want to provide both of those things here.

The analogy: Applying tags to photos or music. Most modern personal media-organization applications have functionality that lets you "tag" individual items and then view a list of all items that have a specific tag. Windows Live Photo Gallery is a good example:

You can create a new tag at any time and apply it to as many images as you want. You are always free to remove the tag later, individual images can have multiple tags on them, and deleting a tag doesn't delete the tagged images, it just removes that tag from all images that have it. On the main WLPG screen, you can click a tag name in the list of tag names and it will show you only images with that tag applied.

TFS labels use the exact same concept, but with a twist that makes them way more powerful than the tags on your photo album. When you apply a label, you aren't applying it to a file, you are applying it to a version of a file.

Is this a big deal? It depends on whether or not you make use of it. You can definitely continue to use labels in much the same way as you can in VSS, simply tagging a whole tree with a label at a given point in time. You can also get more fancy with them, using them to form collections of specific revisions of files.

Labels come with a big warning though - labels themselves are not version controlled, meaning that there is no way to track or audit the history of activity on a label. Plus, labels don't keep copies of versions in case of a file deletion, so if a file gets deleted, any label that relies on a version of that file is essentially hosed. These reasons are why it makes more sense to branch for major releases instead of labeling them.

Here are a couple good resources on TFS labels:

Why TFS labels aren't like SourceSafe labels
Virtues and Pitfalls of the TFS Label
Also, TFS 2010 has rolled an important feature from the TFS Power Tools into the product: the ability to roll back to a label and have it show up as a pending change type of "rollback." See "Rollback or Undo a Changeset in TFS 2010 Version Control" for details. The only caveat is that it's only available from the command line.

Thursday, April 8, 2010

TFS Unshelve and Merge

EDIT: I was wrong: you can manually merge changes. All you have to do is click Resolve instead of Auto-Merge, leave the default settings on the new dialog, and click OK. The text doesn't make it obvious but this takes you to the manual merge screen.

I had a situation yesterday where I needed to unshelve and merge multiple shelvesets in TFS that had a few common files between them. I was surprised to find that after unshelving the first one, I was unable to unshelve the others. It turns out that TFS does not support merging on unshelve, so when it sees one of the files in the set marked as checked out, it rejects the whole unshelve operation.

I thought I was stuck until I found this blog post that outlines how to do it with TFS Power Tools, which I fortunately already have installed. As far as I can tell, you can only auto-merge with this feature, so it's a bit limited, but it's better than nothing. It also unfortunately requires some command-line activity, but just a tiny bit. Also, contrary to the linked post's advice, copying tfpt.exe to the mapped solution directory actually caused it to fail, as it couldn't find some dependent assemblies. Instead, I just got the full path to the .exe (on my 64-bit machine, "C:\Program Files (x86)\Microsoft Team Foundation Server 2008 Power Tools\TFPT.exe"), and ran that while in the solution directory at the command line.

Looks like I need to talk to my team and manager about implementing a branching strategy, because I think we're trying to use shelvesets to do too much.

Wednesday, April 7, 2010

Moles: Where's MAssembly?

Unfortunately, due to the special nature of Moles, some types in mscorlib simply can’t be moled. System.Reflection.Assembly is one of the most visible ones. The Moles team has stated that there’s no workaround, and things probably aren’t going to change in this regard. I imagine that trying to mix Moles’ IL-rewriting trick with advanced uses of reflection and a few other things in the framework is black magic just a shade too dark to be safe =)

Moles and InternalsVisibleTo

The InternalsVisibleTo assembly-level attribute is a pretty cool little bit of wizardry. By adding one (typically to your AssemblyInfo.cs file), you can declare a “friend” assembly that gets access to all types and members marked “internal.” If you’ve ever right clicked an internal type or member in VS and selected “Create Unit Test,” you know that VS will generate this for you. It’s great for unit testing – while some people believe that you should only unit test the external interfaces of your types, I don’t see any reason not to break it down further when you can.

Moles assemblies can take advantage of InternalsVisibleTo as well, but they need their own InternalsVisibleTo attribute in your assembly specifically for them. Fortunately, it’s not too tough.

For unsigned assemblies, giving Moles access to the internals is as simple as creating an InternalsVisibleTo attribute in your AssemblyInfo pointing to the Moles assembly name:

[assembly: InternalsVisibleTo("MyProject.MyClasses.Moles")]

An assembly can have more than one InternalsVisibleTo attribute, so you can point one to your test project and one to the Moles assembly.
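For the unsigned case, the relevant part of AssemblyInfo.cs might look like the sketch below. The test project name MyProject.MyClasses.Tests is a made-up example; substitute the actual name of your test assembly.

```csharp
using System.Runtime.CompilerServices;

// Friend #1: the unit test project (hypothetical name).
[assembly: InternalsVisibleTo("MyProject.MyClasses.Tests")]

// Friend #2: the generated Moles assembly for this project.
[assembly: InternalsVisibleTo("MyProject.MyClasses.Moles")]
```

The two attributes are independent, so you can add or remove either one without affecting the other.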

If your assembly is signed, there are a couple other hoops to jump through. Information floating around on the Internet confuses things a bit, but the fact of the matter is that if the assembly you want to increase visibility to is signed, the friend assembly has to be signed as well, and the InternalsVisibleTo attribute has to include the public key of the friend assembly. Not the public key token, but the entire public key. Unfortunately, the IntelliSense documentation for InternalsVisibleTo has absolutely no reference to this and does not show how to format things properly. The correct format is like so (with an example of a real public key, so as not to confuse you with brackets and other things, and to show you just how long they are):

[assembly: InternalsVisibleTo("MyProject.MyClasses.MyFriendAssembly, PublicKey=0024000004800000940000000602000000240000525341310004000001000100b3a1f96b163680bfd8f36cd4a2ee94accb680d97d407ed8898abc1710662205878d27c138902bd3a08a19c6b2afacdf95a4de2cbd3b6fd2cc18e08540ad79cdea4ab5e81ea3c8191fbfdb72efa90c6d750224ae05fae6ab9fe1ebc6958423dbb644a36c7019dc8388e925a802e33d1902ed293fd0f420a1dcb4e135cff7ee8c5")]

But how do you sign a Moles assembly, and how do you get the public key from it? It’s actually pretty simple. If you mole a signed assembly, the Moles assembly will be signed with a key packaged up in the framework and used specifically for this purpose. To get the key, first generate the Moles assembly without trying to use InternalsVisibleTo. Once you’ve got it, open up a Visual Studio command prompt, and do (case sensitive) “sn -Tp MyProject.MyClasses.Moles.dll”. This will spit out the public key, which you can use to create the InternalsVisibleTo attribute on your assembly. Finally, rebuild both projects to regenerate the Moles assembly, and you’re good to go.

Moling Constructors and Other Instance Functionality

Last time, we saw how to spin up a new mole instance and hand it in for use as part of a test. What about the case where you don’t even get a chance to inject the dependency – where the class under test just news it right up without asking anyone?

There are two ways to beat this – constructor moling and AllInstances. First I’m going to talk about constructor moling, then AllInstances, and then I’m going to give a little bit of information as to how this all works. I don’t know about you but sometimes a little bit of understanding about what’s going on behind the scenes really helps me to perceive how to use a particular tool.

Let’s resurrect Example from the last post, but without the possibility for constructor injection.

public class Example
{
    MyDependency m_dependency;

    public Example()
    {
        m_dependency = new MyDependency(10);
    }

    public string Execute()
    {
        return "Dependency string is: " + m_dependency.MyDependencyMethod();
    }
}

public class MyDependency
{
    int m_dependencyInt;

    public MyDependency(int myInt)
    {
        m_dependencyInt = myInt;
    }

    public string MyDependencyMethod()
    {
        return "MyInt times 2 = " + m_dependencyInt*2;
    }
}

Code like the above is written all the time, and it’s the wrong way to do things (I’m growing as a developer, so I myself didn’t really realize why until recently). Whoever uses Example never gets to say anything about the code it depends on, and if the code it depends on is important enough to be in its own class, that’s probably a bad thing. Maybe in the domain this code is written for, it might make sense, but even then this code can’t be unit tested… unless we bring in Moles!

Since we never get a chance to drop in our own MyDependency or a mole of it, what can we do? One option is to change what “new MyDependency(10)” does. Witness the power:

[TestMethod()]
[HostType("Moles")]
public void ExecuteTest()
{
    MMyDependency.ConstructorInt32 = (inst, intArg) =>
    {
        new MMyDependency(inst)
        {
            MyDependencyMethod = () => "Moled! Oh yeah, and the int you gave me was " + intArg
        };
    };
    Example target = new Example();
    string actual = target.Execute();
}

Yow, what’s going on here? When you mole a type that can be instantiated, each constructor for the type becomes a static property on the mole type. The type of the property is an Action<>, which you should recognize as a delegate that takes parameters of the types inside the angle brackets and returns nothing. The first type inside the angle brackets is always the type being moled (whaaaaaa….?), and the rest are determined by the parameters the constructor takes. The constructor we mole takes a single parameter of type Int32, so the type of MMyDependency.ConstructorInt32 is Action<MyDependency, int>. But… an Action doesn’t return anything, so how could this possibly be a constructor? And what is “inst”? I’ll explain later in this post.

The body of the lambda is essentially how every moled constructor will look: you create an instance of the moled type using the constructor that takes an instance of the type being moled (see how we pass in inst when newing up MMyDependency?), and you mole the methods on that instance by setting the appropriate properties. Constructor moling is the only place you use the mole type’s constructor overload that takes a parameter, and you always use this constructor overload when moling a constructor.

So what’s the other way to control what happens when Example calls MyDependencyMethod? The MMyDependency type has a public static class nested inside of it called AllInstances that, like a mole instance, has properties corresponding to each instance method. The difference between the delegate types of these properties and the delegate types on a mole instance’s properties is that these delegate types have an extra parameter tacked on to the front, of the type of the class being moled. You don’t have to do anything with it, but you still have to declare it. Here’s an example of moling MyDependencyMethod on all instances of MyDependency that get created in the scope of this test:

[TestMethod()]
[HostType("Moles")]
public void ExecuteTest()
{
    MMyDependency.AllInstances.MyDependencyMethod = (inst) => "Moled!";

    Example target = new Example();
    string actual = target.Execute();
}

So now a little bit about what’s going on behind the scenes. I don’t have a whole lot of information on _Detours, the API that sits underneath Moles that actually manages registration of method detours, but I do know that it operates with a few simple method calls that take in a specific instance and information about the method being detoured. When you generate a mole for a class that can be instantiated, what you get is a MoleBase<T>, where T is the type being moled. This class is actually a temporary “holster” for an instance of type T that gives you access to those _Detours methods via simple properties. When you create a new mole instance from scratch in your test, _Detours uses a neat trick to instantiate a “raw” instance of type T that is completely uninitialized – no constructors are run and even type initializers aren’t run – and “holsters” it. When you set moles on methods via properties on the MoleBase<T>, you are calling through to _Detours to detour methods on that instance. Finally, when you pass in the mole instance to a constructor or whatever else uses it as a dependency, the MoleBase<T> just hands back the nested instance of T (this is done with an override to the “implicit conversion” operator). When your tested code runs, it depends on a completely uninitialized instance of T that’s been registered in _Detours.

This also explains the need for the first parameter of type T that you are required to have when you mole a constructor. If you look closely, the delegate required when you mole a constructor is an Action<>: it doesn’t return anything. What Moles does for you is create a “raw” instance of T, hand it to your delegate so you can use the Moles API to register detours, and then that registered instance is returned to your code under test.
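To make the "raw instance handed to an Action<>" idea concrete, here is a minimal sketch in plain C# that does not use Moles at all. I don't know what _Detours really does internally; this only assumes a raw instance can be created with FormatterServices.GetUninitializedObject (a real .NET API, flagged as obsolete in the newest .NET versions but still functional). ConstructorDetour, Construct and the Value property are hypothetical names added for illustration.

```csharp
using System;
using System.Runtime.Serialization;

public class MyDependency
{
    int m_dependencyInt;

    public MyDependency(int myInt)
    {
        m_dependencyInt = myInt;
    }

    // Hypothetical accessor, added only so the sketch can inspect the field.
    public int Value { get { return m_dependencyInt; } }
}

public static class RawInstanceSketch
{
    // Hypothetical stand-in for a constructor mole such as ConstructorInt32.
    public static Action<MyDependency, int> ConstructorDetour;

    // Hypothetical stand-in for what "new MyDependency(arg)" turns into
    // once the constructor has been detoured.
    public static MyDependency Construct(int arg)
    {
        if (ConstructorDetour == null)
            return new MyDependency(arg);

        // No constructor runs here, so every field keeps its zero value.
        var raw = (MyDependency)FormatterServices.GetUninitializedObject(typeof(MyDependency));

        // The delegate returns nothing; it only gets a chance to work on the raw instance.
        ConstructorDetour(raw, arg);
        return raw;
    }

    public static void Main()
    {
        ConstructorDetour = (inst, intArg) =>
            Console.WriteLine("Detoured constructor got " + intArg);

        MyDependency d = Construct(10);
        Console.WriteLine(d.Value); // prints 0: the real constructor never ran
    }
}
```

The point of the sketch is that the delegate never has to return an instance: the instance already exists before the delegate is called, which is why the delegate can be an Action<>.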

Pretty cool huh?

This ends my deep dive into Moles. I could go deeper but I’m starting to get away from the practical and into “Moles theory.” In reality, any test code trying to do something really fancy with the Moles API in the course of trying to get real, productive work done is probably abusing it – what you should be doing is trying to do the simplest thing that will let you unit test your code in a practical way. Moles is pretty interesting to play around with but I need to move on to more practical topics. I do have a few more posts about Moles in particular – a couple real-world examples from my work including a case where I needed to mole methods on a type as well as one of the type’s ancestors, working around a few tricky bits, the notable absence of moles for a few important types in mscorlib, and how to use “friend assemblies” with moles – but after that I’m moving on to explore Pex in greater detail.

Mole Instances and Moling Instance Methods

So far, we’ve seen how to mole static methods and how to stub interfaces and abstract and virtual methods. What’s left? Moling instance methods that can’t be overridden by following the rules. Moles aren’t just for static methods – you can also create mole instances that override methods that normally can’t be overridden and swap them in for the real thing.

Here’s the context for today’s post:

public class Example
{
    MyDependency m_dependency;

    public Example()
        : this(new MyDependency(10))
    {
    }

    public Example(MyDependency dependency)
    {
        m_dependency = dependency;
    }

    public string Execute()
    {
        return "Dependency string is: " + m_dependency.MyDependencyMethod();
    }
}

public class MyDependency
{
    int m_dependencyInt;

    public MyDependency(int myInt)
    {
        m_dependencyInt = myInt;
    }

    public string MyDependencyMethod()
    {
        return "MyInt times 2 = " + m_dependencyInt * 2;
    }
}

Here we’ve got a class, Example, that takes a dependency of a specific type into its constructor. The type specified isn’t an interface, but a concrete, non-abstract type. It’s not sealed (although it could be for purposes of this example), but it doesn’t have any virtual methods, so it can’t be replaced with a stub. Something like this could happen in practice when a developer gets a bright idea about constructor dependency injection, but never ends up finding a need to abstract the type of the dependency out to an interface (in real life, what’s more likely is that the developer would just new up an instance of the dependency right within the class – I’ll get to that soon). We’d like to test Example.Execute isolated from its dependencies, so we create a mole instance in our test and hand that in to the Example constructor instead.

[TestMethod()]
[HostType("Moles")]
public void ExecuteTest()
{
    MMyDependency mole = new MMyDependency
    {
        MyDependencyMethod = () => "Moled method!"
    };
    Example target = new Example(mole);
    string actual = target.Execute();
}

I gave MyDependency a little bit of state to give you something to think about – notice that the moled version of MyDependency has no concept of state. When you mole a class, it loses its “classiness” and becomes nothing more than a collection of methods. If you want to give a mole state, you can store that state within variables inside the test method, and then refer to those variables from within your lambda expressions. This works because of closures (warning: that link goes to a piece written by Jon Skeet, meaning that if you read it and keep exploring his other stuff, you will lose most of your day), one of the most beautiful ideas in all of programmingdom. However, think hard before doing this – it’s likely to be way more than you actually need for purposes of your test. One thing you can do, if you want to put the work in, is use state to make a mole or a stub into a mock, with expectations about what methods will be called and with what parameters.
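Here is a rough sketch of the closure idea in plain C#, without Moles. The lambda stands in for a delegate you would assign to a mole or stub property; CallThreeTimes is a hypothetical helper that plays the role of the code under test, and the final check is a hand-rolled, mock-style expectation.

```csharp
using System;
using System.Collections.Generic;

public static class ClosureStateSketch
{
    // Hypothetical stand-in for code under test that calls the moled method.
    public static List<string> CallThreeTimes(Func<string> method)
    {
        var results = new List<string>();
        for (int i = 0; i < 3; i++)
            results.Add(method());
        return results;
    }

    public static void Main()
    {
        // The state lives in the test method, not in the mole.
        int callCount = 0;

        // The lambda captures callCount, so every call sees and updates it.
        Func<string> moledMethod = () =>
        {
            callCount++;
            return "Moled call #" + callCount;
        };

        List<string> results = CallThreeTimes(moledMethod);
        Console.WriteLine(string.Join(", ", results.ToArray()));

        // Mock-style expectation about how often the method was called.
        if (callCount != 3)
            throw new InvalidOperationException("Expected exactly 3 calls");
    }
}
```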

What if your class under test doesn’t let you inject the dependency? Check the next post for information about how to mole constructors, use a mole type to affect all calls to a given instance method on all instances, and a little bit of insight as to how moles work.

Tuesday, April 6, 2010

Why Use Stubs?

If the code under test is nicely abstracted out, with dependencies represented as interfaces that can be injected, why use stubs generated by Pex/Moles and not just implement and derive test doubles yourself? Because it makes a mess! You end up spinning up new classes for every test to customize the behavior and get what you want. Using stubs generated by moles, you get this nice generic “harness” that you can plug lambda expressions into inline, right in your unit test.
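To show the difference, here is a small sketch in plain C#. IGreeter, AlwaysHelloGreeter, DelegateGreeterStub and Welcome are all made up for this illustration; the delegate-based class only approximates the general shape of a generated stub and is not the code Moles actually emits.

```csharp
using System;

// Hypothetical dependency interface.
public interface IGreeter
{
    string Greet(string name);
}

// Approach 1: a hand-written test double. Every new behavior needs a new class.
public class AlwaysHelloGreeter : IGreeter
{
    public string Greet(string name) { return "Hello"; }
}

// Approach 2: one reusable harness whose behavior is plugged in per test.
public class DelegateGreeterStub : IGreeter
{
    public Func<string, string> GreetDelegate;

    public string Greet(string name)
    {
        if (GreetDelegate == null)
            throw new NotImplementedException("Greet was not stubbed");
        return GreetDelegate(name);
    }
}

public static class StubHarnessSketch
{
    // Hypothetical code under test.
    public static string Welcome(IGreeter greeter, string name)
    {
        return greeter.Greet(name) + "!";
    }

    public static void Main()
    {
        // The behavior is declared inline, right where the test reads it.
        var stub = new DelegateGreeterStub { GreetDelegate = name => "Hi " + name };
        Console.WriteLine(Welcome(stub, "Ann")); // prints "Hi Ann!"
    }
}
```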

One question I’ve seen on the community forums a couple times is, to paraphrase, “why would I ever use stubs?” Moles are powerful, easy, expressive and global. Why screw around with other tools? The answers in those threads are pretty good, but I’d like to expand on them a little bit. There are three “technical” reasons and one “religious” reason (not to imply it’s only presented by zealots – I just wanted to discriminate between reasons that are scientific in nature and those that are more design-centric).

The easy first reason is “performance”. Moles do runtime code rewriting, whereas stubs are simple derivatives of classes, so moles add a lot of overhead. Additionally, if you’re using Moles, you have to use the Moles host, which also adds overhead. Stubs do not require it. With all that said, I’m not really convinced that this is a solid reason to prefer stubs over moles unless you’re doing a lot of stubbing/moling in a lot of test cases – unit tests will still only take a second or two to run otherwise.

The next, more interesting reason is that there is one important thing stubs can do that moles cannot: provide implementations for interfaces and abstract methods. Moles require that method implementations exist in the first place so you can detour them. If there’s no implementation, you need to create one with a stub.

The last technically-oriented reason is related to deployment. You can use stubs simply by GACing or “binning” the Moles framework assemblies, but using Moles requires that the Moles installer has been run on the machine to install and register the rewriting CLR profiler. If you have a build machine you’re running tests on and you have limitations as to how you can modify it, you may want to avoid using moles.

The “religious” reason is that you should use the simplest possible thing that does the job. If a stub can do the job, use a stub, unless the stub forces you through a lot of hoops and a mole would mean fewer of them. Stubs may use code generation, but they don’t use “magic” and they are built on standard object-oriented principles that are easy to explain. They represent a process that you can easily implement without them (by spinning up new implementations of interfaces and derivations of abstract classes), but they result in less code, and code that’s easier to read because it’s all in one place.

An interesting corollary: if you find yourself using a lot of moles in a given unit test, it likely means that your code is very tightly coupled to a lot of external dependencies. Perhaps you should consider refactoring a bit so that stubs are a more realistic option.

Stubs!

Stubs are the boring, old-fashioned cousins of moles. They don’t break any rules and don’t do anything you couldn’t technically do yourself. However, this doesn’t mean that they don’t have their place. I’m going to be talking about stubs in this post, largely because I’ve used the word “mole” so much in the last few days that it’s starting to lose meaning!

If you look back to the post about rules for generating moles and stubs, you’ll see that you can only generate stubs for interfaces and for classes that can be derived from and instantiated: this means no sealed classes and no static classes. A stub type by itself is useless – you have to create an instance of it, and then set delegates for methods and set properties on that instance for it to be useful. You can then hand that stub in as a parameter to a constructor for a class under test, or as a parameter for a method under test.

The basic behavior of a stub when you call a method on it is as follows:

1. If a custom delegate has been set for the method, that’s what gets called.
2. If not, and if the CallBase property on the stub is set to true, the base implementation will be called. This only applies to virtual methods that your stub overrides: abstract method declarations and method declarations in interfaces have no implementation.
3. If neither of the above applies, the Behavior of the instance is used to determine the course of action.
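The three steps above can be sketched in plain C# roughly as follows. This is not the code Moles generates; Calculator, CalculatorStubSketch and the StubBehaviorSketch enum are hypothetical names that only model the order of the checks.

```csharp
using System;

// Hypothetical class with a virtual method that a stub can override.
public class Calculator
{
    public virtual int Twice(int x) { return x * 2; }
}

// Hypothetical model of the two stub behaviors.
public enum StubBehaviorSketch { NotImplemented, DefaultValue }

public class CalculatorStubSketch : Calculator
{
    public Func<int, int> TwiceDelegate;
    public bool CallBase;
    public StubBehaviorSketch Behavior = StubBehaviorSketch.NotImplemented;

    public override int Twice(int x)
    {
        // 1. A custom delegate always wins.
        if (TwiceDelegate != null) return TwiceDelegate(x);

        // 2. Otherwise fall back to the base implementation, if asked to.
        if (CallBase) return base.Twice(x);

        // 3. Otherwise the behavior decides.
        if (Behavior == StubBehaviorSketch.DefaultValue) return default(int);
        throw new NotImplementedException("Twice was not stubbed");
    }
}

public static class StubDispatchSketch
{
    public static void Main()
    {
        var stub = new CalculatorStubSketch();

        stub.TwiceDelegate = x => x + 100;
        Console.WriteLine(stub.Twice(1)); // prints 101: the delegate is used

        stub.TwiceDelegate = null;
        stub.CallBase = true;
        Console.WriteLine(stub.Twice(1)); // prints 2: the base implementation is used

        stub.CallBase = false;
        stub.Behavior = StubBehaviorSketch.DefaultValue;
        Console.WriteLine(stub.Twice(1)); // prints 0: the default value of int
    }
}
```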

Like moles, stubs have behaviors too. They are in the BehavedBehaviors class and there are only a couple: DefaultValue and NotImplemented. These behave just like the behaviors with the same names on moles. Unlike static methods on mole types, the default behavior for a stub instance is NotImplemented, so if you forget to set a delegate for a method that gets called, your test will fail with an exception. Also, if you want, you can set the BehavedBehaviors.Current property to a given behavior to globally set behaviors for all stubs whose behavior hasn’t been explicitly set.

Stubs seem a lot less powerful than moles, so why would you ever use them? See the next post!

Moling Static Methods

Arguably, the simplest types of moles are those on static methods. They’re nice and easy because you don’t have to worry about handling behaviors and implementations on specific instances, and the way things work is pretty cut-and-dried: calls to methods that you mole will use your overriding implementation, and calls to methods that you don’t mole will call the real implementation unless you give it something else to do. Here’s my contrived example for this post (I promise I’ll start using more real world examples soon, but for now, it’s easier to demonstrate the behavior of moles this way):

namespace MyProjectToTest
{
    public static class MyStaticClass
    {
        public static int MyStaticAdder(int a, int b)
        {
            return a + b;
        }

        public static int MyStaticSubtractor(int a, int b)
        {
            return a - b;
        }
    }

    public class ClassToTest
    {
        public int MethodToTest(int a, int b)
        {
            return MyStaticClass.MyStaticAdder(a, b) * MyStaticClass.MyStaticSubtractor(a, b);
        }
    }
}

If I set up a test project for this assembly and mole it, here’s what IntelliSense shows me in the Moles namespace:

We got a mole for the static class and a mole and a stub for the other one. This is in line with the rules I posted previously – our static class can’t be instantiated, so it only gets a mole. ClassToTest is a non-abstract, non-sealed class, so it gets a stub and a mole. Let’s look at MMyStaticClass:

Let’s ignore the methods and the Behavior property for the moment. On our mole type, we have static properties named after static methods on MyStaticClass. If you look at the types of the properties, they are both Func delegates, which match the signature of the methods being moled. If you assign delegates to these properties, your delegates will be run instead of the real implementation whenever the method is called. That right there is the magic of moles.

Let’s take a look at an example of how to do this:

[TestMethod()]
[HostType("Moles")]
public void MyTest()
{
    ClassToTest instance = new ClassToTest();
    int result = instance.MethodToTest(8, 4);  // (8+4)*(8-4) = 48
    MMyStaticClass.MyStaticAdderInt32Int32 = (a, b) => a / b; //overriding the static method
    result = instance.MethodToTest(8, 4); // (8/4)*(8-4) = 8. Note we didn't mole MyStaticSubtractor.
}

In the test, we first run MethodToTest, and we can see if we look in the debugger that it returns the expected result, 48. Next, we detour the MyStaticAdder method to perform a division instead. All we have to do is change that static property on the MMyStaticClass type, and the method is detoured – we don’t “new” anything, and by using lambda expressions, our code is short and neat and we don’t even have to do any work outside of the test method! The next calculation makes use of our detour instead of the real implementation. Try running it through the debugger to see what happens. This is a pretty contrived example, but start thinking in terms of classes and dependencies: if you are testing a class that has an ugly, not-abstracted-away hard dependency on a static method you don’t want called during your tests, like File.ReadAllText or Directory.Exists, you can just mole it away (there is a little caveat about moling File and some other BCL methods that I’ll talk about later, but compiler errors will show you what to do if you run into it).

At this point, you may be thinking, “man, I certainly have to know a lot about the source code I’m working with to know what needs to be moled/stubbed/etc.” You would be correct – if you look at the Pex homepage, you’ll see the term “white box testing” right there in the title. White-box testing implies that the implementation under test is fully visible to you, and you just need to verify that it works correctly.

There is a technique you can use, suggested by the documentation, to help you find methods that need to be moled – get all the unmoled methods on a type to throw exceptions if you try to call them! Moles and stubs all have a Behavior property that you can set that will determine what happens when you call methods that you didn’t mole.

If you look in the Microsoft.Moles.Framework.MoleBehaviors class, there are a few properties that return instances of IMoleBehavior. The useful ones are Fallthrough, DefaultValue (“default” here refers to the functionality – this is not the default behavior) and NotImplemented. Fallthrough is the behavior we’ve seen so far – call the real implementation if no mole has been set. In the example above, we didn’t mole MyStaticSubtractor, so the real method still got called. Note that Fallthrough is only the default for static methods; I will discuss instance methods in a later post. DefaultValue replaces not-explicitly-moled methods with a stub that simply returns the default value of the return type – it’s a quick way to say “I don’t care what any of these methods that I’m not writing replacements for do, but I don’t want them to throw exceptions and I don’t want them to call the real methods”. NotImplemented will cause a MoleNotImplementedException to be thrown if a method is called on the class that isn’t explicitly moled (this is the default for instance methods).
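As a rough mental model of the three behaviors, here is a sketch in plain C# with no Moles involved. Real moles rewrite the IL at runtime; this version just checks a delegate inside each method. BehaviorSketch, AdderDetour, SubtractorDetour and Dispatch are hypothetical names, and the sketch throws a plain NotImplementedException rather than the Moles exception type.

```csharp
using System;

// Hypothetical model of the three mole behaviors.
public enum BehaviorSketch { Fallthrough, DefaultValue, NotImplemented }

public static class MyStaticClassSketch
{
    // Hypothetical stand-ins for the static mole properties.
    public static Func<int, int, int> AdderDetour;
    public static Func<int, int, int> SubtractorDetour;
    public static BehaviorSketch Behavior = BehaviorSketch.Fallthrough;

    public static int MyStaticAdder(int a, int b)
    {
        return Dispatch(AdderDetour, (x, y) => x + y, a, b);
    }

    public static int MyStaticSubtractor(int a, int b)
    {
        return Dispatch(SubtractorDetour, (x, y) => x - y, a, b);
    }

    static int Dispatch(Func<int, int, int> detour, Func<int, int, int> real, int a, int b)
    {
        // An explicit detour always wins.
        if (detour != null) return detour(a, b);

        // Otherwise the behavior decides what an un-moled method does.
        switch (Behavior)
        {
            case BehaviorSketch.DefaultValue:
                return default(int);
            case BehaviorSketch.NotImplemented:
                throw new NotImplementedException("Method was not moled");
            default:
                return real(a, b);
        }
    }
}

public static class BehaviorDemo
{
    public static int MethodToTest(int a, int b)
    {
        return MyStaticClassSketch.MyStaticAdder(a, b) * MyStaticClassSketch.MyStaticSubtractor(a, b);
    }

    public static void Main()
    {
        MyStaticClassSketch.AdderDetour = (a, b) => a / b;
        Console.WriteLine(MethodToTest(8, 4)); // Fallthrough: (8/4)*(8-4) = 8

        MyStaticClassSketch.Behavior = BehaviorSketch.DefaultValue;
        Console.WriteLine(MethodToTest(8, 4)); // DefaultValue: (8/4)*0 = 0

        MyStaticClassSketch.Behavior = BehaviorSketch.NotImplemented;
        try { MethodToTest(8, 4); }
        catch (NotImplementedException e) { Console.WriteLine("Threw: " + e.Message); }
    }
}
```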

This last behavior is great for safely figuring out what needs to be moled in order to make your test successful. Usage of it is expected to be so common that there’s a shortcut for setting it as the behavior of a mole: MMyMoledClass.BehaveAsNotImplemented(). Some of the official Moles documentation suggests doing this for all of your moles and running your test method repeatedly, moling each method that throws an exception, until your test stops throwing exceptions. This way, you can make sure you’ve replaced all the functionality you need to for a given test.

Next up: stubs!

Thursday, April 1, 2010

What Did Adding a .Moles File To My Project Do?

Obviously, adding a .moles file to your project added moles and stubs of the types in the targeted assembly. However, sometimes it can be a little difficult to figure out why certain classes got generated, where they are, and exactly what you’re supposed to do with them.

For starters, all moles and stubs end up in namespaces that are the same as the namespaces of the originating types with a “.Moles” slapped on the end. Types in “MyProject.MyClasses” have moles and stubs in “MyProject.MyClasses.Moles.”

IntelliSense your way into those namespaces and you’ll see a bunch of classes starting with “M” and “S.” I’ll let you guess what those stand for.

The basic rules for generating moles and stubs look like this:

1. Interfaces only get stubs generated for them.

2. Classes that can be derived from and instantiated get stubs generated for them, although if they have no abstract or virtual methods, you are limited in what you can do with them. This does not necessarily prevent them from having moles generated for them too, if they meet the criteria.

3. Pretty much everything except interfaces gets a mole generated for it. This includes, but isn’t limited to, things that can’t normally be extended or derived from, such as classes without public constructors, sealed classes, static classes, and classes with only non-virtual, non-abstract methods. This does not prevent them from having stubs generated for them too, if they meet the criteria.

There are more rules that get into more complex stuff, but this should serve you for 97% of everything, especially when you ask (as I often did at first), “why did this class get a mole but not a stub?”

These rules exist because stubs follow the standard CLR rules – they can only override some piece of functionality by actually overriding it via inheritance or interface implementation. For this reason, they don’t require the [HostType("Moles")] attribute on tests that use them. If you look at generated stub types, they derive from/implement the stubbed class/interface. Moles, on the other hand, throw CLR rules out the window. They rewrite the IL at runtime, so they can do pretty much anything.

If moles don’t inherit or derive from a class, how can mole instances be substituted for a class at compile time? Moles actually derive from a MoleBase in the Moles framework and are given an implicit conversion to the moled type. Moles of static methods don’t have to do this – they just derive from MoleBase, and the Moles host intercepts calls to the static method at runtime and reroutes them (the technical term is “detour”) to the mole implementation.
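The implicit conversion trick is ordinary C#, so it can be sketched without Moles. MyDependencyHolsterSketch below is a hypothetical stand-in for a generated mole type: it does not derive from MyDependency, yet it can be passed wherever a MyDependency is expected. The sketch leaves out the detouring entirely and only shows the conversion.

```csharp
using System;

public class MyDependency
{
    int m_dependencyInt;

    public MyDependency(int myInt)
    {
        m_dependencyInt = myInt;
    }

    public string MyDependencyMethod()
    {
        return "MyInt times 2 = " + m_dependencyInt * 2;
    }
}

// Hypothetical stand-in for a mole type: it wraps an instance instead of deriving from it.
public class MyDependencyHolsterSketch
{
    readonly MyDependency m_instance;

    public MyDependencyHolsterSketch(MyDependency instance)
    {
        m_instance = instance;
    }

    // This operator lets the compiler accept the holster where MyDependency is required.
    public static implicit operator MyDependency(MyDependencyHolsterSketch holster)
    {
        return holster.m_instance;
    }
}

public class Example
{
    MyDependency m_dependency;

    public Example(MyDependency dependency)
    {
        m_dependency = dependency;
    }

    public string Execute()
    {
        return "Dependency string is: " + m_dependency.MyDependencyMethod();
    }
}

public static class ImplicitConversionSketch
{
    public static void Main()
    {
        var holster = new MyDependencyHolsterSketch(new MyDependency(10));

        // The holster is converted to the wrapped MyDependency at compile time.
        Example target = new Example(holster);
        Console.WriteLine(target.Execute()); // prints "Dependency string is: MyInt times 2 = 20"
    }
}
```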

Wednesday, March 31, 2010

How To Get Moles Into Your Test Project

In my last post, I talked about what Moles can do for your testing capabilities. In this one, I’m going to give a quick overview of what you need to do to start using Moles in your project. For this post and most others, I’ll be talking from the viewpoint of Visual Studio 2008 (all of the stuff on the Pex page trumpets VS 2010, but it works fine with 2008) Team Suite, so I’ll be using the VS Team Test unit test tools as opposed to something like NUnit, although Pex and Moles come out of the box with a few switches you can flip to make them friendly to NUnit and a couple other testing frameworks.

Like most things, you’re going to need to install it first. I recommend the full Pex + Moles installer - you can grab it from here. I try to be careful with what I install and avoid installs that make a mess or require a lot of steps, and I can report that the Pex + Moles installer is simple and clean as can be.

Once the installer finishes, fire up VS with your test project. In your test project (as opposed to the project-under-test), do an Add > New Item. You should find a new template called “Moles and Stubs for Testing.” Before you fire away and click OK, there’s a small bit to understand about what adding a .moles file does and what effect the name has.

Running this template will add a .moles file and subfiles to your solution and add two references to your project: one to the newly compiled Moles.dll and another to the Moles framework types. A .moles file is a small XML file in your solution that contains configuration for a set of MSBuild tasks that also get added to your project when you add a .moles file. Most importantly, it points to the assembly that you want to mole. A .moles file only moles a single assembly, so if you want to mole types in multiple assemblies, you’ll need to add multiple .moles files. The .moles file has three subfiles when viewed in Solution Explorer:

A designer.cs file. To be honest, I have no idea what this does, as it contains nothing but a commented-out Base64 string.
A Moles.xml file. This contains generated documentation for all the mole types generated.
A Moles.dll file. This is a compiled assembly that contains the mole types. When your test project is built, this assembly is generated by MSBuild, and a reference to it is automatically added to your project. If you mole a big assembly, like mscorlib, it can add a good chunk of time to your build process, but it will only rebuild the assembly if the assembly being moled has changed since the last build.

When you add your .moles file, the name you use (minus the .moles part) will be the name of the assembly that it moles by default. It has visibility to everything in the GAC plus everything referenced by the containing project. You can change the assembly by opening the .moles file in an XML editor, but it’s easier to just name it correctly to begin with. One tweaky thing I’ve noticed is that, unlike a lot of other VS template items, you really should explicitly include “.moles” at the end of the filename, otherwise it will get confused if the targeted assembly contains dots in its name.

Just like that, you’re done. You now have access to moles and stubs for all the types in the referenced assembly. Now, what exactly does that mean? Check out the next post for information about how mole and stub types are generated.