Tuesday, July 31, 2007

Refreshing the BRE Facts cache

In my first post I talked a little bit about the FactRetriever object and how it works in concert with a BRE policy. Use of a FactRetriever can help you manage an in-memory cache of a table so that the BRE doesn't have to constantly hit a database for information.

After much debugging, stress, and some research on the part of a coworker, I discovered a crucial bit about how the BRE operates: If a fact (such as a DataTable within a DataSet, or an XML document) is asserted into memory and you wish to refresh that cached version, you must explicitly tell the BRE to retract the fact from memory before asserting it again. Failure to do this will lead to some really strange behavior that will have you debugging in circles.

Depending on your environment and the type of fact you have asserted, there are a couple of ways of doing this. You can call RuleEngine.Clear(), but this halts all BRE execution, cancels all rule firings, and removes every fact from the BRE's memory. I'm sure there's a situation where this is desirable, but the odds are that it's not yours.

What you're probably looking for is RuleEngine.Retract() or RuleEngine.RetractByType(). I believe that in the case of a DataTable or DataConnection, these two methods accomplish the same thing, since a DataTable or DataConnection stored in the BRE memory is unique and there can only be one instance of it. Since our goal was to retract a DataTable, we didn't do a lot of research into retracting other types of facts.

Using RetractByType() took a little while to figure out: we couldn't find any documentation for it anywhere, and there is no extended Intellisense documentation either. It took a little experimentation and some creativity to figure out how to do what we wanted. RetractByType() takes a single argument, a FactType object. FactType is abstract, but has a couple of descendants, one of them being DataRowType. DataRowType's constructor takes two strings: the name of a table, and the name of a dataset. These values are what define a "table type" within the BRE: If you have a table with the TableName property set to "ABC" inside a DataSet with the DataSetName property set to "XYZ", that's the only "ABC" "XYZ" table that can be in the BRE's memory at one time. The BRE keys off of those name values when you make references to the table within rules or vocabularies.

In order to create a type identifier so that the BRE can retract your fact for you, all you have to do is create a new DataRowType with the table name and DataSet name correctly specified, and hand that object to RetractByType. The following is an example - this is the first code that runs once the FactRetriever has determined that the cache does indeed need to be refreshed:

if (null != factsHandleIn)
{
    engine.RetractByType(new DataRowType("MyTableName", "MyDataSetName"));
}

I check whether factsHandleIn is null because my FactRetriever returns a value as factsHandleOut that gets handed in again as factsHandleIn on the next call. If factsHandleIn is null, this is the first time the FactRetriever has run since its host instance restarted, so there's nothing in memory to retract anyway.

Combined with some logic that determines when the cache should be refreshed, this control over retracting and asserting facts lets you keep an efficient cache in the BRE's memory. Our FactRetriever uses DateTimes and values in the registry to determine if the cache needs to be refreshed. It can refresh automatically on an interval (each refresh records the time; if the policy runs again and at least X seconds have passed since the last refresh, it refreshes again) or by a user override (a user who has edited the table can set a value in the registry to DateTime.Now, and the next time the policy runs it will see a newer value than the last one stored in that registry node and will refresh).
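The refresh decision described above can be sketched as a small pure function. This is only an illustration, not our actual FactRetriever code: the class name CacheRefreshPolicy, the method ShouldRefresh, and its parameters are all made up for the example, and reading the two timestamps out of the registry is left out entirely.

```csharp
using System;

// Hypothetical helper (not part of BizTalk): decides whether the cached
// table should be retracted and re-asserted on this policy run.
public static class CacheRefreshPolicy
{
    // lastRefresh:   when the cache was last rebuilt (null = never).
    // now:           the current time, passed in so the logic is testable.
    // interval:      the "X seconds" between automatic refreshes.
    // overrideStamp: the time a user wrote to request a refresh (null = none).
    public static bool ShouldRefresh(DateTime? lastRefresh, DateTime now,
                                     TimeSpan interval, DateTime? overrideStamp)
    {
        // Nothing cached yet, so a load is always needed.
        if (lastRefresh == null) return true;

        // Automatic refresh: the interval has elapsed since the last load.
        if (now - lastRefresh.Value >= interval) return true;

        // User override: someone stamped a time newer than the last load.
        if (overrideStamp != null && overrideStamp.Value > lastRefresh.Value) return true;

        return false;
    }
}
```

Passing the current time in as a parameter, rather than calling DateTime.Now inside the method, makes the logic easy to exercise without waiting for a real interval to pass.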

Tuesday, July 24, 2007

How to examine your messages in BizTalk

This tip is more for the BizTalk newbies than the seasoned veterans (a group that I don't include myself in, by the way). If you need to get a good look at a message in BizTalk that's hit the MessageBox, including all of its context properties, simply create a Send Port that subscribes to the message and set it to the "Stopped" state (not "Unenlisted").

BizTalk 2006 send ports have message queues attached to them. The filters that subscribe the send port to messages in the MessageBox actually subscribe the queue to those messages, and the send port then processes messages off of the queue. If you unenlist a send port, its message queue gets shut down and any subscription information becomes invisible to BizTalk. However, if you only stop the send port, the queue and its subscription remain active. Any messages that reach the queue become suspended and will automatically be resubmitted and subsequently processed if the send port is turned back on.

So, subscribe to your message, stop (not unenlist) the send port, run a message through, and then you'll be able to find your message in the suspended instances view. From there, you can examine the message and its properties in detail.

Monday, July 23, 2007

Unicode and BizTalk

The BizTalk 2006 disassemblers will choke on UTF-16 files that don't have byte order marks. What makes this behavior really strange, especially for users of BizTalk 2002, is that BizTalk 2002 accepted them with or without the marks. It's my understanding that this behavior in 2002 is actually a bug.

If you're trying to figure out why your files aren't jibing with BizTalk, take a look at the encoding and see if that FF FE (or FE FF) is present at the beginning of your files. A custom component in your receive pipeline placed before the disassembler that slips these two bytes in can easily correct the problem.
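The byte-level part of that fix looks roughly like the sketch below. It covers only the check and the prepend on a byte array. The class and method names (Utf16Bom, HasBom, EnsureBom) are invented for this example, and the caller is assumed to already know whether the file is little-endian or big-endian. Wiring it into a real pipeline component, which works with the message's stream, is not shown.

```csharp
using System;

// Hypothetical helper (not part of BizTalk): detects and adds a UTF-16
// byte order mark on a raw byte buffer.
public static class Utf16Bom
{
    // True when the buffer starts with FF FE (little-endian)
    // or FE FF (big-endian).
    public static bool HasBom(byte[] data)
    {
        return data.Length >= 2 &&
               ((data[0] == 0xFF && data[1] == 0xFE) ||
                (data[0] == 0xFE && data[1] == 0xFF));
    }

    // Returns the buffer unchanged if it already has a mark; otherwise
    // returns a new buffer with the two mark bytes placed in front.
    // The byte order is an assumption supplied by the caller.
    public static byte[] EnsureBom(byte[] data, bool littleEndian)
    {
        if (HasBom(data)) return data;

        byte[] result = new byte[data.Length + 2];
        result[0] = littleEndian ? (byte)0xFF : (byte)0xFE;
        result[1] = littleEndian ? (byte)0xFE : (byte)0xFF;
        Buffer.BlockCopy(data, 0, result, 2, data.Length);
        return result;
    }
}
```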

Debugging pipeline components in Visual Studio

I am not a Jedi of the Visual Studio Debugger, so I thought this was pretty magical when I found out how to do it.

To step through the code of a custom pipeline component in real time in the Visual Studio debugger, simply do the following:
  • Compile the component as Debug. Make sure you've got breakpoints appropriately set.
  • Place both the DLL and the PDB file in the Pipeline Components folder (as with all pipeline components, you don't need to GAC anything).
  • Configure a receive location/send port with a pipeline that uses the custom component.
  • Open the component solution in Visual Studio, and in the Debug menu, select Attach to Process. Attach to the process that represents the host instance your port is running on (BizTalk services are named BTSNTSvc.exe. I don't know of any way to identify which one represents the host instance you want, if you have more than one running).
  • Once attached, trigger the component by running a file through BizTalk.
As soon as the component loads, control will transfer to the debugger. You are now controlling the execution of the component in real time. This is a godsend when trying to figure out exactly how BizTalk is interacting with your custom component.

Why is Load being called twice on my pipeline component?

If you write a custom pipeline component and follow its execution through the debugger, you may be surprised to find that sometimes (depending on the component's configuration), the Load method may be called twice. This is confusing behavior at first until you start looking at the values that are being pulled out of the PropertyBag that's handed in to Load.

If the design-time configuration of your pipeline (the configuration items you can set on the component within the pipeline from the Send Port/Receive Location configuration dialog) consists of only default or only non-default values, Load will only be called once. If you have more than one design-time property and some are set to default values while others have non-default values, Load will be called twice. Each call will have a distinct PropertyBag - the first will contain the default values, and the second will contain the non-default values.

Just to clarify, the "default value" is the one you provide in Visual Studio in the Properties pane when you place the component into a pipeline. When setting values at design-time, default values show up as normal text while non-default values will appear in bold.

Here's the part that can trip up your code if you don't plan for it: The PropertyBag that contains the default values contains nulls for all the non-default values, and vice versa. If your component contains more than one design-time property, make sure that it doesn't choke on nulls, as you are guaranteed to get some. A simple way to do this is to give the component object a Dictionary member or some similar type of object that can contain the design-time property values. In your Load method, when a property comes back null, check whether the Dictionary already holds a value for it from the earlier call, and keep that value instead of overwriting it.
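A minimal sketch of that Dictionary approach follows. DesignTimeSettings, Merge, and Get are names invented for the example, and the plain object values stand in for whatever your Load method reads out of the PropertyBag. The only rule it implements is that a null never overwrites a value stored by an earlier Load call.

```csharp
using System.Collections.Generic;

// Hypothetical holder for design-time property values that survives
// two separate Load calls, each carrying only part of the configuration.
public class DesignTimeSettings
{
    private readonly Dictionary<string, object> values =
        new Dictionary<string, object>();

    // Call once per property from each Load. A null means "this bag
    // doesn't carry that property", so the stored value is left alone.
    public void Merge(string name, object valueFromBag)
    {
        if (valueFromBag != null)
        {
            values[name] = valueFromBag;
        }
    }

    // Returns the stored value, or null if no Load call supplied one.
    public object Get(string name)
    {
        object value;
        return values.TryGetValue(name, out value) ? value : null;
    }
}
```

With this in place, the order of the two Load calls doesn't matter: whichever bag carries a real value for a property wins, and the null from the other bag is ignored.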

Oh, and speaking of default values for pipeline components, you can reset all the values to defaults if you simply select a different pipeline for the Send Port/Receive Location, and then select the original pipeline again. This is the only way I know of resetting the values.

What are FactsHandleIn and FactsHandleOut?

Execution within the BizTalk 2006 BRE follows an interesting paradigm. It took me a while to sort it all out as I'm not familiar at all with rules-based programming or anything of the sort. One of the things that tripped me up the longest was the notion of a FactRetriever.

A FactRetriever is a class that attaches to a policy and is in charge of making sure that when the policy runs, it has all of the facts that it needs in memory. Use of a FactRetriever allows fine control over when and how things are done - instead of just letting the BRE connect to a database, you can craft DataSets however you'd like (build it in memory, call a stored procedure, etc.) and assert those into memory as facts.

The heart of the FactRetriever is the UpdateFacts method, which is called every time the policy it is attached to is run. One of its parameters is object FactsHandleIn, which has a name that doesn't make it too obvious as to what it does. Additionally, UpdateFacts returns object FactsHandleOut. These two objects give you a lot of flexibility in how you'd like to control fact retrieval.

It's this simple: The first time UpdateFacts is called for a policy, FactsHandleIn is null. Whatever you choose to return as FactsHandleOut is passed in as FactsHandleIn when the policy is called again. If the service/host instance/machine/etc. is ever restarted, a null will be passed in again on the first call.

This doesn't sound all that great when you first hear it, but it's a nice way of letting the BRE persist anything you want in memory to be used as a "hint" to the FactRetriever that you want to do something. For example, a great use of the BRE is as a local, in-memory cache of a DataSet. You can control the refreshing of this cache by passing a DateTime around as FactsHandleIn/Out, and using some custom logic in UpdateFacts to determine whether or not the data should be retrieved and re-asserted into memory.
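To make the handle round trip concrete, here is a stand-in that mimics only the handle-in/handle-out contract. It is not a real FactRetriever: HandleDemo is a made-up class, its UpdateFacts takes the current time as an extra parameter so the example can run on its own, and the counter marks the spot where a real retriever would fetch the data and assert it into the engine.

```csharp
using System;

// Hypothetical stand-in that mirrors only the handle contract:
// null on the first call, then whatever was returned last time.
public class HandleDemo
{
    // Counts how often the data would have been reloaded.
    public int LoadCount;

    private readonly TimeSpan maxAge;

    public HandleDemo(TimeSpan maxAge)
    {
        this.maxAge = maxAge;
    }

    public object UpdateFacts(object factsHandleIn, DateTime now)
    {
        // A non-null handle is the DateTime of the last load. If the
        // cache is still young enough, hand the same handle back.
        if (factsHandleIn is DateTime && now - (DateTime)factsHandleIn < maxAge)
        {
            return factsHandleIn;
        }

        // First call (null handle) or stale cache: a real retriever
        // would retrieve the data and re-assert it here.
        LoadCount++;
        return now;
    }
}
```

Returning the old handle unchanged when nothing was reloaded keeps the timestamp tied to the last actual load rather than the last policy run.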

You can use FactsHandleIn/Out for anything. A DateTime is a common choice, but I imagine that you could think up a great use for just about any kind of object as a FactsHandle. This is one of those places where I'm sure someday I'll run across a really ingenious use of some strange object to perform a cool task.