Wednesday, June 23, 2010

Cascading Deletes in EF4

Something I was curious about in Entity Framework was how to delete a whole graph of entities by deleting its root. In the DB world, doing this automatically is known as "cascading deletes", and in SQL Server it's something you can specify on a foreign key relationship. When you specify the cascade delete action, when the "parent" row is deleted from the database, dependent rows are silently deleted as well. This avoids running into foreign key constraints when you are deleting rows, but it can also result in deleting more data than you thought you were, so cascading deletes should be used carefully.

Anyways, I was curious as to how this idea was represented in EF, so I fired up a little demo app to try a few things out. After some experimentation, and consulting this excellent blog post, I think I've got it figured out:
  • Specifying cascading deletes on the database is entirely separate from specifying them on the client. By far the safest way to do things if you want cascading deletes is to specify them on the database. If you don't, some odd things can happen (keep reading).
  • To specify cascading deletes on the client, select the relationship in your model. In the properties window, for the end that represents the "1" multiplicity, set OnDelete to Cascade. This tells EF to issue delete requests for dependent entities that have been loaded into memory before deleting the parent.
  • Here's where things get interesting: Say I have a Student entity in memory, and the FK_StudentGrade_Student relationship is set up for cascading deletes on the client, but not on the DB. If I mark the entity for deletion and SaveChanges(), what happens? EF will issue a delete command for the Student after issuing delete requests for all StudentGrades that are loaded. EF does not retrieve all the dependent entities prior to deletion to learn about them so it can issue delete requests for them. If you don't have all the dependent entities loaded, you'll get an FK violation when EF tries to delete the parent.
  • What if you have cascade set up on both the DB and the client? EF will still issue delete requests for the dependents it knows about, which is redundant but harmless. If some dependents weren't loaded, the cascade rule on the DB will take care of them.
  • What if you don't have cascade set up on the client? It depends on whether the parent entity you are trying to delete has any dependents loaded. If it does, EF will stop you with an exception before it even issues the delete request to the DB, because it sees that you are violating a rule of the conceptual model. If the parent doesn't have any dependents loaded, EF will go ahead and issue the delete request. From there, the behavior is determined by whether or not any dependent entities exist in the DB and whether or not you have cascading deletes specified on the server. If the answers are "yes" and "no," respectively, you'll get an FK violation.

Highlights of the Entity Framework's "Working With Objects" Documentation

I'm trying to get my head around Entity Framework 4 and the official MSDN documentation is pretty helpful in explaining the high level concepts. In particular, the "Querying a Conceptual Model" and "Working With Objects" sections are very useful when it comes to learning about how to actually get code that uses EF4 up and running.

I read through "Working With Objects" last night and this is a quick bullet-list of the highlights.

Defining and Managing Relationships
  • If you include foreign key columns in the model, you can create/change relationships by manipulating the foreign key, or by manipulating the navigation properties. This is called a foreign key association.
  • If you don't include foreign key columns, relationships are managed as independent objects in memory. This is called an independent association. With these, the only way to change/create relationships is by manipulating the navigation properties.
  • When a primary key of an entity is part of the primary key of a dependent entity, the relationship is an identifying relationship - the dependent entity cannot exist without the principal entity. In this case, deleting the primary entity also deletes the dependent entity, as does removing the relationship.
  • In a non-identifying relationship, if the model is based on foreign key associations, deleting the principal objects will set foreign keys of dependents to null if they are nullable. If they are not, you must manually delete the dependent objects or assign a new parent, or you can specify in the model to automatically delete the dependent entities.
Creating, Adding, Modifying, and Deleting Objects
  • The CreateObjectName static method on an entity type is created by the EDM tools when the types are generated. The parameters for this object include the columns that cannot be null, and nothing else.
  • If the primary key column is exposed and not nullable (and thus exposed in the signature to CreateObjectName), you can just assign the default value, since the ID used prior to SaveChanges is always temporary.
  • When using POCOs, use CreateObject instead of new.
Attaching and Detaching Objects
  • Detaching objects can help keep the memory requirements of the object context in check. If you execute a query with MergeOption.NoTracking, the returned entities are never attached.
  • Consider creating a new ObjectContext if the scope of data has changed (like if you are displaying a new form) instead of detaching objects from ObjectContext.
Identity Resolution, State Management, and Change Tracking
  • If you are working with POCOs without change-tracking proxies, you must call DetectChanges so the EF can figure out what changes have been made.
  • To find out if the value of a property has changed between calls to SaveChanges, query the collection of changed property names returned by the GetModifiedProperties method.
Saving Changes and Managing Concurrency
  • By default, EF uses "optimistic concurrency," meaning that locks are not held while your app has data from the data source, and object changes are saved to the database without checking for concurrent modifications. If an entity type has a high degree of concurrency, you can set ConcurrencyMode to Fixed, which will cause EF to check for changes in the DB before saving. Conflicting changes will throw an OptimisticConcurrencyException.
  • You can handle OptimisticConcurrencyExceptions by calling Refresh with your RefreshMode of choice.
  • In high concurrency scenarios, call Refresh frequently. The RefreshMode controls how changes are propagated: StoreWins causes EF to overwrite all data in the cache with DB values. ClientWins replaces original values only in the cache with DB values.
  • Call Refresh after calling SaveChanges if your updates may modify data that belongs to other objects. For example, if you have a trigger that fires when you update a row in a table, calling Refresh after saving changes to that table is a good idea.
Binding Objects to Controls
  • (From http://social.msdn.microsoft.com/Forums/en/adonetefx/thread/c76d72c0-951c-4033-b75f-fc84f735826e): EntityCollection represents a collection of entities related to another entity (a many relationship). ObjectSet is not actually a collection type: it provides access to entities that belong to an entityset in the DB. LINQ on an EntityCollection is LINQ to Objects. LINQ on an ObjectSet uses LINQ to Entities (results in DB query).
  • To bind entities directly to a control, set the DataSource property of a control to an EntityCollection or to the ObjectResult you get when you call Execute on an ObjectSet or ObjectQuery. In WPF, you can set a DataContext to an EntityCollection or ObjectResult.
  • Don't bind directly to an ObjectQuery, bind to the result of Execute.
  • When working with LINQ, cast the result of your query to an ObjectQuery and you can call Execute on it.
  • To refresh the bound control, call Execute on the query again. This will bind the control to a new ObjectResult.
  • Calling OfType on an EntityCollection returns an IEnumerable, which cannot be bound to a control. Instead of OfType, use CreateSourceQuery to get the ObjectQuery that defines the base EntityCollection. You can then bind a control to the execution of the ObjectQuery that is returned by OfType on the ObjectQuery.
Serializing Objects
  • When serializing entities, if lazy loading is not disabled, it will be triggered, possibly causing your object graph to become very large.
  • Binary and WCF datacontract serialization serialize related objects. XML serialization does not.