Entity Framework Eager vs. Lazy Loading

I recently ran into an issue on NemeStats.com where an Entity Framework 6 LINQ query wasn’t fetching some related entities even though I had explicitly included them (via .Include(“EntityName”). As a result, I created a simple integration test project to demonstrate a number of basic Entity Framework common scenarios and the SQL that is emitted from these queries. My goals were as follows:

  1. Troubleshoot why in some cases related entities were being fetched properly and in other cases they were not. In particular, why is there a difference when selecting an entity as a property on an anonymous object vs. selecting the entity directly?
  2. Confirm my understanding of Lazy vs. Eager loading in Entity Framework. This ended up being related to #1
  3. Understand exactly when it is necessary to .Include() entities and when it is not.
  4. See what the actual SQL looks like that Entity Framework generates for some really commons scenarios.

The code is available on GitHub at: https://github.com/jakejgordon/Entity-Framework-Integration-Tests-and-Examples/blob/master/EF6Examples/Examples/Tests/LazyLoadingExamples.cs and is probably worth skimming through as I’ve commented each of the scenarios to show the SQL that was emitted.

For a little more context, I created a new database with the following three tables:

Person
Id – int – PK
Name – nvarchar

Pet
Id – int – PK
Name – nvarchar
OwningPersonId – int – FK to Person.Id
FavoritePetFoodBrandId – int – FK to PetFoodBrand.Id

PetFoodBrand
Id – int – PK
Name – nvarchar

Conclusion and Takeaways

If you don’t want to bother checking out the code and coming to your own conclusions, here are the things I learned in this experiment:

  • Lazy loading is on by default and can cover up some bad things if you aren’t careful. For example, pulling back a Person entity and then iterating over a list of Person.Pets will do a separate query for each Pet if you didn’t explicitly .Include(“Pet”). In many cases this is OK, but in many cases this could results in hundreds or thousands of additional queries that you may not have been expecting. Personally I prefer to turn off lazy loading so I don’t accidentally let one of these scenarios happen. I don’t see many cases in a web application where you would actually want to lazy load.
  • If you only need a single field from a table then use the LINQ query syntax to pull just that one field rather than loading the entire entity. For example, if you just want to fetch the Name of a Person you can do:


    string name = (from person in dbContext.Persons
    where person.Name == "Jane"
    select person.Name).First();

  • If you use the LINQ query syntax (i.e. from … where … select) and select an anonymous object then the included entities are NOT eagerly fetched on the projection… unless you fetch those relationships explicitly in the projection.

    This example will have Pets eagerly loaded:


    Person person = dbContext.Persons
    .Include("Pets")
    .First(p => p.Name == "Jane");

    This example will not have Pets eagerly loaded:


    Person person = (from person in dbContext.Persons
    .Include("Pets")
    where person.Name == "Jane"
    select new
    {
    Person = person
    }).First();

    This example will have Pets and FavoriteFoodBrands eagerly loaded because Pets and FavoriteFoodBrands are explicitly fetched in the project and will be populated into their according relationships of the current DbContext instance:


    var result = (from person in dbContext.Persons
    where person.Name == PERSON_JANE
    select new
    {
    Person = person,
    Pets = person.Pets,
    FavoriteFoodBrands = person.Pets.Select(p => p.FavoritePetFoodBrand)
    }).First();

    In other words, all of the below Asserts pass:
    Assert.That(result.Person, Is.Not.Null);
    Assert.That(result.Person.Pets, Is.Not.Null);
    Assert.That(result.Pets, Is.Not.Null);
    Assert.That(result.FavoriteFoodBrands, Is.Not.Null);
    Assert.That(result.Person.Pets.Select(pet => pet.FavoritePetFoodBrand).Count(), Is.GreaterThan(0));

  • Attempting to fetch an entity that was already fetched on a given DbContext instance will not bother going back to the database since it is already cached up on the DbContext. In my particular case, I didn’t notice a bug in my application because I happened to be fetching all of these entities on a different query earlier in the call stack. When I went to .Include(“MyRelatedEntity”) on a new query this didn’t actually do anything because I was selecting an anonymous object (see previous bullet) — but it happened to work anyway because all of MyRelatedEntity objects were already cached up on my DbContext. I only discovered the bug later when I attempted to include an entity that wasn’t already cached up and the result was always null.
  • It is only necessary to use .Include (e.g. .Include(“Pets”)) when you are fetching an entity and want that relationship to be pre-populated when you pull the entity back (e.g. you “select person” and want person.Pets to be populated). It is a common misconception that you need to Include related entities if you are selecting them explicitly or as part of an anonymous object. For example, if you want to eagerly (in one query) retrieve Jane and all of her Pets, use the .Include like this:


    Person person = dbContext.Persons
    .Include("Pets")
    .First(p => p.Name == "Jane");

    However, if you just want to retrieve Jane’s name and her pets’ names, you can just do this:


    var result = (from person in dbContext.Persons
    where person.Name == PERSON_JANE
    select new
    {
    PersonName = person.Name,
    PetNames = person.Pets.Select(pet => pet.Name)
    }).First();

    This comes especially in handy when you have tables with a large number of columns and only need to retrieve a small set of those columns. The SQL generated here will only select the needed columns.

Advertisements

6 comments

  1. Nice exploration. i love doing experiements like thist. Just to add a little “why” to your explanation of the behavior, Include will just never work with a projection even if the projection is returning an entity. When it tries to resolve theh Include, it only knows you are doing something outside of the model. It doesn’t look inside of the projected type to see if you happen to be returning any entities. Best resolution there (when you’re doing a real projection .. not just your exploratory example) is to just add the navigation property as one of the projection properties.

    WRT the ignored lazy loading, the change tracker keeps an “IsLoaded” property on each change tracking object for your entities. If you Include something, it’s IsLoaded will be true. Same for lazy or eager loaded data. Lazy Loading will check that value. If it’s “IsLoaded” then it won’t bother lazy loading the data “again” even if you want to refresh. Alternatively, if the related data is in memory by some other means (a straight query or a projection), IsLoaded won’t get changed to true. So even if you already haev the data, Lazy Loading will go get it again.

    HTH
    Great post.
    More explanation of some of the why’s on this behavior here: http://msdn.microsoft.com/en-us/magazine/d34d1b60-e980-4bd7-9e6d-847b9b73f597

    • Your post on the topic is excellent. I can’t believe I didn’t stumble upon this up to this point. If I google “entity framework eager vs. lazy loading” (which is what I searched in the past) your post doesn’t show up — but as far as I’m concerned your article is the authoritative explanation. Thanks for sharing.

  2. Your projection that didn’t ‘eager load’ pets was because you didn’t put it in your projection. For example, this would eager load the pets:

    Person person = (from person in dbContext.Persons
    .Include(“Pets”)
    where person.Name == “Jane”
    select new
    {
    Id = person.Id,
    Pets = person.Pets
    }).First();

  3. Ha! Shawn says in code what I said in words. Except I wouldn’t bother with the Include method. It doesn’t do anything in this case but at least it won’t throw. Should give the same results.

    Person person = (from person in dbContext.Persons
    where person.Name == “Jane”
    select new
    {
    Id = person.Id,
    Pets = person.Pets
    }).First();

    • Yes, this makes sense now. My original assumption (before testing this out) was that using dbContext.Persons
      .Include(“Pets”) would cause all Person entities to have their Pets populated, even if I was only fetching a Person in the projection. It looks like when you using projections you have to flatten it out. I just pushed up new code to add another relationship to Pet so you now have Pet.FavoriteFoodBrand so I could illustrate this in greater detail. I will update the blog post to show this example. This was news to me and a very interesting discovery. Thanks to both of you for pointing it out.

  4. Pingback: Eager Loading Of Related Entities in LLBLGen Pro | Musings of a Software Development Manager


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s