Entity Framework Eager vs. Lazy Loading

I recently ran into an issue on NemeStats.com where an Entity Framework 6 LINQ query wasn’t fetching some related entities even though I had explicitly included them (via .Include(“EntityName”). As a result, I created a simple integration test project to demonstrate a number of basic Entity Framework common scenarios and the SQL that is emitted from these queries. My goals were as follows:

  1. Troubleshoot why in some cases related entities were being fetched properly and in other cases they were not. In particular, why is there a difference when selecting an entity as a property on an anonymous object vs. selecting the entity directly?
  2. Confirm my understanding of Lazy vs. Eager loading in Entity Framework. This ended up being related to #1
  3. Understand exactly when it is necessary to .Include() entities and when it is not.
  4. See what the actual SQL looks like that Entity Framework generates for some really commons scenarios.

The code is available on GitHub at: https://github.com/jakejgordon/Entity-Framework-Integration-Tests-and-Examples/blob/master/EF6Examples/Examples/Tests/LazyLoadingExamples.cs and is probably worth skimming through as I’ve commented each of the scenarios to show the SQL that was emitted.

For a little more context, I created a new database with the following three tables:

Person
Id – int – PK
Name – nvarchar

Pet
Id – int – PK
Name – nvarchar
OwningPersonId – int – FK to Person.Id
FavoritePetFoodBrandId – int – FK to PetFoodBrand.Id

PetFoodBrand
Id – int – PK
Name – nvarchar

Conclusion and Takeaways

If you don’t want to bother checking out the code and coming to your own conclusions, here are the things I learned in this experiment:

  • Lazy loading is on by default and can cover up some bad things if you aren’t careful. For example, pulling back a Person entity and then iterating over a list of Person.Pets will do a separate query for each Pet if you didn’t explicitly .Include(“Pet”). In many cases this is OK, but in many cases this could results in hundreds or thousands of additional queries that you may not have been expecting. Personally I prefer to turn off lazy loading so I don’t accidentally let one of these scenarios happen. I don’t see many cases in a web application where you would actually want to lazy load.
  • If you only need a single field from a table then use the LINQ query syntax to pull just that one field rather than loading the entire entity. For example, if you just want to fetch the Name of a Person you can do:


    string name = (from person in dbContext.Persons
    where person.Name == "Jane"
    select person.Name).First();

  • If you use the LINQ query syntax (i.e. from … where … select) and select an anonymous object then the included entities are NOT eagerly fetched on the projection… unless you fetch those relationships explicitly in the projection.

    This example will have Pets eagerly loaded:


    Person person = dbContext.Persons
    .Include("Pets")
    .First(p => p.Name == "Jane");

    This example will not have Pets eagerly loaded:


    Person person = (from person in dbContext.Persons
    .Include("Pets")
    where person.Name == "Jane"
    select new
    {
    Person = person
    }).First();

    This example will have Pets and FavoriteFoodBrands eagerly loaded because Pets and FavoriteFoodBrands are explicitly fetched in the project and will be populated into their according relationships of the current DbContext instance:


    var result = (from person in dbContext.Persons
    where person.Name == PERSON_JANE
    select new
    {
    Person = person,
    Pets = person.Pets,
    FavoriteFoodBrands = person.Pets.Select(p => p.FavoritePetFoodBrand)
    }).First();

    In other words, all of the below Asserts pass:
    Assert.That(result.Person, Is.Not.Null);
    Assert.That(result.Person.Pets, Is.Not.Null);
    Assert.That(result.Pets, Is.Not.Null);
    Assert.That(result.FavoriteFoodBrands, Is.Not.Null);
    Assert.That(result.Person.Pets.Select(pet => pet.FavoritePetFoodBrand).Count(), Is.GreaterThan(0));

  • Attempting to fetch an entity that was already fetched on a given DbContext instance will not bother going back to the database since it is already cached up on the DbContext. In my particular case, I didn’t notice a bug in my application because I happened to be fetching all of these entities on a different query earlier in the call stack. When I went to .Include(“MyRelatedEntity”) on a new query this didn’t actually do anything because I was selecting an anonymous object (see previous bullet) — but it happened to work anyway because all of MyRelatedEntity objects were already cached up on my DbContext. I only discovered the bug later when I attempted to include an entity that wasn’t already cached up and the result was always null.
  • It is only necessary to use .Include (e.g. .Include(“Pets”)) when you are fetching an entity and want that relationship to be pre-populated when you pull the entity back (e.g. you “select person” and want person.Pets to be populated). It is a common misconception that you need to Include related entities if you are selecting them explicitly or as part of an anonymous object. For example, if you want to eagerly (in one query) retrieve Jane and all of her Pets, use the .Include like this:


    Person person = dbContext.Persons
    .Include("Pets")
    .First(p => p.Name == "Jane");

    However, if you just want to retrieve Jane’s name and her pets’ names, you can just do this:


    var result = (from person in dbContext.Persons
    where person.Name == PERSON_JANE
    select new
    {
    PersonName = person.Name,
    PetNames = person.Pets.Select(pet => pet.Name)
    }).First();

    This comes especially in handy when you have tables with a large number of columns and only need to retrieve a small set of those columns. The SQL generated here will only select the needed columns.

Advertisements