If you want to write optimal queries and commands on the database using Entity Framework Core, you absolutely must know its basic mechanisms. Entity Framework Core makes it very easy for a programmer to create queries, but if you do not follow the basic rules, your queries may simply be slow, and this in turn will scare your customers away. In this article, I will present you with 5 most common beginner mistakes that you should watch out for.
Project overview
Today we will also work on the Bookstore project, which we have already worked on in previous materials (Introduction to Entity Framework Core and 5 Best Practices with Entity Framework Core). This project already has the required packages for Entity Framework Core installed.
Context:
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.Logging;
using NLog.Extensions.Logging;
using System;
namespace Bookstore
{
class ApplicationDbContext : DbContext
{
public static readonly ILoggerFactory _loggerFactory = new NLogLoggerFactory();
public DbSet<Book> Books { get; set; }
public DbSet<Category> Categories { get; set; }
protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
{
var builder = new ConfigurationBuilder()
.AddJsonFile($"appsettings.json", true, true);
var config = builder.Build();
optionsBuilder
.UseLoggerFactory(_loggerFactory)
.LogTo(Console.WriteLine, new[] { DbLoggerCategory.Database.Command.Name }, LogLevel.Information)
.EnableSensitiveDataLogging()
.UseSqlServer(config["ConnectionString"]);
}
}
}
Entities:
namespace Bookstore
{
public class Book
{
public int Id { get; set; }
public string Name { get; set; }
public decimal Price { get; set; }
public int? CategoryId { get; set; }
public Category Category { get; set; }
}
}
using System.Collections.Generic;
namespace Bookstore
{
public class Category
{
public int Id { get; set; }
public string Name { get; set; }
public ICollection<Book> Books { get; set; } = new HashSet<Book>();
}
}
Console application main class:
using Microsoft.EntityFrameworkCore;
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading.Tasks;
namespace Bookstore
{
class Program
{
static async Task Main(string[] args)
{
var stopwatch = new Stopwatch();
var books = new List<Book>();
using (var context = new ApplicationDbContext())
{
stopwatch.Start();
books = await context.Books.ToListAsync();
stopwatch.Stop();
}
Console.WriteLine($"Query execution time: {stopwatch.ElapsedMilliseconds} MS.");
}
}
}
1. N+1 Problem
First, I wanted to show you the N+1 Problem, which is connected to Lazy Loading. By default, Lazy Loading is disabled in Entity Framework Core, so to enable it, we need to make a few changes to our application.
But before that, I wanted to show you how loading related elements works. Now we will get all our books from the database and later we will want to display information about their categories. In this case, an exception will be thrown in our code, because our categories will be null.
static async Task Main(string[] args)
{
var stopwatch = new Stopwatch();
var books = new List<Book>();
using (var context = new ApplicationDbContext())
{
stopwatch.Start();
books = await context.Books.ToListAsync();
foreach (var item in books)
{
Console.WriteLine(item.Category.Name);//wyjątek
}
stopwatch.Stop();
}
Console.WriteLine($"Query execution time: {stopwatch.ElapsedMilliseconds} MS.");
}
Currently, in the database, each book has 1 category assigned to it, but if we want to refer to the category this way, an exception will be thrown because our query does not retrieve any information about related objects.
If we now enable the Lazy Loading mechanism, our code will work correctly, but we will have the N+1 Problem. First, to show you how the Lazy Loading mechanism works, we need to enable it. To do this, we need to install a package called: Microsoft.EntityFrameworkCore.Proxies via NuGet.
This is the first step. Next, in the ApplicationDbContext class in our configuration, we need to call the UseLazyLoadingProxies() method.
optionsBuilder
.UseLazyLoadingProxies()
.UseLoggerFactory(_loggerFactory)
.LogTo(Console.WriteLine, new[] { DbLoggerCategory.Database.Command.Name }, LogLevel.Information)
.EnableSensitiveDataLogging()
.UseSqlServer(config["ConnectionString"]);
And also mark all navigation properties with the keyword virtual. So the entity code will look like this:
namespace Bookstore
{
public class Book
{
public int Id { get; set; }
public string Name { get; set; }
public decimal Price { get; set; }
public int? CategoryId { get; set; }
public virtual Category Category { get; set; }
}
}
using System.Collections.Generic;
namespace Bookstore
{
public class Category
{
public int Id { get; set; }
public string Name { get; set; }
public virtual ICollection<Book> Books { get; set; } = new HashSet<Book>();
}
}
If we now run our application after enabling the Lazy Loading mechanism, you will notice that all categories will be displayed correctly.
However, as you can see, we have a problem here. Because we log all queries in the console, you can see that every time we want to display the category name, an additional select is performed on the database, which retrieves information about the category from the database, in a situation where it has not been loaded before. So we first have a query for all books and this is the query that will be called here:
books = await context.Books.ToListAsync();
And additionally we have n queries in each iteration in the loop for a new category.
foreach (var item in books)
{
Console.WriteLine(item.Category.Name);
}
So if we want to display additional information about a category here, an additional query is performed and this is the N+1 Problem. Every time we need to display a new category, an additional query is performed, so we will have n queries, that is, as many as there are iterations plus the first query for all books.
So how can we improve this query now? Instead of the Lazy Loading mechanism, we can use the Eager Loading mechanism here and get all the categories in one query. Just add a call to the Include() method in our first query and include all the categories at once.
books = await context.Books
.Include(x => x.Category)
.ToListAsync();
If you run the app now, you'll notice that only 1 query was executed. In fact, a left join was executed right away, which retrieved category information for each book. Thanks to this, in the loop below, there was no need to execute a new query each time.
Of course, the Lazy Loading mechanism can be useful in some cases, but not in this particular case. At this point, I will uninstall the package that we have just installed Microsoft.EntityFrameworkCore.Proxies. I will remove the UseLazyLoading() method call in the ApplicationDbContext configuration method and the keywords in the Book and Category entities. Fortunately, in Entity Framework Core, the Lazy Loading mechanism is blocked by default.
2. SQL Injection
The second common and very dangerous problem is SQL Injection, which, if you are not familiar with Entity Framework Core, you may sometimes have a problem with. By writing regular queries using the method syntax (as we have in our case), we are protected against SQL Injection attacks, but in Entity Framework Core we can also write our own SQLs, and here we have to be careful.
Now let's analyze another query. Instead of downloading books using the method syntax, that is:
books = await context.Books
.ToListAsync();
We will download all books from the database using pure SQL.
var name = "Book 0";
books = await context.Books
.FromSqlRaw(
$"SELECT * FROM Books WHERE Name='{name}'")
.ToListAsync();
Let's assume we have a variable name and this is the value passed to the query by the user. Usually it looks like this, we have some text field for filtering data and there the user can enter the name of the book that interests them and we later directly pass the value of this variable to our query. Below we will display the names of the books.
foreach (var item in books)
{
Console.WriteLine(item.Name);
}
Now run the application and see the result.
Everything seems to be fine, the appropriate record was returned, the query also seems to be fine, but look what is passed as the name. The value of our name variable is being pasted directly there. This means that if the user wants to conduct an SQL Injection attack, all they have to do is provide a more customized value, which will be passed to the query without any protection. All the user has to do is provide the following value:
var name = "Book 0' OR ''='";
And in fact, such a query will be executed:
SELECT * FROM Books WHERE Name='Book 0' OR ''=''
Which will always be true, therefore all records from the Books table will be returned.
Notice that this is what actually happened. The value entered by the user was pasted into the query and all records were returned from the database. Of course, such attacks can be very dangerous and we must be careful about them, because we can make all data available to unauthorized persons.
So how should this query look like? You can write it in two ways. In our case, instead of the FromSqlRaw() method, we can use the FromSqlInterpolated() method. We can then remove the apostrophes in the query, because we are using string interpolation and if we run the application now, notice that no record was returned.
var name = "Book 0' OR ''='";
books = await context.Books
.FromSqlInterpolated(
$"SELECT * FROM Books WHERE Name={name}")
.ToListAsync();
The parameter has been secured and has been correctly passed to the query. If we use FromSqlInterpolated() in the query, then we can definitely build the query by interpolating strings and then we are protected from SQL Injection attacks. If the correct value is passed, for example Book 0, then the 1 record we are looking for will be correctly returned.
var name = "Book 0";
The second way is to use FromSqlRaw() and combine it with string format (and not with string interpolation as before).
var name = "Book 0' OR ''='";
books = await context.Books
.FromSqlRaw(
"SELECT * FROM Books WHERE Name={0}", name)
.ToListAsync();
As you can see, this time the parameter was passed correctly and no record was returned from the database. However, if I had passed the name of the book correctly, the appropriate record would have been returned.
var name = "Book 0";
These are 2 good ways to pass parameters to raw SQLs in Entity Framework Core. Also remember not to directly pass all previously built SQL to the FromSqlRaw() method.
var name = "Book 0' OR ''='";
var sql = $"SELECT * FROM Books WHERE Name='{name}'";
books = await context.Books
.FromSqlRaw(sql)
.ToListAsync();
In this case, the parameters will also be passed without any protection to the query and SQL Injection attacks will be possible.
3. Ignorance of IQueryable and IEnumerable
The third mistake often made by beginners is the lack of knowledge of IQueryable and IEnumerable, which causes slower queries. If you want to use data filtering, then in this case you should rely on the IQueryable interface, which will apply all filters on the server side. If you use the IEnumerable interface, then the added filters will be performed on the collection in memory. This means that first all records will be downloaded from the database, and the filters will be performed on the entire collection in memory.
See this example. Let's assume that we want to download all books with a category with an Id of 1, so now we have 1 such record in the database. We can do this in several ways, i.e. we can assign the result to a variable of the IQueryable type:
var categoryId = 1;
IQueryable<Book> books = context.Books;)
Thanks to this, at this point this query will not be executed on the database yet, it will be executed only when these books are first referenced. So if I want to add another filtering. For example, I would like that in case the user selects a category, this filter will be included in the query.
var categoryId = 1;
IQueryable<Book> books = context.Books;
if (categoryId > 0)
books = books.Where(x => x.CategoryId == categoryId);
foreach (var item in books)
{
Console.WriteLine(item.Name);
}
So here we operate on the IQueryable interface, thanks to which all filters that are applied before referring to these books will be executed on the database, only once. See how this query looks in the console after starting the application.
Only 1 query was executed on the database already with the filter applied. If there were more filters, then all of them would be applied to the first query.
However, if instead of the IQueryable interface we were working on the IEnumerable interface, then in such a case all records would be retrieved from the database first, and only then the remaining filters would be applied and executed on the entire collection in memory.
var categoryId = 1;
IEnumerable<Book> books = context.Books;
if (categoryId > 0)
books = books.Where(x => x.CategoryId == categoryId);
foreach (var item in books)
{
Console.WriteLine(item.Name);
}
As you can see, 1 query was executed on the database without any filters, and only later the filters were applied to the collection in memory (which is why we do not see it in the console) and only 1 correct record was returned.
To sum up, if we want to build filtering in this way, i.e. spread over more lines of code, then it is worth using the IQueryable interface and filtering all the data on the server side.
If you were to execute ToListAsync() in our query, then this query will also be executed at this point on the database and any subsequent filters will be executed on the collection in memory.
var categoryId = 1;
var books = await context.Books.ToListAsync();
if (categoryId > 0)
books = books.Where(x => x.CategoryId == categoryId).ToList();
foreach (var item in books)
{
Console.WriteLine(item.Name);
}
Which does not change the fact that if we add all the filters on the first query and even use IEnumerable or List, then in such a case, of course, all the filters will be applied correctly.
var categoryId = 1;
var books = await context.Books
.Where(x => x.CategoryId == categoryId)
.ToListAsync();
foreach (var item in books)
{
Console.WriteLine(item.Name);
}
The problem only appears when these filters are broken down into subsequent statements.
4. Not using transactions when necessary
The fourth common mistake is not using transactions when necessary. Imagine a situation where you want to add an order with items. Let's say that such an order without items makes no sense. What if the order header is added, and when trying to add the item an exception is thrown? Then it will be an order without items. To protect yourself from such a situation, you should add all this data in a single transaction. If an exception is thrown when adding the item, then no changes will be saved to the database, and this is a good solution.
Let's see how this might look in our case. Let's try to add a category first, then save the changes to the database. Then let's add a book and save the changes to the database again.
using (var context = new ApplicationDbContext())
{
stopwatch.Start();
var categoria = new Category { Name = "Kat. X" };
context.Categories.Add(categoria);
await context.SaveChangesAsync();
var book = new Book { Name = "K. X", CategoryId = categoria.Id, Price = 999 };
context.Books.Add(book);
await context.SaveChangesAsync();
stopwatch.Stop();
}
Let's assume that now we are interested in a situation where a category should be added only if a book is also added. However, we have 2 separate transactions here.
This is the first transaction:
context.Categories.Add(categoria);
await context.SaveChangesAsync();
And this is the second transaction:
context.Books.Add(book);
await context.SaveChangesAsync();
So if an exception occurs in our second transaction, the first transaction will still be saved in the database.
We can check this by adding an artificial exception in the second transaction.
using (var context = new ApplicationDbContext())
{
stopwatch.Start();
var categoria = new Category { Name = "Kat. X" };
context.Categories.Add(categoria);
await context.SaveChangesAsync();
var book = new Book { Name = "K. X", CategoryId = categoria.Id, Price = 999 };
context.Books.Add(book);
throw new Exception("invalid data");
await context.SaveChangesAsync();
stopwatch.Stop();
}
As you can see, the category was added to the database, and the book (due to an exception being thrown) was not added to the database, which is unfortunately the situation we wanted to protect against.
If we want all changes to be saved, or none at all, then we have to execute the entire code in 1 transaction. Also, the SaveChangesAsync() method should be called only once after all changes have been made to the database.
using (var context = new ApplicationDbContext())
{
stopwatch.Start();
var categoria = new Category { Name = "Kat. X" };
context.Categories.Add(categoria);
var book = new Book { Name = "K. X", Category = categoria, Price = 999 };
context.Books.Add(book);
throw new Exception("invalid data");
await context.SaveChangesAsync();
stopwatch.Stop();
}
If we now run our application, this artificial error will be thrown and no changes will be saved to the database, and that's exactly what we wanted.
Only if we remove the exception, all changes will be saved to the database. The category and book will be added.
using (var context = new ApplicationDbContext())
{
stopwatch.Start();
var categoria = new Category { Name = "Kat. X" };
context.Categories.Add(categoria);
var book = new Book { Name = "K. X", Category = categoria, Price = 999 };
context.Books.Add(book);
await context.SaveChangesAsync();
stopwatch.Stop();
}
In summary, if we want to be protected against such a situation, i.e. we need to add both the category and the book, we are not interested in a scenario where only the category itself is added during an exception, then we have to do the whole thing in 1 transaction.
5. Not using data projection
The fifth and last beginner's mistake that I wanted to draw attention to in this episode is downloading unnecessary data, i.e. not using data projection.
If we write such a query, it will download all columns from the books table.
using (var context = new ApplicationDbContext())
{
stopwatch.Start();
var books = await context.Books.ToListAsync();
stopwatch.Stop();
}
In some situations, of course, such a query will be correct, because we may need all of this data, but if we only need, for example, the names of these books, it is worth using data projection in such cases, thanks to which this query will be significantly faster, especially if there is a lot of this data.
So, to retrieve the name of these books alone, it is enough to call the Select() method and indicate that we are only interested in the column with the name of these books.
using (var context = new ApplicationDbContext())
{
stopwatch.Start();
var books = await context.Books
.Select(x => x.Name)
.ToListAsync();
stopwatch.Stop();
}
Notice that now a Select has been performed on the database, which only retrieves the name of these books, thanks to which this query will be much faster than if we retrieved all the columns from this table. It is also always worth retrieving only the data that we actually need in a given case, and not forcing everything from the database.
Entity Framework Core School
If you would like to learn Entity Framework Core from the inside, how to create fast applications and efficient database queries, consider joining the Entity Framework Core School - [here].
SUMMARY
That's all the most common mistakes made by beginners related to Entity Framework Core. Pay attention to them and your queries will definitely be better then. Remember above all:
- The N+1 problem.
- SQL Injection.
- Differentiating between IQueryable and IEnumerable interfaces.
- Using transactions when needed.
- Retrieving only the data you actually need.
If you liked this article, be sure to join my community. Sign up for the free newsletter, where every week I share valuable materials, especially regarding C# and the .NET platform (free subscription - newsletter).