Increase performance on query with a lot of one-to-many mappings

dapper entity-framework petapoco sql-server

Question

I'm using Entity Framework to SQL Azure and in one page of my application we show quite a bit of related data to the user. We're loading a max of 30 items on the page but each item has 5 one-to-many mappings to other objects. The query time is at a reasonable level but I'm leaking quite a bit of performance on the object mapping. (almost a full second).

Here's an example of what my object looks like

public class Task
{
    public string Name {get; set;}
    public string Status {get; set;}
    public DateTime DueDate {get; set;}
    public IEnumerable<TaskData> Data {get; set;}
    public IEnumerable<Transaction> Transactions {get; set;}
    public IEnumerable<File> Files {get; set;}
    public IEnumerable<Comment> Comments {get; set;}
    public IEnumerable<People> People {get; set;}
}

A task has a name, a status and a due date. It also has many TaskData that are custom name/value pairs, many Transactions that show a history of the task, many Files, many Comments and many People working on it.

My EF query looks something like this.

var Items = context.Items.Include(x=>x.Data).Include(x=>x.Files).Include(x=>x.Comments).Include(x=>x.People).Where(some constraint).ToList();

The relevance of a specific task is based first on the status, and then on the due date. So I've created an IComparable override to use with sort. The point is that paged queries don't work well in this scenario because the sort isn't based off of an int or a date (am I right?)

In the rest of our application we're displaying less information about each task and Linq2Entities is working just fine. The object mapping in this case is killing us though. I've gone down the road of going straight to the DB with Dapper but one-to-many mapping has it's caveats. For a few relationships I think it would work well but not for 5-6. My next thing to look at was PetaPoco but I didn't get very far before I thought I'd better throw the question on here first.

Am I crazy for trying to bring back so much data? What are my options for getting maximum performance out of this? I'll take a little bit of complexity since its only one area of the application.

Accepted Answer

I am willing to bet your EF query is pulling back too much data. The thing is, the "optimal" retrieval technique heavily depends on the type and amount of data being pulled.

Knowing that up front allows you to tune the queries you run based on your expected data set.

For example ... if you are only pulling a limited number of entities with lots of subentities the pattern I wrote here works well:

How do I map lists of nested objects with Dapper

If you know what ids you are pulling and there are less than 2000, you can shortcut it all by querying a single grid and mapping using QueryMultiple eg:

cnn.QueryMultiple(@"select * from Tasks where Id in @ids 
select * from Files where TaskId in @ids
.. etc ..", new {ids = new int[] {1,2,3}});

If you are yanking a larger set you may need to batch, or do so in phases.


For your particular example I would query Tasks to get all the task ids and data, then map on the relations using a single QueryMultiple to all the associated tables.




Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow
Is this KB legal? Yes, learn why
Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow
Is this KB legal? Yes, learn why