I've broken this down into the simplest possible case where I can see the problem occurring. Three tables, with two successive 1-to-many relationships, as shown in the image below. Entities are exactly as you would expect from the model.
http://i.imgur.com/93OKJ.png
I use the following code to pull out all the appropriate data for 'newFoo', with an ID equal to 1.
ObjectQuery<newFoo> myQuery = context.CreateObjectSet<newFoo>().Include("newBars.newBazs");
IQueryable<newFoo> myFilteredQuery = myQuery.Where(c => c.FooID == 1);
List<newFoo> myQueryList = myFilteredQuery.ToList();
I then ran SQL Profiler to capture the generated SQL query, which came out as follows:
SELECT
    [Project1].[FooID] AS [FooID],
    [Project1].[C2] AS [C1],
    [Project1].[BarID] AS [BarID],
    [Project1].[FooLinkID] AS [FooLinkID],
    [Project1].[C1] AS [C2],
    [Project1].[BazID] AS [BazID],
    [Project1].[BarLinkID] AS [BarLinkID]
FROM (
    SELECT
        [Extent1].[FooID] AS [FooID],
        [Join1].[BarID] AS [BarID],
        [Join1].[FooLinkID] AS [FooLinkID],
        [Join1].[BazID] AS [BazID],
        [Join1].[BarLinkID] AS [BarLinkID],
        CASE WHEN ([Join1].[BarID] IS NULL) THEN CAST(NULL AS int)
             WHEN ([Join1].[BazID] IS NULL) THEN CAST(NULL AS int)
             ELSE 1 END AS [C1],
        CASE WHEN ([Join1].[BarID] IS NULL) THEN CAST(NULL AS int)
             ELSE 1 END AS [C2]
    FROM [dbo].[newFoo] AS [Extent1]
    LEFT OUTER JOIN (
        SELECT
            [Extent2].[BarID] AS [BarID],
            [Extent2].[FooLinkID] AS [FooLinkID],
            [Extent3].[BazID] AS [BazID],
            [Extent3].[BarLinkID] AS [BarLinkID]
        FROM [dbo].[newBar] AS [Extent2]
        LEFT OUTER JOIN [dbo].[newBaz] AS [Extent3]
            ON [Extent2].[BarID] = [Extent3].[BarLinkID]
    ) AS [Join1]
        ON [Extent1].[FooID] = [Join1].[FooLinkID]
    WHERE 1 = [Extent1].[FooID]
) AS [Project1]
ORDER BY [Project1].[FooID] ASC, [Project1].[C2] ASC, [Project1].[BarID] ASC, [Project1].[C1] ASC
The important bit is the following snippet:
LEFT OUTER JOIN (
    SELECT
        [Extent2].[BarID] AS [BarID],
        [Extent2].[FooLinkID] AS [FooLinkID],
        [Extent3].[BazID] AS [BazID],
        [Extent3].[BarLinkID] AS [BarLinkID]
    FROM [dbo].[newBar] AS [Extent2]
    LEFT OUTER JOIN [dbo].[newBaz] AS [Extent3]
        ON [Extent2].[BarID] = [Extent3].[BarLinkID]
) AS [Join1]
    ON [Extent1].[FooID] = [Join1].[FooLinkID]
WHERE 1 = [Extent1].[FooID]
As written, this query joins the whole of newBar to the whole of newBaz first, and only THEN picks out the rows belonging to the requested newFoo. If the tables are of any reasonable size, this can cripple the performance of the query, even if the final amount of data being returned is small.
For example, consider newFoo with 50k rows, newBar with 250k and newBaz with 1000k. Given an even distribution (about 5 newBar rows per newFoo and 4 newBaz rows per newBar), selecting all related data for a single 'newFoo' row should return on the order of 20-25 rows, yet a 250k-row table is needlessly joined to a 1000k-row table every time to accomplish this!
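To make that arithmetic explicit, here is a quick back-of-the-envelope sketch. The row counts are the hypothetical ones from the example above, and it assumes rows are spread perfectly evenly (every newBar has at least one newBaz), which a real database would only approximate. The class and variable names are mine, purely for illustration.

```csharp
using System;

class RowEstimate
{
    static void Main()
    {
        // Hypothetical table sizes taken from the example above.
        const long fooRows = 50000;
        const long barRows = 250000;
        const long bazRows = 1000000;

        // Assumption: children are spread evenly across their parents.
        long barsPerFoo = barRows / fooRows;   // 5
        long bazsPerBar = bazRows / barRows;   // 4

        // Rows I expect back for a single newFoo.
        long rowsForOneFoo = barsPerFoo * bazsPerBar;

        // Rows produced by the inner newBar-to-newBaz join if it is
        // evaluated in full before the FooID filter is applied.
        long rowsInUnfilteredJoin = barRows * bazsPerBar;

        Console.WriteLine(rowsForOneFoo);        // prints 20
        Console.WriteLine(rowsInUnfilteredJoin); // prints 1000000
    }
}
```

So, under those assumptions, the inner join would produce around a million intermediate rows in order to hand back roughly twenty.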
My gut feeling is that I'm doing something wrong, but I've tried everything I can think of to avoid it. Is this behaviour intended?