Compiled or Bust?

Well, that’s what the article is saying you should do: compile the query and reuse it… but LINQ doesn’t do this for you at application compile time. At runtime, LINQ translates every query to SQL on every execution, unless you say otherwise. It seems to be a minor performance hit compared to the database round trip anyway, but it’s interesting to know.
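For anyone who hasn’t seen the “say otherwise” part, here is a minimal sketch using CompiledQuery.Compile (the DataContext and table names are made up for illustration):

```csharp
using System;
using System.Data.Linq;
using System.Linq;

static class CustomerQueries
{
    // Compile() walks the expression tree and translates it to SQL once;
    // the returned delegate just substitutes parameters on each call.
    public static readonly Func<MyDataContext, string, IQueryable<Customer>>
        ByCity = CompiledQuery.Compile(
            (MyDataContext db, string city) =>
                db.Customers.Where(c => c.City == city));
}

// Usage: var londoners = CustomerQueries.ByCity(db, "London").ToList();
```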

What worries me about LINQ in general is that it seems to promote lazy database design and usage, and this can hurt performance as well as stability.

@Jasmine: I don’t think that LINQ or EF promote lazy database design at all! In fact, having a poorly designed database will probably cause more problems with these ORMs than if you were using straight SQL. Either way, your database should use primary keys, foreign keys, indexes, etc.

The benefit of these ORMs is that you don’t have to write the rudimentary database access code, and you get to use LINQ queries right away.

Could you elaborate on how they promote lazy design?

+1 to anyone who commented that LINQ to SQL creates parameterized SQL queries that are reused efficiently at the database.
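For illustration, this is the shape of command LINQ to SQL actually sends (table and column names hypothetical); because the value arrives as @p0 rather than as inlined text, SQL Server can cache and reuse a single execution plan:

```csharp
// A simple query...
var posts = db.Posts.Where(p => p.AuthorId == authorId).ToList();

// ...and roughly what goes over the wire:
//
// SELECT [t0].[Id], [t0].[AuthorId], [t0].[Title]
// FROM [dbo].[Posts] AS [t0]
// WHERE [t0].[AuthorId] = @p0
```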

Compiled queries aren’t very flexible because they don’t support dynamic composition, which is arguably one of the best features of LINQ (and LINQ to SQL as well). The shape of a compiled query is static, and you can only perform parameter substitution when it is executed - you can’t change its shape at runtime.

This is fine for your basic SingleOrDefault() or Where() queries, but
the vast majority of queries I need to write require some form of dynamic composition (add a filter when these parameters are specified, change the projection when this parameter is specified, etc).
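Something like this sketch (names hypothetical), which has no single static shape for a compiled query to capture:

```csharp
IQueryable<Post> query = db.Posts;

// Each branch reshapes the expression tree before it is ever translated.
if (authorId.HasValue)
    query = query.Where(p => p.AuthorId == authorId.Value);

if (since.HasValue)
    query = query.Where(p => p.CreatedAt >= since.Value);

var results = query.OrderByDescending(p => p.CreatedAt).ToList();
```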

While you could attempt to factor a dynamically composed query into multiple static compiled queries, I return to the maxim: if it ain’t broke, don’t fix it!

Compiling queries just for the sake of it is premature optimization at its worst, and it shouldn’t be done without hard proof from a profiler and an actual performance problem to solve.

@Skippyfire

You are quite correct. A compiled query should be cached at the application level and reused between requests.
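The usual pattern is a static field, since statics live for the lifetime of the AppDomain and are shared across requests (context and table names hypothetical):

```csharp
public static class PostQueries
{
    // Compiled once when the type is initialized, reused by every request.
    public static readonly Func<SiteDataContext, int, Post> ById =
        CompiledQuery.Compile(
            (SiteDataContext db, int id) =>
                db.Posts.SingleOrDefault(p => p.Id == id));
}
```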

Jeff’s statement “But then I realized it’s probably more realistic to make sure that any page we generate isn’t doing 500 freakin’ thousand queries!” can be misleading. I expect what Jeff means is that with few queries per page, and a small overhead per non-compiled query, the extra response time for that page won’t add up to much. Which makes sense.

But it is easy to misread what Jeff said to mean that he thinks compiled queries are only cached for the duration of a request, such that if you don’t have many queries per request, the compiled ones won’t be reused much and compilation might actually hurt performance. That reading would be wrong: compiled queries are cached at the application level, so if multiple requests hit the same page and call the same query, the compiled one is reused (except the first time the page is called) and performance improves.

With 1.5 million hits per day on Stack Overflow, a compiled query cached at application level should find tremendous reuse and possibly shave a reasonable chunk of load from the server.

At the same time, each page might still load only marginally quicker, which is what Jeff observes. In other words, it is the type of optimization that, in an already well-balanced system, may do more to lower the electricity bill for the server owner than to make a significant impact on response times for end users.

LOL. Looks like you did a sufficient job of inspiring one hell of a micro-optimization theatre in the comments with this thread.

I completely agree that LinqToSQL is a very convenient and useful tool. Since I learned it, I have had only one instance where I’d use traditional sprocs over it, and that case required an extensive read/process/write step to complete in no more than 100ms (obviously not web stuff).

If there is no better example, the comment thread of this post poses a perfect case where the word scalability has become the next gen ::cringe::paradigm::/cringe:: “web 2.0” of software development.

Meaning that, any jackass who can write a loop and store a date/time value will be partaking in creating comprehensive statistical explanations of why solution x is faster than solution y. Welcome back to bike shed painting 101. Disclaimer: I can’t claim complete innocence, I have painted that bikeshed my fair share of times too.

I think the real concept of scalability represents two things.

  1. Raw performance
    If the number of executions increases at a linear rate, is the processing-time growth linear, exponential, or logarithmic? I’d consider logarithmic growth = scalable and optimized, linear growth = scalable, and exponential growth = not scalable.
    Ex. If your site grows 10x in popularity, is it going to need 5x, 10x, or 100x the servers to keep up with demand?
  2. Application Domain
    You put the difference into perspective perfectly. “Let’s do the math, even though I suck at it. Each uncompiled run of the query took less than one third of a millisecond longer.” Once again, you prove that you’re smarter than the average code monkey and that’s why I like your blog. I can actually feel myself not grow dumber the longer I read your material. :stuck_out_tongue:

Unfortunately, scaling has hit the “scene” and all the code monkeys have their panties in a bunch. There’s masses of really bad/incorrect examples going around about optimization. It’s gonna take a colossal amount of panty un-twisting to fix it.

For websites specifically, I’d specify the domain range as 7 seconds. You have 7 seconds to load everything before the tip of the ADHD-afflicted masses starts to flee in droves, followed by average people (15 sec), and finally the brutally patient (whatever the timeout rate is). If your page can’t load in less than 7 seconds, you’re doing it wrong. There are much better optimizations that can occur in this scope (eliminating file requests to the server, or spreading them across multiple servers ::cringe::) than trimming a few ms off a DB query.

My .02 on perf.

@Ric Johnson
Although most people don’t know it, Linq is about more than a SQL ORM. Linq can be performed on XML, collections, etc. For anybody who has used Linq extensively, it’s pretty obvious that the functionality doesn’t do a good job of supporting changes in the database model.
I think they’re also moving away from LinqToSql because it only supports MsSql. What about all of the other database systems out there? LinqToOracle, LinqToMySql, LinqToPostgre, etc. The name and application itself isn’t general enough to cover all the systems that people will eventually expect of it. Neither Linq nor LinqToSql is going anywhere. Linq will be around forever, and LinqToSql will sit quietly in the .Net framework and do what it does best. I think their development emphasis will just be directed more toward Entities for the ORM part and LinqToEntities for the querying part.

Question for the masses: in the last decade there has been so much talk about the N-tier approach and separating presentation, business, data access and data. So, how does using LinqToSql fit into this?

I love the idea of using the SQL db just for storage. So, without using LinqToSql as “inline SQL” (meaning in the presentation layer), are you suggesting we keep all LinqToSql in the data access layer?

PS. great post!
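For what it’s worth, one common answer to the question above is a repository class in the data access layer, so LINQ to SQL never leaks into the presentation layer; a rough sketch (type names hypothetical):

```csharp
using System.Collections.Generic;
using System.Linq;

public class PostRepository
{
    private readonly SiteDataContext db;

    public PostRepository(SiteDataContext db) { this.db = db; }

    public IList<Post> GetRecentPosts(int count)
    {
        // The query (and the DataContext) stay behind this boundary;
        // callers only ever see plain Post objects.
        return db.Posts
                 .OrderByDescending(p => p.CreatedAt)
                 .Take(count)
                 .ToList();
    }
}
```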

The whole point here is that the LINQ to SQL compilation time is trivial. It’s only a few extra ms per call, and that is happening on the web server, not the database server.

So, spending time compiling those queries will only give a slight boost in performance, and it is not worth doing if you’re only gaining a few ms.

I’d be more worried about the SQL that the LINQ compiler is generating. After all, the database runs SQL statements, not LINQ ones. It is a little tougher to debug since the actual SQL is abstracted away by LINQ. But you can still trace things out and see what is going on in SQL land.
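The built-in hook for that is DataContext.Log, which echoes the generated SQL and its parameters to any TextWriter (the context name here is hypothetical):

```csharp
using (var db = new SiteDataContext())
{
    db.Log = Console.Out; // prints every generated SELECT/INSERT/UPDATE
    var recent = db.Posts.Take(10).ToList();
}
```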

Anyway, if you want to avoid this, write a stored proc, open a connection, and run the proc with a command. Process the results with a data reader. Then there is zero LINQ compilation time.
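A bare-bones sketch of that path (connection string and proc name hypothetical):

```csharp
using System.Data;
using System.Data.SqlClient;

using (var conn = new SqlConnection(connectionString))
using (var cmd = new SqlCommand("dbo.GetRecentPosts", conn))
{
    cmd.CommandType = CommandType.StoredProcedure;
    cmd.Parameters.AddWithValue("@Count", 10);

    conn.Open();
    using (SqlDataReader reader = cmd.ExecuteReader())
    {
        while (reader.Read())
        {
            // Pull columns straight off the reader; no expression
            // trees, no translation step.
            int id = reader.GetInt32(0);
        }
    }
}
```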

Hmm… I hate to revive such an old topic, but this post shows up very high in search results involving LinqToSql and compiled queries.

The problem with the example posted is that it uses a **super simple expression tree** which takes almost no time to translate into SQL. Of course you’re not going to see performance gains in this scenario… It is the equivalent of dismissing database indexes because you don’t notice any improvement in your sample table with 10 rows!

Try creating a more realistic example, like the join-heavy queries a mid/large-size application would actually run. Expression tree compilation to SQL takes significantly more time as the tree increases in complexity (based on my experience, the growth does not seem to be linear).

In many of our queries we see speedups anywhere from 10x to 1000x by precompiling.

Of course we should never prematurely optimize, but I don’t think it takes very much complexity before LINQ compilation becomes the bottleneck. And it has nothing to do with the number of users or quantity of data - the thing that matters is the complexity of the expression tree that makes up your query.
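If you want to see it for yourself, here is a rough measurement sketch (context, tables, and the OrderLines association are all hypothetical, and `db` is an open SiteDataContext): run the same join-heavy query in a loop, uncompiled and then compiled, and compare. The numbers include the round trips, so treat the difference as approximate.

```csharp
using System;
using System.Data.Linq;
using System.Diagnostics;
using System.Linq;

// Translated to SQL once, up front.
var compiled = CompiledQuery.Compile(
    (SiteDataContext ctx, int custId) =>
        ctx.Orders
           .Where(o => o.CustomerId == custId)
           .SelectMany(o => o.OrderLines,
                       (o, l) => new { o.Id, l.Product }));

var sw = Stopwatch.StartNew();
for (int i = 0; i < 1000; i++)
    db.Orders
      .Where(o => o.CustomerId == 42)
      .SelectMany(o => o.OrderLines, (o, l) => new { o.Id, l.Product })
      .ToList();                          // re-translated on every pass
Console.WriteLine("Uncompiled: {0} ms", sw.ElapsedMilliseconds);

sw = Stopwatch.StartNew();
for (int i = 0; i < 1000; i++)
    compiled(db, 42).ToList();            // translated once, reused 1000x
Console.WriteLine("Compiled:   {0} ms", sw.ElapsedMilliseconds);
```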
