Compiled or Bust?

Heath_Noble · March 19, 2010, 12:00am

I appreciate your point that going back and re tooling now is a waste of time. I agree. But um, what’s so hard SQL in the first place?

NishitS · March 20, 2010, 12:00am

I really appreciate your point on the queries.

toxik · March 20, 2010, 12:00am

Welcome to the debate in using Python versus C#. Yes, there are other things people argue about the two languages (like static typing and so on).

In my experience, what anyone who argues in favor of C# will boil down to is “C# is faster”, and yes, it is. But when the whole stack is benchmarked, you’ll notice that a C# application versus a Python application gives response times in that same order of magnitude: a third of a millisecond or so.

RobW · March 20, 2010, 12:00am

@Shachart - Not true. SQL Server 2000 and later will perform auto-parameterization of queries in an attempt to reuse execution plans. (That being said, you really should be using explicit parameters wherever possible.)

See here: http://msdn.microsoft.com/en-us/library/aa175264(SQL.80).aspx

and here: http://msdn.microsoft.com/en-us/library/cc966425.aspx

sayap · March 20, 2010, 12:00am

All subsequent runs of that same query use the cached query plan and skip the query evaluation. Not so in Linq to SQL. As Rico said, every single run of the Linq query is fully parsed every time it happens.

No, what Rico said has nothing to do with cached query plan. He said that in every single run, the Linq query expression has to be parsed and converted to SQL. This is all happening in the app, whereas the query plan resides in the db.

Like Shachart said, whether or not a cached query plan will be used is database dependant.

Also, when you use ORM, you always do “5000 batches of 1 selects”. No sane person will use ORM to do “1 batches of 5000 selects”. That’s why Rico concluded his blog post with this statement:

“In short, if you expect to reuse the query at all, there is no performance related reason not to compile it.”

RaymondC · March 20, 2010, 12:00am

It is totally ununderstandable. Is this about converting a linq query into a sql query in the client/middle tier? Or is it about a RAM hash table in the database server with the execution plans of already executed queries?

So can anyone address MehrdadA’s remarks and questions?

The hash table with the execution plans is in Oracle a shared resource (all db connection use the same hash table), and because it is shared it is protected with spin locks. When there are many connections those spin locks will become a bottleneck because too many connections want to mutate the hash table and have to wait for a lock to continue.

This means that a proper performance benchmark has to be multithreaded benchmark if you want to load test an OLTP application! Has Jeff done a single threaded test or a multi threaded test?

So it is better to use parameterized queries in Oracle. Caching sql statements that differ in literals is also possible but you have to set a certain parameter. After setting this parameter, the oracle server will replace all literals with parameters and cache the execution plans.

However using parameterized queries is the preffered way in an Oracle based OLTP application.

Does Linq2Sql produce queries with parameters or litarals? Do you have to compile those queries if you want parameterized queries?

It is also very important to use parameterized queries if you want a good performance when you use Sqlite. Sqlite will become two or three times slower if you don’t use parameterized queries, you can’t call that a micro optimization. Sqlite doesn’t have a setting to chance this
behaviour.

John_Rusk · March 20, 2010, 12:00am

Surely the big lesson here is that instead of blindly following the advice of Rico Mariani (or Jeff Atwood) we should test to see whether compilation makes a meaningful difference in_our_particular_application.

Jeff is conducting an implicit test, by leaving his queries uncompiled and seeing how that turns out. It’s turned out fine in his case. I conducted the same implicit test on a recent project and it wasn’t fine. Performance was particularly bad when using a very broad inheritance hierarchy of L2S entities on the 32-bit version of the CLR.

@Jeff: By the way, your comparison may not be entirely valid. You wrote: “Then we compared it with the SQL variant; note that this is also being auto-cast down to the handy PostTag object as well, so the only difference is whether or not the query itself is SQL.” The point to note is that materializing the objects (i…e. constructing your PostTag objects) is something that is greatly improved by compilation. In other words, L2S compilation does not just affect running the SQL, it also affects instantiating the result set objects. (Rico covered this in one of his posts IIRC) I believe a more realistic test is to run two versions of a linq query - one compiled and one not. (Leave ExecuteQuery out of the picture totally). When I run such a test, the compiled version takes about 25% of the time of the uncompiled version. That’s for simple queries. When hitting a table that contains a very broad inheritance heirarchy, under the 32-bit version of the CLR, the performance differences is orders of magnitude.

VinceP · March 21, 2010, 12:00am

First of all, Rico’s article is 3 years old. The one I read anyway.

It’s not true that all LINQ to SQL queries are evaluated by the database. All LINQ queries generate parameterized SQL queries meaning the execution plan is evaluated ONCE and cached on the server.

I don’t know where Rico got his information from.

jdege · March 21, 2010, 12:00am

@John Rusk: Surely the big lesson here is that instead of blindly following the advice of Rico Mariani (or Jeff Atwood) we should test to see whether compilation makes a meaningful difference in_our_particular_application.

No. We should write our apps in the way that is most easily maintained, and not even worry about whether compilation would make a meaningful difference unless and until we determine that we need to speed things up, and then only after we’ve established by profiling that the compile time is what is slowing us down.

“We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil”

Donald Knuth

John_Rusk · March 21, 2010, 12:00am

@jdege

That’s the implicit test I mentioned: build it, and see if you have a problem. I did not mean to imply that was a bad thing to do. Generally, it’s my preferred approach too.

My concern with this thread is that in_some_circumstances LINQ to SQL compilation is not one of the “small efficiencies” which Knuth criticizes, it can be a make or break issue for an application. This thread seems to inappropriately trivialize that possibility.

I’m not advocating doing something “just because its faster”, at the expense of maintainability; but rather being aware of problems which may arise, so we can make sure to test and profile the application thoroughly enough, and early enough.

sayap · March 21, 2010, 12:00am

@Vince Panuccio

Jeff didn’t fully understand Rico’s post, and then misguided you with his incorrect interpretation. Rico’s post is about LINQ to SQL compilation, which happens at the client/app side. After that, whether a cached query plan will be used, is database/driver/config dependent.

AndreiR4 · March 22, 2010, 12:00am

@ RaymondC : “Does Linq2Sql produce queries with parameters or litarals? Do you have to compile those queries if you want parameterized queries?”

LINQ to SQL produces parameterized queries, not literals.

You do not have to compile those queries to have parameterized queries. Parameterized queries are the only output option.

FrancisF · March 22, 2010, 12:00am

Like Saint Jerry Pournelle used to say “better is the enemy of good enough”. (I miss Byte, I really do).

I work with Rails and HAML is supposed to be faster for dynamic content. But, it doesn’t look like HTML and designers freak out (sometimes). I lost work one time because the guy who was paying me didn’t understand it.

So, you pick HAML because it meets some other need. Raw speed is irrelevant, particularly if you’re cacheing (and you are, aren’t you?).

Like you say, Jeff - micro optimisation is sub optimal.

AbeM · March 22, 2010, 12:00am

Awesome post. I just started using LINQ to SQL at my new job and was curious how it stacked up performance-wise. Thanks for the info.

jdege · March 22, 2010, 12:00am

@John Rusk: “My concern with this thread is that in_some_circumstances LINQ to SQL compilation is not one of the “small efficiencies” which Knuth criticizes, it can be a make or break issue for an application.”

My concern is that folks who’d read this thread, when dealing with an application that clearly has inefficiencies that are a make or break issue, will jump immediately to the conclusion that since they’d read about LINQ to SQL compilation being a problem, it must be their problem.

It might be, or it might not be. There are plenty of other possibilities. It’s important that you identify that this is your problem, before you waste effort on trying to fix it.

The article in which Knuth made his “premature optimization” remark is available online: Structured Programming with go to Statements. The succeeding paragraph is instructive:

Yet we should not pass up our opportunities in that critical 3%. A good programmer will not be lulled into complacency by such reasoning, he will be wise to look carefully at the critical code; but only after that code has been identified. It is often a mistake to make a priori judgments about what parts of a program are really critical, since the universal experience of programmers who have been using measurement tools has been that their intuitive guesses fail.

Ric_Johnson · March 22, 2010, 12:00am

Microsoft has ended support for Linq To SQL and moved the team to Entities Framework. And most people prefer to us Fluent Nhibernate anyway because it is compiled.

Skippyfire · March 22, 2010, 12:00am

I have analyzed this a lot and I think YOU ARE WRONG!!!

(I apologize if I am way off on this, but…)

Think about it. Provided that you are caching the compiled queries at an Application level, you are NEVER running the query JUST ONCE!!! If you store it at an Application level, it is being run for every single user, every time they hit that page!!!

You DON’T create a new compiled query every page load and use it once. You compile it at application start up or when it is first used, and then used that compiled query in EVERY SINGLE CALL TO THE DAL!

Am I right?

JasmineA · March 22, 2010, 12:00am

No, you’re not exactly right Skippyfire… when you do the straight up LINQ thing, it gets JIT compiled… so it happens every time. That’s kinda the whole point about the article - that the JIT method is not so good.

Skippyfire · March 23, 2010, 12:00am

@Jasmine: Your statement is somewhat confusing, so let me be as clear as I can.

Compiled queries are ALWAYS faster. In an ASP.NET Application, when do you use a query JUST ONCE? Never! That query is used by thousands of different requests!

So the usage scenario of using a query “just once” never occurs. I hope I am explaining myself better.

As for this whole thing being a micro optimization, that might be true… but I still think the “performance tests” from Jeff and Rico are highly misleading.

Skippyfire · March 23, 2010, 12:00am

Oh yeah, you just need to make sure that the compiled query is being “cached”. So you need to store it in an application-wide location so it doesn’t get compiled every time.