All Abstractions Are Failed Abstractions

I wonder, for your sample table / query, if nesting it works better.

SELECT * from table where ID in ( select top 48 id from table where [clauses] )

Theoretically gives the sql server a chance to optimize that it doesn’t get from your independent queries, and gives some additional hint that you don’t actually want everything from all entries then filtered to the top 48, but just everything from the top 48, which are subtly different.

LINQ Rules! I can’t believe developers are still writing SQL Strings and using SQLCommands and SqlConnections. I’ve been able to develop two applications in the time it would take me to develop just one, if I weren’t using LINQ.

a lot of people miss the entire point of encapsulating data access with stored procedures - it’s (also) security. DBA’s (myself included) don’t want user app’s to have direct access to the tables. Using Linq requires that, so security is out the window. Using stored procedures can not only provide more efficient data access, but provide for a designed security layer.
-dennis

2 Dennis Parks: thank you! finally someone pointed this out! I was surprised it took so many posts…

While I place importance in the performance of any piece of code, I must admit that one of the driving forces in using heavily abstracted technologies – that by their very nature add heaps of overhead – is the client.

Clients are less interested in how properly coded their product is than the time-to-production.

I just went through an application that was using a vanilla datagrid to display a a paged result set, but the paging was being done client side and as a result a quarter hundred thousand records were being pulled from the database each page, while onlyl a hundred or so were being displayed.

My guess is that the original developer had not thought how many records there were going to be, instead focusing on the business logic around the abstracting control (the datagrid).

This abstraction was causing intermittent problems with resource usage (although I have not explicitly proved this, I think it is most likely), it had finally leaked.

Now this abstraction may have saved a few hours of coding but in the end it caused more hours of support work, and some user dissatisfaction.

It’s not always clear when your abstraction will bite you in the @$$ at the end of the day, but it is important to know the pitfalls your abstraction will have, and when they may occur.

I just hate the phrase “shoring up.” It is such a Palinism. And also (intended), I think you could have just linked to Joel’s original article and been done with it.

And also (again, intended), there are many, many much better tools than LINQ-to-SQL. I think this stuff is just throwing stones at glass houses that both save more time and commit far less of these kinds of mistakes.

@Dennis Parks - LINQ2SQL can use stored procedures:

http://weblogs.asp.net/scottgu/archive/2007/08/16/linq-to-sql-part-6-retrieving-data-using-stored-procedures.aspx

Why not take your abstraction to the next level and sit your application code over an object model by using an ORM like JDO or JPA?

Then you get to use inheritance, polymorphism and interfaces (all forms of abstraction provided by all good OO languages) which are the crucial tools used to prevent code bloat in large applications.

You’ll also find the amount of SQL code you have to write will drop to virtually nothing as the ORM does most of the mind numbingly, ‘boring as batsh*t’ SQL ‘glue code’ for you automatically - and yes, it even performs well, very well!

Why would anyone want to use LINQ. I refuse to use it. I’m going to select data using real TSQL or PLSQL so I can get EXACTLY what I want and be done with it. Unbelievable how people will jump on the MS bandwagon for any new thing that comes out. Just makes me sick actually. I’m telling every potential employer the facts about LINQ - how it sucks so bad - and how I refuse to use it in my applications.

Why do you want to abstract something simple like SQL? Just makes no sense.

Jeff, what the hell?

You’re getting a “select *” in your SQL because you used the IDENTICAL “select p” in Linq. Change the Linq to “select p.Id” and you’ll get just a “select Id” in the SQL.

This isn’t a leaky/failed abstraction at all. Please tell me that you actually knew this all along, but just chose a really terrible example to make your point?

I get the whole thing about indexing and datatype sizes and yaddayadda, but if that is your point, then you have the exact same problem writing actual SQL without the abstraction. You can only deal with those other issues if you have control over the database design itself.

It’s unfortunate that legions of developers will scan the post, not read the corrections in the comments, and will endlessly parrot “spolsky and atwood said not to use LINQ to SQL”.

As several other people commented, if you only want certain fields, you should select them. I think the point you were getting at was that the ORM should figure out what fields you’re eventually going to use, but that seems like it might set off some NP alarms.

I question why Jeff is using Linq at Stack Overflow at all?

Is Stack Overflow ever going to run on a non-sql server platform?

I would only consider Linq if we needed to support more than one DB platform. Running your own site on a specific platform doesn’t meet that criteria. In that case you are better off writing Stored Proc and maximizing database performance and dealing with SQL and the RDMS engine directly.

Linq to SQL is an abstraction? Not by my definition of abstraction. It’s a domain specific embedded language that allows you to write SQL in C# - the C# AST fragment that the LINQ expression is translated to has a one-to-one correspondence with the SQL that is generated.

Everything you do is an abstraction. C# is an abstraction of the CLI. The CLI is an abstraction of C. C is an abstraction of assembler. Assembler is an abstraction of the microcode. If you want no abstractions, the only option is a field programmable gate array. Yes, they are fast. The point is that you need to decide whether the performance hit and leaks are worth the reduced programming effort.

The abstraction argument is so pathetic (IMHO).

What is sql, c#, filesystems, etc except one big fat abstraction after another on top of ones and zeros.

After seeing LINQ used for database abstraction in a large project, I think it was a huge mistake (although LINQ expressions are very useful).

A better abstraction, I think, is to have a RESTful interface over HTTP that applications use. (The actual RESTful queries would not need any abstraction between it and SQL, and would be closely coupled to the schema.) But all applications downstream would essentially have a simple, clean, domain specific interface for manipulating data objects.

This doesn’t solve all data problems – certain schema changes would require changes to the RESTful interface. But if done right, I think this abstraction is about as leak-proof as you can get, when dealing with databases.

As a developer, I simply cannot do my job unless I understand not just database structures but how the database works. To be honest, I need a good understanding of the underlying OS and storage layouts as well. This needs to be at a more than rudimentary level of knowledge and employers are beginning to realise that the best developers are those people who have a decent amount of experience in all these areas.

Abstractions will be at their most useful when we break down the silos between developers, DBAs and Admins and we are all able to understand exactly what it is the abstraction is hiding.

At this point, abstractions will still be leaky, we will just understand the leak and probably either avoid it or take advantage of it.

Abstractions tend to work better if one starts from concrete then abstract on-demand and refrain from going further until needed.

Anyhow, I think this LINQ problem could actually be a case of optimizing at the wrong level because moving a time consuming task out of time constrained path is simpler and longer lasting than making it fit now and having to revisit again when requirements change, usually negatively.

You say all abstractions are leaky. This is true but some abstractions are more leaky than others. You don’t explicitly say it, but SQL is in itself a very leaky abstraction, too leaky. It’s the big mistake of backend web development. Why was one seemingly identical query much worse than another? SQL is an abstraction over imperative code, using SQL as an api over a database abstract decades of algorithm theory away from the user. If databases are the bottleneck of web development then the api to access this bottleneck should have as little leaks as possible meaning that if our computers are imperative machines with imperative instructions then the api should be imperative as well. Who in there right mind decided that the api should be an SQL expression? Of course it’s going to be leaky.

Then someone tries to be smart and put another abstraction (LINQ) over SQL to try to make it imperative again. Genius.

The solution to fix the problems with LINQ is of course to put another query/expression based abstraction over it.

SQL is in no way an abstraction; it’s a domain specific language designed specifically for sets and databases.