Embracing Languages Inside Languages

Martin Fowler loosely defines a fluent interface thusly: "The more the use of the API has that language like flow, the more fluent it is." If you detect a whiff of skepticism here, you're right: I've never seen this work. Computer languages aren't human languages.


This is a companion discussion topic for the original blog entry at: http://www.codinghorror.com/blog/2007/10/embracing-languages-inside-languages.html

If only someone could fluently resolve the mess of PHP, CSS, HTML and Javascript (cringe) that form a webpage…

Great post, by the way (and apologies for the double post), but I had an observation: as terrible as object wrappers are, I can see merit in not hard-coding SQL throughout a project, for instance to isolate the code that will need altering if the database changes. Although I guess you could equally have "SELECT * FROM " + dbName and still avoid object wrappers.

I don’t hate my job so much now.

Interesting points.

I can see one benefit such abstractions could give you though, the chance to avoid different implementations of the same language by abstracting that away to a library.

For example, we all know that SQL is slightly different on Oracle, MS SQL, MySQL etc. If someone had written the proper adapter in the random object filled language, we no longer need to worry about the lower level (or restricting ourselves to the lowest common denominator) SQL implementation after loading that adapter.

Still, I agree with you, most such levels lack elegance, and hide too much of the detail that every developer should know.
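As a rough illustration of the adapter idea above, here is a minimal sketch in Python. The `Dialect` class, its `first_rows` method and the `Customer`/`createdAt` names are all hypothetical, invented for this example; only the three "first N rows" spellings (`LIMIT`, `TOP`, `FETCH FIRST ... ROWS ONLY`) come from the databases themselves.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dialect:
    """Hypothetical adapter: knows how one database spells 'first N rows'."""
    name: str
    style: str  # "limit", "top", or "fetch"

    def first_rows(self, columns: str, table: str, order_by: str, n: int) -> str:
        n = int(n)  # keep the row count strictly numeric
        if self.style == "top":
            # SQL Server puts the row limit right after SELECT.
            return f"SELECT TOP {n} {columns} FROM {table} ORDER BY {order_by}"
        if self.style == "fetch":
            # Oracle 12c and later use the standard FETCH FIRST clause.
            return (f"SELECT {columns} FROM {table} ORDER BY {order_by} "
                    f"FETCH FIRST {n} ROWS ONLY")
        # MySQL (and several others) append LIMIT at the end.
        return f"SELECT {columns} FROM {table} ORDER BY {order_by} LIMIT {n}"


MYSQL = Dialect("MySQL", "limit")
SQL_SERVER = Dialect("SQL Server", "top")
ORACLE = Dialect("Oracle 12c+", "fetch")

for dialect in (MYSQL, SQL_SERVER, ORACLE):
    print(dialect.name, "->", dialect.first_rows("id", "Customer", "createdAt", 1))
```

The calling code asks for "the first row" once; only the adapter knows which spelling the loaded database expects. A real adapter would also have to deal with quoting, parameters and type differences, which this sketch ignores.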

The SubSonic example you gave looks just like what the ‘pure’ C# 3.0 looks like (i.e. a bunch of rather handy extension methods for IEnumerable). The SQL-like syntax is just an ‘added extra’ on top of this. Personally I prefer the non-SQL-like syntax, because it offers a much clearer picture of what the compiler is actually doing, but I guess that’s a matter of personal taste (and is probably related to the fact that I’m more often querying in-memory collections or XML documents than I am database tables).

Incidentally, your LINQ example relies on an autogen Customer object which maps to (presumably) a Customer table in the database. The four points of “rationale behind these types of database code generation tools” absolutely apply to LINQ as well.

I’m not familiar with LINQ, but can you do something like this (excuse the pseudo-code)?

var baseQuery = from Customer in Customers select id
var firstId = baseQuery orderby createdAt limit 1

I think the strongest argument for making something fully object-oriented, with a real live grown-up API is that you get to use standard OO methodology to get things done. So I’m all sorts of OK for making my language have SQL, XML, etc. embedded in it as something other than clunky APIs or landmine strings, but I’d want that to be syntactic sugar on top of a standard object-oriented library.

Now if only someone would get creative with a packrat parser and a compiler…

PS. totally agree about the regex example. Show that to any decent programmer that wasn’t brought up on java or C# and see how they laugh. Regex only looks intimidating; in fact it’s pretty simple, and the basics can be learnt in under an hour.

Basically agree with you here, Jeff. In spite of my lesser abilities with Regex, I do see the benefits. It’s on my TODO list to master them. I see it as a sort of math for strings:

a = (b + 1) * a ^ 2

is a lot better than (in fake OO code):

a.valueOf(b.plus(1).times(a.pow(2)))

There seems to be a tension between expressiveness and fluency.

I like:

a = [:] (Groovy - I’m sure other dynamic languages have something similar)

a lot better than:

import java.util.HashMap;
a = new HashMap();

Since Groovy adds a whole bunch of symbolic operators to handle collections, it actually makes the code easier to write and easier to read once you know the inner language.

On the flip side, it would be easy to get carried away with the stuff. Operator overloading has a bad reputation for just this reason. A little goes a long way.

Regex, SQL, and within Groovy, the collection operators, are universal enough that learning a domain specific “inner” language is certainly worth the trouble.

Regards,

Matt

I think your RegEx example is valid because regex is very similar across all platforms. The problem with the database example is that databases are not similar across all platforms. Even that extremely simple SQL sample you wrote won’t run in SQL Server. If you want to support multiple databases then somewhere that SQL code has to be abstracted away. You could put it in stored procs and that works for many scenarios, but even stored procs are vastly different from one DB to another. And stored procs are a pain when trying to create ad hoc queries, which is where SubSonic or LINQ can be useful.

Jeff - SQL is a string, not a line of code and just writing it doesn’t pull it from the DB into your app. The lines of code you omitted do that :p. But you raise a good point… anything can be abused yes?

For what it’s worth, you can use a Query (1 line) to do this:

IDataReader rdr = new Query("Customer").WHERE(Customer.Columns.Country, "USA").OrderByAsc(Customer.Columns.CompanyName).ExecuteReader();

Coda Hale:

Absolutely, yes. The result of a query is itself queryable, so queries may be chained in the manner you suggest. In fact, behind the scenes that is exactly what the compiler does. So you could say something like:

var ids = from c in customers select c.id;
var idssorted = from id in ids orderby id select id;
var idfirst = idssorted.First();

(You can think of queries as monads, if that helps.)
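For readers without a C# compiler to hand, the same "a query result is itself queryable" idea can be sketched in Python. This is only an analogy, not how LINQ is implemented; the `Customer` class and the sample data are made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    id: int
    created_at: str  # ISO date string, so plain string ordering works


customers = [
    Customer(3, "2007-03-01"),
    Customer(1, "2007-01-15"),
    Customer(2, "2007-02-10"),
]

# Stage 1: a lazy "query" over the source; nothing is read yet.
base_query = (c for c in customers)

# Stage 2: a second query built on the first. sorted() is eager in Python,
# so this is the point where the base query is actually consumed.
by_created = sorted(base_query, key=lambda c: c.created_at)

# Stage 3: project to ids and take only the first one.
first_id = next(c.id for c in by_created)

print(first_id)  # 1
```

Each stage takes the previous stage's result as its input, which is the chaining asked about earlier in the thread. Unlike LINQ, Python's `sorted()` does not defer execution, so the analogy holds for composition but not for laziness.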

To the guys talking about abstracting away the underlying database:
I understand that it seems like a solid goal, to abstract away the DB layer through a middle-layer… but think about this:
How often does your app change underlying database?
When it does change, doesn’t a giant portion of the code end up having to be rewritten anyway?
How much time is spent abstracting things to the point where you see no SQL, would that time be better spent elsewhere?
What kind of performance hit do you take by avoiding the performance-enhancing, time-saving features of your DB in the name of abstraction?

I think there’s a point of diminishing returns in abstracting everything to that level… By the time we need to change the underlying DB, I can get actual time allocated to do the conversion, whereas the initial creation of the project seems to have a stricter deadline outside of my control… Also, if you’re changing your underlying database system, that usually indicates that there’s more wrong with your system and it’s about time for an app-wide refactoring and/or rewrite.

Now maybe if you’re only using the database for persistence, instead of treating the data as the top priority, abstracting the db layer extensively may make more sense.

Just some observations,
Cheers!

That code is in no way object oriented. Simply putting methods inside a namespace and calling it an object is not object oriented programming.

Check out Seaside for an object oriented way to generate HTML. Now THAT is object oriented.

Jeff: Thanks for the shout-out. We are very proud of LINQ.

Note however that I don’t think of LINQ as “embedding SQL in C#”. Rather, I think of LINQ as “providing abstractions for the ideas of sorting, filtering, projecting and grouping which work across arbitrary data”.

That those operations are particularly useful for SQL Server databases, and that we have a particularly clever transformation from LINQ expressions to SQL is of course delightful, but that’s only a part of the power of these abstractions. We want to be able to apply these abstractions to ALL data, whether it’s stored in XML, a database, arrays in memory, arbitrary object graphs, web services, whatever.

While I appreciate the sentiment that “good developers should just buckle down and learn regexes”, the fact is that while they are a language, there is really nothing in them that jogs a coder’s memory as to how they work. I can’t tell you how many times I’ve had to relearn regexes because their syntax is completely unhelpful to someone who is not using them on a daily basis. There have been times when I was more or less fluent and, after long periods of not using them, even things I’d written were incomprehensible garbage to me. I agree the object-oriented approach above is really clunky, but essentially someone who has never seen regexes can pretty much understand what is going on.

Your point seems to be that these workarounds make the code less succinct, which is true, but the reason they exist is to improve clarity and maintainability, not brevity.

@intangible - well I’ve rarely switched a database midstream, but when evaluating a web-based product to install on my servers, I do have to make a choice of a database up front.

Open source projects such as blog engines deal with this a lot. You might want to run my blog engine on MySql while I run it on SQL Server. If I embed SQL all over my app, then I’m stuck.

Thus, I need to abstract away the database. LINQ is actually an abstraction that looks closer to SQL, but isn’t SQL.

Perl supports the /x flag on in-line regular expressions, letting you put extra whitespace and comments in the regular expression for readability.
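Other regex engines offer the same facility under different names. As a small sketch in Python rather than Perl, `re.VERBOSE` plays the role of /x; the phone-number-like pattern is just an invented example.

```python
import re

# Compact form: correct, but hard to scan.
compact = re.compile(r"^(\d{3})-(\d{4})$")

# Same pattern with re.VERBOSE: whitespace is ignored and '#' starts a comment.
verbose = re.compile(r"""
    ^            # start of string
    (\d{3})      # three-digit prefix
    -            # literal hyphen
    (\d{4})      # four-digit line number
    $            # end of string
""", re.VERBOSE)

for candidate in ("555-1234", "5551234"):
    print(candidate, bool(compact.match(candidate)), bool(verbose.match(candidate)))
```

Both patterns accept and reject exactly the same strings; the verbose one simply carries its own documentation.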

Languages that don’t have native regular expression support, like PHP, where regular expression literals are really run-time parsed string literals, are a bear to work with. You have to contend with the double meaning of backslash when using pcre. There’s a lot of odd docs/notes/hints on the PHP website about using double-backslashes in your pcre string literals when quoting your strings with double quotes (however, it fails to mention that this ceases to be a problem if you single quote enclose string literals that contain regular expressions).
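The double meaning of the backslash is not unique to PHP. A minimal sketch in Python, used here as a stand-in, shows the same trap and the same kind of escape hatch: Python's raw strings do roughly what single-quoted strings do for PHP in the case described above.

```python
import re

# In an ordinary string literal the language processes backslashes first,
# so each backslash meant for the regex engine has to be doubled.
doubled = "\\d+\\.\\d+"

# A raw string leaves backslashes alone, so the pattern reads as written.
raw = r"\d+\.\d+"

# Both literals produce exactly the same pattern text.
assert doubled == raw

print(re.findall(raw, "pi is 3.14, e is 2.71"))  # ['3.14', '2.71']
```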

JavaScript is interesting in that it doesn’t have the same kind of variable interpolation that Perl does, so you can’t interpolate variables into regular expression literals (which evaluate as regular expression objects), but you can pass a string (perhaps as the result of a series of concatenations) to a regular expression object constructor, and get the same kind of regular expression object back.
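The build-a-string-then-construct approach has one catch worth showing: any regex metacharacters in the variable part need escaping first. Here is a hedged sketch in Python, where `re.compile` stands in for JavaScript's regular expression constructor; `build_pattern` is a hypothetical helper written for this example.

```python
import re


def build_pattern(literal: str) -> re.Pattern:
    # re.escape neutralises characters such as '.' in the variable part,
    # so it is matched literally instead of as "any character".
    return re.compile("version " + re.escape(literal))


pattern = build_pattern("1.0")

print(bool(pattern.search("version 1.0 released")))  # True
print(bool(pattern.search("version 1x0 released")))  # False
```

Without the escaping step the second string would match too, because an unescaped dot matches any character.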

Hey Now Jeff,
I liked this post. When reading the beginning I was thinking how classic asp would compare to LINQ with VB.NET or C#. I really liked that example when I read it (the one with Perl).
Coding Horror fan,
Catto

This really should be 2 posts.

Some languages are never pretty, and it is difficult to write them so that they are relatively easy to read. On the other hand, some people go out of their way to make a readable language unreadable. Regular expressions fall into the first category.

LINQ will find its way into mainstream, unfortunately. For the sake of “object”-tiveness and Intellisense if nothing else. Once again we add a layer of abstraction (and ultimately less efficient SQL), and make our apps a little more brittle.

Your C# 3.0 example is mindlessly simple, and not a fair representation of what LINQ can or cannot do.