Reddit: Language vs. Platform

Look at a few frameworks and pick a well supported one which:

a) Provides a simple toolkit of single-function tools you can use to Get Things Done.

AND/OR

b) Does more of what you want ‘out of the box’ than the others.

Then get about doing whatever it takes to finish the project. Scaling will push any framework. You will need to work around bottlenecks regardless of the platform.

Does this have anything to do with php?

I usually see PHP used in forums and stuff. Though I never knew ‘Python’ and ‘Lisp’ were web application languages.

At the end of the day much (most?) of it is about reducing network traffic and disk activity. Twitter looks like it demands a lot of each, and therefore the architecture must be chosen and designed to support this.

As an aside, I clicked on a few links in Twitter, but had no desire to return. Am I missing something obvious? Why is this site so busy? Maybe I’m too old to appreciate it?

This is why I’m very excited about tools like JRuby. Java, for better or worse, has a much larger “platform” than any other language in common use today. There are libraries and frameworks available to do just about anything imaginable. Integrating a “better” language with the Java platform automagically gives that language all the power of that platform.

Well written post, Jeff. To be honest, I thought the previous one was a bit trollish in light of all the hullabaloo over Twitter vs DHH vs Python vs ad nauseam.

In any case, Joel Spolsky also lies in the “it’s the platform, stupid” camp. But I propose that it’s properly called the environment, not “just” the platform. The environment encompasses everything a project lives in, and everything that makes a developer’s life heaven or hell: the IDEs, the OS, the databases, the framework(s), the online community, the forums, the documentation, the language, the sugar, etc. Suffice it to say, there is no perfect environment, just trade-offs.

So pick your battles, write your software, and go make money!

We like money.

Having recently spec’d and designed a project to be built upon Rails, I did much research into the framework. It’s nice, but as I dug deeper I realized that there would be scalability issues, as well as some not so “out of the box” functionality that would need to be implemented. I recommended ASP.NET. (As for all you MS haters out there, of which I count myself one too, .NET is really nice. I just love Linux, and the mono-project.com guys rock.) To this day they are still struggling to implement the project, since the developers they decided to hire wowed them with “look how fast we can develop in ROR”. Don’t get me wrong, though: I got pretty excited about ROR and was able to slap together a nice test site, and I still like it. The point is that ASP.NET made sense for what they wanted to accomplish, but they got swept up in the shininess and are now paying (a lot) for it, and they’re not even at the third phase of the project.

Also if you want another ASP.NET (cross platform) “framework”/CMS see mojoportal.com.

nitpick: “dynamic languages like Ruby, Lisp, and Python that will never be known for their high octane, nitro burnin’ performance levels”

Some Lisps are known for extremely good performance: there’s fifty years of compiler technology hidden behind all those parentheses. :wink: Other dynamic languages also sport tremendously efficient VMs – Smalltalk’s Strongtalk VM comes to mind as a great example.

I see that language performance wasn’t the point of your post, but we shouldn’t perpetuate the fallacy that “dynamic” must equate to “slow”.

You should continue to write. IMHO you make sense sometimes, and the ‘sense’ is not necessarily all in the same post all the time.

The comments here are fascinating (to me). People seem to have some odd conceptions about Rails as they attempt to invent performance problems for it. “Typically this is because multiple requests are made to the database in order to follow the object hierarchy, rather than making a single request to get the one piece of data you need to fulfil this request.” Rails is all about taking common problems like this and making them easy. If I want to join data from another table in a query result, aka “eager loading”, it’s pretty simple. You can even nest them, etc., it’s a beautiful syntax (and saves tons of queries along with keystrokes).

# given a users table that “has_many” events
# join events based on a defined fk user_id in a table called events
the_user = User.find(id, :include => :events)
for event in the_user.events
  puts event.name # requires no additional trip to database
end
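
To make the query-count difference concrete, here is a minimal plain-Ruby sketch. It does not use Rails or ActiveRecord at all: FakeDb, lazy_event_names and eager_event_names are hypothetical names made up for illustration, and the “database” is just a pair of arrays that counts how many lookups it receives. The eager version also uses one batched lookup rather than a real SQL join, which is a simplification of what the snippet above describes.

```ruby
# Hypothetical in-memory stand-in for a database; it only counts "queries".
class FakeDb
  attr_reader :query_count

  def initialize(users, events)
    @users = users
    @events = events
    @query_count = 0
  end

  # One query: fetch every user row.
  def all_users
    @query_count += 1
    @users.dup
  end

  # One query: fetch the events belonging to a single user.
  def events_for(user_id)
    @query_count += 1
    @events.select { |e| e[:user_id] == user_id }
  end

  # One query: fetch the events for many users at once, grouped by user id.
  def events_for_users(user_ids)
    @query_count += 1
    @events.select { |e| user_ids.include?(e[:user_id]) }
           .group_by { |e| e[:user_id] }
  end
end

# Follows the object hierarchy: one extra query per user.
def lazy_event_names(db)
  db.all_users.flat_map do |user|
    db.events_for(user[:id]).map { |e| e[:name] }
  end
end

# Loads all needed events up front: two queries regardless of user count.
def eager_event_names(db)
  users = db.all_users
  by_user = db.events_for_users(users.map { |u| u[:id] })
  users.flat_map { |user| (by_user[user[:id]] || []).map { |e| e[:name] } }
end

users = [{ id: 1 }, { id: 2 }, { id: 3 }]
events = [
  { user_id: 1, name: "signup" },
  { user_id: 2, name: "login" },
  { user_id: 2, name: "logout" }
]

lazy_db = FakeDb.new(users, events)
lazy_event_names(lazy_db)
puts "lazy queries:  #{lazy_db.query_count}"  # one for users, plus one per user

eager_db = FakeDb.new(users, events)
eager_event_names(eager_db)
puts "eager queries: #{eager_db.query_count}" # one for users, one for all events
```

With three users the lazy version makes four lookups and the eager version makes two; the gap grows linearly with the number of users, which is the pattern the quoted complaint is describing.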

I also find statements like this odd: “While in C# for example I can do a string.split and basically create an entire array with 1 line of code. For ease and productivity you lose out somewhat on performance.” Is this making the assumption that the programmer can write a more efficient split function than the authors of C#? Could be true, but in my case, it most likely is not true at all. Plus, mine would have at least one hidden bug in it… I’ll use the split function, thanks.

Same guy writes- “This is where a DBA can come in really handy to help write the most optimized and sometime ugly SQL. It is quicker to write the SQL statement in the “code layer” but really it belongs in the database layer” Really? First, why can’t the DBA help write the SQL in the code? Second, how are you going to call your SQL that is in the database without SQL (or SQL generation) in the code? I’m not saying you can’t go the route of lots of database code and get good performance, but it’s certainly not the only way.

Jeff,

Big fan of the blog. You do an excellent job.

A friend of mine and I were talking about your last two posts and it occurred to us that no one has really taken the time to analyze the bigger web2.0 companies with a write-up of their platforms and languages.

Maybe this is something you’d be interested in doing. I know I and probably others would be interested in reading.

Keep up the good work!

Hey Jon-

Love to have discussions in comments on someone’s blog! Fun. I see your point now, in terms of database server utilization versus raw query performance. I’ve just had the opposite experience with stored procedures for crud operations on one huge system I worked on, but I have come to understand that I was working with a really bad DBA that didn’t understand she was killing the overall application performance by writing stored procedures that returned stuff in a format that was inconvenient for the application, just so her code would run faster. Local optimization problems…but what made me wonder is that I usually ended up wrapping the stored procs in a select so that I wouldn’t have to pull back 2 blobs in multiple rows on each query. Would that kill the gains you mention? Or is it things like joins that you have found to run faster in stored procs versus sql queries?

Jeff, as another Lisper who generally enjoys your posts very much, I have to object to the factual error of lumping Lisp with Python/Ruby performance-wise. Whatever else Lisp may be, it’s not slow; performance can be comparable to C/C++ and in some cases even better. See the following for starters:

http://lemonodor.com/archives/000180.html
http://bc.tech.coop/blog/040308.html

I can’t comment on why specifically Twitter is having performance problems, but in general you can make this assumption: the closer the underlying code is to the machine layer, the faster it will run. There is no doubt about that. Now, if the code is not optimized, all bets are off. So why use languages that are not close to the machine? Supposedly they will be easier to use and implement. For example, in assembly each operation performs a very small amount of work, so it takes a lot of code that is hard to understand (read). While in C# for example I can do a string.split and basically create an entire array with 1 line of code. For ease and productivity you lose out somewhat on performance.

Whenever the word abstraction is used, performance will suffer. I believe someone posted that some of the tools (Rails, for instance) abstract out the database. Now if you are writing an application with database operations and there will be a lot of users, be prepared to write some tedious data layer code. Even the small things (Select * versus Select column1, column2, etc.) help out. This is where a DBA can come in really handy to help write the most optimized and sometime ugly SQL. It is quicker to write the SQL statement in the “code layer” but really it belongs in the database layer. I’ve seen plenty of applications that dynamically build SQL and send it in to the database, and this will work for small numbers of users, but this is not the way to go for any large scale application with very specific performance needs.

There is always this discussion about how to tie the code and the data layer together, and frankly it is nonsense. Let the database do its work and communicate the result sets back to the code layer. As long as the communication layer is nice and fast, let’s not try to combine the two.

For large systems, stored procedures are the only way to go. With a small number of users, stored procedures and inline SQL will perform about the same. Where stored procedures really pay off is with large numbers of database operations. This is because with stored procedures, the database engine doesn’t have to work as hard to deliver the results. Less IO, memory, etc. means more scalability. So instead of handling 1,000 users, the DBMS can now handle 10,000 users with the same CPU and memory profile.

Every application will hit a wall, but if you can support your users with 50 servers instead of 500, then you have earned your paycheck as a developer.

To put a variation on Box’s famous quote - All frameworks suck, some are useful.

Hey Matt M-

Thanks for the comments. I used the C# string.split statement to show that with frameworks such as .NET, you can do more with less code. What I was trying to convey is that you don’t have to know the details of how the string is being manipulated and put into the array; you just have to use the method and get back an array of string parts. If split is implemented badly, then the framework is giving your application a performance problem. I guess I gave the wrong impression that writing your own implementation instead of using the native string.split would give a performance boost. My point was that the framework will give you increased productivity but it may impact performance. You have to trust that all the implementations are good.

Second, there’s nothing wrong with putting all database related activities in the code.

However, if one implementation uses all stored procedures and the other makes its SQL calls from the code layer, the one with stored procedures will not put as much work on the DBMS (memory, CPU, etc.) as the one with the SQL in the code layer. The system with stored procedures will scale much better than the one with ad hoc queries. On one application I did maintenance on, all SQL was written in the code and the DB server (a mid-tier Unix box) was working really hard to keep up. So we compared some of the ad hoc queries that were going in against equivalent stored procedures. The execution time was about the same (very small differences, in hundredths of a second, to return results). However, the memory and CPU load that the DBMS put on the Unix box was very different, with the stored procedures using far fewer resources than the ad hoc queries.

That was my point, stored procedures will work the database server less.

Whether or not you choose to dynamically generate SQL is up to you, but any query, update, or delete statement can be parameterized into a stored procedure.

If one has implemented some sort of dynamic database schema, then stored procedures may not be possible. For example, say you want the users to be able to add a data field to the system automatically without actually adding the column to a table in the database. You can do that with a generic data model and query that model with ad hoc dynamic queries, but the price you pay for that flexibility is a loss in overall performance.
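
As a rough illustration of what such a generic data model looks like, here is a small plain-Ruby sketch. Everything in it is an assumption made for the example: the ROWS data, the field names, and the pivot and find_ids helpers are hypothetical, and a real system would keep these rows in a database table rather than an array. The point is only that every field value becomes its own row, so a new field needs no new column, but rebuilding a record means walking all of its rows.

```ruby
# Hypothetical generic storage: one row per (entity, field, value) triple.
ROWS = [
  { entity_id: 1, field: "name",     value: "Ann" },
  { entity_id: 1, field: "email",    value: "ann@example.com" },
  { entity_id: 2, field: "name",     value: "Bob" },
  # A user-defined field added at runtime; no table definition changed.
  { entity_id: 2, field: "nickname", value: "Bobby" }
].freeze

# Rebuild one hash per entity by visiting every attribute row.
def pivot(rows)
  rows.each_with_object({}) do |row, records|
    (records[row[:entity_id]] ||= {})[row[:field]] = row[:value]
  end
end

# Find the ids of entities whose given field has the given value.
def find_ids(rows, field, value)
  rows.select { |r| r[:field] == field && r[:value] == value }
      .map { |r| r[:entity_id] }
      .uniq
end

records = pivot(ROWS)
puts records[2].inspect
puts find_ids(ROWS, "name", "Ann").inspect
```

Every read has to reassemble records from many narrow rows, which is the extra work behind the performance cost mentioned above.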

If the DBA isn’t on the same page as the development team, then difficulties will arise. There isn’t a lot you can do if the results that are being returned from the query are not what you expect. For example, if I want columns 1 and 2 from table 1 and columns 3 and 4 from table 2, then I expect the query to return those results so I can put them in a data set, data table, etc. with minimal additional coding. If I have to cycle through the result set and process each row again to get the data I am looking for, then basically you’re doubling the amount of processing.

Also, if your stored procedure is written badly, it can also cause a performance problem. Using stored procedures in and of itself isn’t going to magically make the application faster.

Sometimes it all comes back to application design. If you have a complex query that joins many tables (multiple joins), it might be better to redesign or create a view that flattens out the data. Every query can be examined with tools (SQL Server - Query Analyzer, Oracle - Toad) to determine its execution plan. Execution plans show what the database will do to get the data. A good DBA should be able to analyze the execution plan and determine if the query can be optimized and rewritten to achieve better performance. On a large system, this is part of the ongoing maintenance done by the DBA and the maintenance developer.

In your situation, it sounds like there is a disconnect between the coder(s) and the DBA. My general advice is: write the query, analyze its execution plan with the DBA, and determine courses of action to make it better. In my experience this doesn’t happen too often, which means that maintenance on such systems becomes a headache.

It has been several years since this article was posted. Looking back on it now, one section is ironic for stackoverflow.com:

Your users don’t give a damn what framework and language you’re using. The only people who care about that stuff are other software developers. And God help you if your users are software developers; then you’re really in trouble.