A Google search for [sql server tables] returns 216 times as many hits as [mysql tables]. A search for [sql server database] returns mroe than 6 times as many hits as [mysql database]. I’m not sure that I would draw any pro-MySQL conclusions from the statistics you present.
You could try caching (if not already) to eliminate some of the database reads, assuming that it’s not dynamically changing data.
I’ve never ran into this sort of issue and I have some database heavy sites… apparently not heavy enough though.
As mph said retaled to the fact:
I think it’s incredibly telling that a Google search for SQL Server deadlocks returns nearly twice the results of a query for MySql deadlocks
you will need how many deadlocks entry per user you find at Google, not the overall.
Great post jeff and thanks for the explanation on the blogging frequency; I was becoming a little antsy about it. I understand now. I’ve seen this exact problem on production systems and I had no idea what was wrong, now I at least have a chance to fix it.
You’ve probably got a highly normalised database and a long running SELECT, which is backing up the UPDATE, which, in turn, is backing up the other random SELECTS.
By the time you get around to looking at the queries, the long running guy is finished, the UPDATE is more important and your subsequent SELECT gets killed.
It’s a common DB design problem.
Take a look at the query log and see if you can find the root cause SELECT.
Your drawing shows something like 6 philosophers and a guy that is maybe serving. There is not enough chopstick. I’m afraid but you might not be able to draw or choose good drawing anymore after Sql Server battle
I already read article where you where complaining about Sql Server. Do you think Mysql won’t be any better ?
And yes i saw that you had fewer time to blog.
Why not post it as a question on StackOverflow, Jeff?
A couple of days ago, I watched a talk on the University of Washington public access channel about some research into resource locking. This talk is specifically about resource locking in the context of multi-core processors, but it applies to the whole class of problems just as readily. The solution proposed by this professor (Dan Grossman) is to build the handling of resource locking into the compiler, the same way that garbage collect has been integrated into comilers over the last 15-20 years (do you remember the days before garbage collection? Blech!)
Jeff, I know this doesn’t help you guys now, but it is food for thought.
Summary from the UW website:
With multicore processors bringing parallel computing to the masses, there is an urgent need to make concurrent programming easier. Software transactions hold great promise for simplifying shared-memory concurrency, and they have received enormous attention from the research community in the last couple years. This talk will provide an overview of work done at the University of Washington to help bring transactions to the next generation of programming languages.
The program is available for download at http://www.uwtv.org/programs/displayevent.aspx?rID=22341fID=1471.
It’s always a little disturbing to see a well-known coder ask a dumb question, but come on, database locks? This is very basic stuff! RDBMSs, in particular, are intended to preserve data integrity in addition to enabling fast read/write access to large data collections. Otherwise file systems could be made sufficient.
In your case, since your innocuous SQL statement doesn’t grab the parent along with the child posts, you could easily have a difference between the answer count and the # of children actually returned. Of course it’s not a huge problem that you should spend sleepless nights on, but it will lead to a bug that is just shy of impossible to reproduce or debug.
Wasn’t the diner a drive system for some ship in one of the Hitchhiker’s Guide to the Galaxy books?
I can’t really think of a justification for the out-of-the-box behavior of SQL Server as you described. As far as I know Oracle always uses snapshots for reading, so a simple SELECT can never cause a deadlock.
Come to think of it, why are you using SQL Server at all, and not something free (beer or speech, take your pick)?
Nolock is probably a reasonable solution in this case, but I’d like to point out that a deadlock can only happen when multiple resources (usually tables) are involved in simultaneous transactions. If you’ve got a SELECT that operates on tables A and B and an UPDATE that operates on the same ones, then the SELECT will grab a lock on table A, the UPDATE will grab table B, and then both will sit there while they try to get a lock on the other table. Or if you’ve got a block of statements in a single transaction, you’ll get the same effect.
If your app is denormalized to the point where you only need a single table for most queries (as many simple web apps are), you shouldn’t see deadlocks. You may want to check out your transactions and make sure they’re not doing more than necessary within a transactional block.
ceejayoz: So bitch at Jeff, since I modeled my queries on his. If my sql server queries are contaminated with MySQL results, then so are his, and my point stands.
You could use CouchDB, which uses optimistic commits and never ever deadlocks.
The issue is probably going to come down to design. Is the Id/ParentId a GUID and does the table have a lot of rows? Is that select resemble the real select that is happening? Do you have indexes including a clustered? Are you sure the deadlock is caused by the select? Are you issuing the select and update in different connections?
The idea that MySQL has less deadlock issues then SQL Server based on a google search returning less results does not tell much. SQL Server is used for many more real applications then SQL Server is what it tells me.
If you haven’t found the exact reason why your deadlocks are occuring maybe you should try this method:
this way you don’t have to worry about running the profiler or anything else. when the deadlock happens you get the entire deadlock graph saved to a table and get a notification my mail if that’s what you wish.
This method has helped me trace a few bastardly deadlocks with great success.
hope it helps.
If your database allows you to set the isolation level as serializable, and supports MVCC, you can gain all of the safety you would need for a full banking application without getting any of the locking problems.
If the update that updates the count was still running, as long as you were careful to have the actual update of the comment in the same transaction, it would always be in sync, the comment just would not appear until the count had been updated.
A better question, however (Cobb would be proud) is why in the world are you storing the count of the replies on the post record??? The last activity and count are easily calculated using aggregate SQL functions.
There are, in effect, two issues.
- Your database either cannot support concurrent use well, or it is not configured correctly.
- You schema design looks to have some issues.
I’m no DBA, but this feels like a hack and hacks usually lead to headaches later on.
I’m a big fan of using with (nolock) hint by default.
That doesn’t mean that all queries should be written with nolock, but most of SQL queries should.
Especially for web app.
In some situations you want to be sure that your data is consistent no matter what. In this case you should use different locking hints. Which ones – would heavily depend on the situation.
You also need to clearly understand what exactly locking does.
The locking model is relatively simple, the problem is that most of locking tutorials are just terrible.
I wrote the tutorial about how to learn about SQL Server Locking: