From my experience SELECT’s almost never could cause deadlocks, no matter with or without nolock hint. You should carefully look at your UPDATE statements to see if they could cause locking.
Use Memcache to avoid doing db reads (often), which should drastically reduce the possibiliy of getting deadlocks, and you’re app will run much quicker.
I’m confused. NOLOCK is the default for select operations. I’d read this before and confirmed it now.
I get that, but I don’t understand why simple read SELECTS need to lock at all.
The classic example is the store clerk who uses the database to check his inventory to make sure he has a particular widget in stock before making a sale. He checks and sees one in stock, so he goes ahead and makes the sale. While he is entering his order details, another salesman finalizes an order for the last item. Now he can’t fill his order in time.
That explains the basic problem, but it doesn’t quite satisfy in this case. After all, the example implies an inventory transaction of some kind, and you aren’t specifically including any transacations.
An example closer to your particular app would be the sql required to display a particular question page. I’m not in the beta so I can’t see what it looks like, but from the description you must have some code to get a list of responses associated with a particular question for inclusion in the page. Not only that, but updating a question will always cause this code to run at nearly the same time as the update, so that the new page can be shown to the user.
Now let’s say part of the retrieval code joins in another table. If it joins anything at all, it needs to do some locking for both tables in the join to make sure the join is composed correctly. Otherwise something might change while the join is in progress that could change things. And if you kick off your update in a separate thread to avoid bogging down your display code, the chances of them needing the same data at the same time are actually pretty high.
I guess what I’m saying is that Starbucks Doesn’t Use Two-Phase Commit
…and Ebay doesn’t use transactions
OK, but why doesn’t SQL Server re-issue the transaction on my behalf, then? Automatic retries of some kind?
Because the transaction happens on the code, not on the database. For example:
x := select availability from inventory where item_id = XYZZY
y := select sum(quantity) from sale where item_id = XYZZY and sale_id = XYZZY
x := x-y
set availability := x
where item_id = Y
Ok, so you got through the first read and then deadlocked in the second. How can the database retry it for you? You already got the first read inside x. The database might recalculate that value, but it can’t tell your code to update x.
That’s why the code has to retry the transaction.
That’s also the why the read causes a lock. Suppose you read the availability, and then someone else tries to update it. If you hadn’t locked it in first place, how would you know, at the time you issue the update, whether anyone changed availability or not?
It becomes even more important if the database is not properly normalized, but as my example got big and confusing, I decided to leave it at that.
As it turns out, the database you are using can create deadlocks all by
itself, at any old time (well, there are reasons, but it’s complicated).
The only thing that can be done is to wrap each sql series into a
sequence that retries the problem – this is just a manifestation
of the class pc-losering issue.
It is likely that if it is really a read-only table, it won’t deadlock,
pending implementation details of the sql library. Throw an UPDATE
into the mix – you may well be screwed with a deadlock.
The implication is that ALL access (read, write, or update) needs to
wrapped with retry code.
So, go and wrap, be happy. Of course, if you run into a REAL deadlock,
things will get a bit ugly.
SQL Server is notoriously nasty about locking on reads. I run into these problems all of the time. NOLOCK can certainly help. Adding an index (or two) will usually resolve these issues. What often happens on a SELECT if it doesn’t get an index hit SQL Server will do a table scan. Table scans (in addition to being slow) may require a table lock. So you get one user thread accessing a table at a time. Not fun.
Sadly enough, admitting you aren’t a database expert draws the ire of DBAs everywhere. Be careful in admitting your own problems online, Jeff.
Hope this isn’t what we can expect from StackOverflow. I really love the idea of presenting a problem and watching it followed by a flurry of insults and criticisms.
Long, long ago, I think back in 2002, I had a similar problem with SQL Server. I was also bewildered that SQL Server, which prominently bragged about row-level locking, could have such common deadlock problems.
What I eventually discovered is that although the table data was row-level locked, the index data was page-level. Since I was usually inserting sequential key data and reads tended to be clustered on recent inserts, that meant that reads were frequently being locked out of the index due to nearby inserts updating the index. MS had no useful advice at the time on how to resolve this; all they could say was to minimize index access. I never could figure out how to reduce deadlock to an acceptable level while keeping the indices I needed to maintain query performance.
So I switched to Firebird and the problem went away.
It’s always a little disturbing to see a well-known coder ask a dumb question, but come on, database locks?
Wrong. Dead wrong and incredibly dangerous. And egotistical. If I were interviewing you I would immediately flip the bozo bit and thank you for your time.
It’s impossible to know everything. The hallmark of a good programmer is not what they know but their ability to learn what they don’t. If you do not promote an environment where any question can be asked, no matter how naive and trivial, problems become intractable because people are too afraid to ask for help. Just because someone has been in the industry for a while doesn’t mean they know everything.
It’s actually possible for two SELECT statements to deadlock in SQL Server (at least in SQL Server 2000), if an index on the table is involved. No writes required!
Deadlocks are an almost unavoidable fact of life in any large database application. If NOLOCK works for you, go with it, but you should understand when you can and can’t use it safely, and it sounds like you mostly do.
Are we all only web developers here?
Back in the day when we all wrote client apps, one common them (before disconnected recordsets) was that if you saw the data on your screen then it hadn’t changed.
i.e. You locked the data when you loaded that set of data into your view, and then released it when you clicked Next or Update or whatever.
NO LOCK is used in SQL Server for backward compatibility. To be really hip you want to put this at the front of your SELECT statement:
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED.
Put this on every SELECT you have that doesn’t do an action and you will speed up everything in your programs, applications, servers, and ssurrounding vicinity. Everytime anything is doing a SELECT it is locking tables.
And don’t do any SELECT INTO. That locks the TempDB. Build the table and then APPEND the rest.
Or one row SELECT INTO and then APPEND the rest.
How can a read be blocked by a write?
Well sure, either your update statement is still running, or the transaction within which the update statement had executed has not yet been committed or rolled back. By default SqlServer’s isolation level is read committed, so your query might read uncommitted data otherwise. Did you profile how long the Update takes to execute and whether its embedded inside a longer-running transaction?
But I can’t help thinking our particular database vendor just isn’t as optimistic as they perhaps should be.
Making a decision that is at odds with the rest of the technological world? This is Microsoft we’re talking about, right?
Also, a credit line for the theory vs practice quote that you hacked up a bit:
Sorry everyone. That link butchering is on server’s end, not mine.
Database deadlocks are not an error condition. At least they shouldn’t be, if the rest of your software is coded correctly.
The point here is that database deadlocks can and will happen all the time, they are just not all that likely if your transactions are very short as in a web scenario. But they will happen, regardless of what you do with READ UNCOMMITTED etc., at some point of time two users will try to change the same (or connected) thing at the same time, and there you have it.
If you do any database stuff, please organize your code so that conceptually linked queries and operations are contained in blocks that you can repeat on deadlocks. That is, when serving web requests, put your request handling in a method, and put that into a while loop that retries as long as you get deadlock exceptions. Safeguard against eternal loops by counting the number of retries and aborting if they exceed some N.
This of course requires you to be careful about any variables and other state that might leak from that retrieable method, but it’s the only way to do database operations correctly.
I’m a little disappointed that SQL Server treats our silly little web app like it’s a banking application.
Imagine how pissed you’d be if SQL Server treated your banking application like a silly little web app.