Protecting Your Cookies: HttpOnly

If your forum allows HTML, there’s no substitute for a real HTML parser.

OK - if you update the MSXML Core Services at http://www.microsoft.com/technet/security/bulletin/ms08-069.mspx, then IE 8 Beta 2 will prevent HttpOnly cookies from being read by XMLHttpRequest (via response headers) within IE. This is an obscure vector, but IE 8 Beta 2 is the only browser that truly stops Set-Cookie leakage in headers via JavaScript. However, to get really crazy: IE 8 Beta 2 with MS08-069 still leaks Set-Cookie2 HttpOnly cookies in XMLHttpRequest headers.

Firefox is on track to fix this obscure vector completely. The Firefox patch for XMLHttpRequest is marked RESOLVED FIXED and will go live shortly (https://bugzilla.mozilla.org/show_bug.cgi?id=380418).

Even Safari/Chrome will see complete protection against Set-Cookie/Set-Cookie2 exposure via XMLHttpRequest shortly (https://bugs.webkit.org/show_bug.cgi?id=10957) - the patch is complete as of 12/21/08.

One final, really obscure note: the OWASP WebGoat HttpOnly lab is broken and does not show IE 8 Beta 2 with MS08-069 as complete in terms of HttpOnly protection. However, Robert Hansen’s test page now includes Set-Cookie and Set-Cookie2 checks for XMLHttpRequest exposure and should be used until OWASP fixes http://code.google.com/p/webgoat/issues/detail?id=18

Or you can do something crazy like… Oh, I dunno, not trust the client for everything.

When I create a session on my (PHP) site, the client receives two cookies: ID and HASH. ID is the database ID of the session, and HASH is a random hash (md5(constant . time() . mt_rand(1, 9999999))) which is associated with the session and used when looking it up.

From there, I also generate a ‘security hash’ server-side with each request, which is basically: md5($_SERVER['REMOTE_ADDR'] . $_SERVER['HTTP_USER_AGENT'] . $_SERVER['HTTP_X_FORWARDED_FOR']). This hash is compared against the security hash stored at login. If it doesn’t match, the request isn’t associated with the session. On top of this, there is an inactivity timeout and an absolute timeout on each session (1 hour and 3 hours, respectively).
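
In code, that scheme looks roughly like this (my sketch of what’s described above; SESSION_SALT and the elided database calls are hypothetical stand-ins for whatever the real code does):

```php
<?php
// A rough sketch of the scheme described above; SESSION_SALT and the
// elided database calls are hypothetical stand-ins.
define('SESSION_SALT', 'some-secret-constant');

// The per-client fingerprint recomputed on every request.
function securityHash(): string
{
    return md5(
        ($_SERVER['REMOTE_ADDR'] ?? '')
        . ($_SERVER['HTTP_USER_AGENT'] ?? '')
        . ($_SERVER['HTTP_X_FORWARDED_FOR'] ?? '')
    );
}

// On login: create the session and hand the client its two cookies.
function createSession(int $sessionId): void
{
    $hash = md5(SESSION_SALT . time() . mt_rand(1, 9999999));
    // ... store $hash and securityHash() with the session row,
    //     along with created-at and last-activity timestamps ...
    setcookie('ID', (string) $sessionId, 0, '/');
    setcookie('HASH', $hash, 0, '/');
}

// On each request: valid only if the fingerprint matches and neither the
// 1 hr inactivity timeout nor the 3 hr absolute timeout has passed.
function sessionIsValid(string $storedHash, int $lastActivity, int $createdAt): bool
{
    return securityHash() === $storedHash
        && (time() - $lastActivity) < 3600     // inactivity timeout
        && (time() - $createdAt) < 3 * 3600;   // absolute timeout
}
```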

So, in order to hijack a session, you’d have to obtain the ID and HASH cookies as you described (or through some other method), manage to fool my web server into thinking you’re using the actual client’s IP, and realize what the final piece is and forge your user-agent to match the actual client’s. All within at MOST 3 hours.

CSRF, on the other hand, this provides no protection against :slight_smile:

Any chance you’ll update the refactormycode snippet to include the security fixes you put in as a result of this rude awakening?

Don’t forget those of us with dual load-balanced internet connections in your proposed solutions. Every other request comes from a different IP address (a set of two, in this case).

Well, yeah. That’s what happens when you think a sanitiser should try and clean the input. Another approach is to run a full HTML parser and construct a DOM tree from the document. Filter said DOM tree. Regenerate HTML from this tree.

Valid HTML will get through unscathed. Slightly incorrect but harmless HTML may even end up fixed. Bad HTML will either be filtered out of the DOM tree or so mangled that the XSS attack won’t work (trickery like the example just given will fall flat and probably turn into a bare img src=).
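
Here’s a minimal sketch of that parse/filter/regenerate approach using PHP’s DOMDocument; the allow-lists are hypothetical examples, not a complete policy:

```php
<?php
// Sketch: parse -> filter the DOM tree -> regenerate HTML.
// The allow-lists below are hypothetical examples. (PHP 5.4+ for the
// LIBXML_HTML_* constants.)
function sanitizeHtml(string $dirty): string
{
    $allowedTags  = ['div', 'p', 'a', 'b', 'i', 'em', 'strong', 'ul', 'ol', 'li'];
    $allowedAttrs = ['href'];

    $doc = new DOMDocument();
    // Suppress warnings on malformed input; the parser repairs what it can.
    @$doc->loadHTML('<div>' . $dirty . '</div>',
        LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

    $xpath = new DOMXPath($doc);
    // Snapshot the node list so removals don't disturb iteration.
    foreach (iterator_to_array($xpath->query('//*')) as $node) {
        // Fail safe: any element not on the allow-list is dropped entirely,
        // taking its contents (e.g. script bodies) with it.
        if (!in_array($node->nodeName, $allowedTags, true)) {
            $node->parentNode->removeChild($node);
            continue;
        }
        // Strip every attribute that isn't explicitly allowed
        // (onerror, onclick, style, and friends all disappear here).
        foreach (iterator_to_array($node->attributes) as $attr) {
            if (!in_array($attr->name, $allowedAttrs, true)) {
                $node->removeAttribute($attr->name);
            }
        }
    }
    return $doc->saveHTML();
}
```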

I see a bunch of misinformation and misunderstandings flying around here.

First, HTTP connections are over TCP, not UDP. That means the IP address of an HTTP connection cannot readily be spoofed against any system with good TCP sequence-number randomization, which in turn means that unless your server is running some absolutely ancient OS, it should not be spoofable. IP spoofing over UDP is easy; IP spoofing over TCP is hard. That means it would indeed be useful to encode the IP address as part of the session cookie.
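
If you wanted to act on that, one way to do it (my own illustration, not something from the post) is to MAC the session ID together with the client IP, so a cookie replayed from another address fails validation:

```php
<?php
// My own illustration: bind the session cookie to the client IP with an
// HMAC. $secret is a hypothetical server-side key.
function issueSessionCookie(string $sessionId, string $secret): string
{
    $mac = hash_hmac('sha256', $sessionId . '|' . $_SERVER['REMOTE_ADDR'], $secret);
    return $sessionId . '|' . $mac;
}

function readSessionCookie(string $cookie, string $secret): ?string
{
    $parts = explode('|', $cookie);
    if (count($parts) !== 2) {
        return null; // malformed cookie
    }
    [$sessionId, $mac] = $parts;
    $expected = hash_hmac('sha256', $sessionId . '|' . $_SERVER['REMOTE_ADDR'], $secret);
    // Constant-time compare; a different IP or any tampering fails here.
    return hash_equals($expected, $mac) ? $sessionId : null;
}
```

(As the load-balancing comment above points out, though, clients whose requests legitimately alternate between IPs would get logged out by this.)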

Second, to Jeff: Is your core intent here to build a working website, or to show the world what a macho programmer you are? If it’s the former, you damn well should be building on solid components, and you certainly should consider a well-tested input validator and sanitizer as one of them, if you can find a suitable one.

Third, the concept of input sanitizer is highly questionable, as a couple people said; this is a good example of why. Trying to helpfully clean up toxic input goes right along with trying to remove that virus from the application document before you pass it along to the user. Don’t sanitize bad input; reject it (preferably handling it with gloves and tongs in the process.)

Fourth, even input validation should be based on matching and accepting a limited (as in brain-dead-simple) subset of valid constructs, not on attempting to match and reject invalid constructs. Anything which doesn’t clearly match a limited set of valid values should be tossed (or in a posting context, fed back to the sender with an invitation to correct it.)
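
For instance, a brain-dead-simple accept-list check in PHP (the field name and pattern here are hypothetical examples):

```php
<?php
// Accept-list validation: match what IS valid, toss everything else.
// The field name and pattern are hypothetical examples.
function validateUsername(string $input): string
{
    // Accept only 3-20 letters, digits, or underscores; nothing else passes.
    if (!preg_match('/\A[A-Za-z0-9_]{3,20}\z/', $input)) {
        throw new InvalidArgumentException('Invalid username; please correct it.');
    }
    return $input;
}
```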

There is a lot of well-developed and hard-earned wisdom about how to write security-conscious software. It starts with learning about the topic, and then not rejecting basic principles because “I want to do it my way!”

You’re expecting a lot of people to trust you here; time to step up and live up to that trust.

So instead of actually fixing the problem (by, say, using a real HTML parser/sanitizer and getting rid of scripts), you’ve chosen to put on a second band-aid which doesn’t even work the way it’s supposed to half the time.

Well played. Don’t bother trying to cure the disease, just treat the symptoms.

This is proof positive of the importance of creating a good design at the very beginning. Not only is it hard to fix mistakes in the design later on, but developers and geeks in general are ridiculously stubborn and can’t bear the idea of having made a serious mistake; they’d rather just patch it up one way or another, until the patch fails and they have to make another patch, and so on and so forth. Not a good situation.

Kris, authentication cookies ARE encrypted. This isn’t an issue of privilege escalation by modifying a cookie, it’s a simple replay attack.

And with respect to another comment: I wouldn’t say that it’s technically a blacklist; it really is a whitelist, but the problem is that it doesn’t fail safe.

A strict parser fails safe. If it can’t parse a tag, it just fails on it and the cruft disappears from subsequent output. This uber-dumb sanitizer can choke on all kinds of invalid input and proceed to ignore it (i.e., leave it the way it is), but the browser, being liberal in what it accepts (as Jeff also loves to advocate), will happily try to fix it up and execute whatever badness is inside.

To believe that a few clunky regular expressions would be equally effective is pure geek conceit.

You really want something equivalent to Perl’s taint checking on input, but adapted to different classes of data.

This might be a good project to try out the idea Raganwald (Reg Braithwaite) was kicking around a while back in the context of strongly typed languages like Haskell/ML:

Create distinct derived types of strings for data which comes from various contexts and data which may be put into certain contexts. For example, you have UntrustedString and its derived classes UntrustedHeaderValue and UntrustedFormInput. You have a distinct type family of strings for stuff to store in the DB, DBSafeURIString, DBSafeNoHTMLString and DBSafeValidatedHTMLString, and another family for things which may be output back to the browser, for instance StringWithNoHTML, and URIEncodedURIString, and FormattedHTMLString.

You then make your validation functions return these very specific types, as appropriate for what they do, for instance accepting an UntrustedFormInput and returning a DBSafeNoHTMLString, and you let the compiler help you spot, for instance, that you are taking UntrustedFormInput and trying to store it directly as a DBSafeNoHTMLString, or are using a DBSafeValidatedHTMLString in a display function which expects a StringWithNoHTML.
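
A minimal PHP sketch of the idea, using the class names from the comment above (PHP enforces these types at call time rather than compile time, but the principle carries over; requires PHP 8.0+ for promoted constructor properties):

```php
<?php
// Context-tagged string types: untrusted input can only become a
// "safe" type by passing through a validator.

class UntrustedString
{
    public function __construct(public string $value) {}
}

class UntrustedFormInput extends UntrustedString {}

class DBSafeNoHTMLString
{
    public function __construct(public string $value) {}
}

// The only way to get a DBSafeNoHTMLString is through a validator.
function validateNoHTML(UntrustedFormInput $input): DBSafeNoHTMLString
{
    if ($input->value !== strip_tags($input->value)) {
        throw new InvalidArgumentException('Input contains HTML; rejected.');
    }
    return new DBSafeNoHTMLString($input->value);
}

// A sink that demands the safe type: passing raw UntrustedFormInput here
// is a type error, which is exactly the point.
function storeComment(DBSafeNoHTMLString $comment): void
{
    // ... write $comment->value to the database ...
}

$safe = validateNoHTML(new UntrustedFormInput($_POST['comment'] ?? ''));
storeComment($safe);
```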

Just saying “I’ll HTML-encode all inputs before I store them” doesn’t necessarily make anything safer; it’s all context-dependent. Maybe you HTML-encoded it but you needed to URI-encode it, or vice versa. Or maybe you just forgot. This doesn’t help with the specific problem here of simply failing to screen some of the cases you need to validate, but in theory it should help. (I’ve never tried it.)

Can’t people edit cookies no matter what? They’re all stored in a file somewhere on the computer, so people on Linux, for example, could edit this file through the terminal (assuming it’s not read-only for normal users) and easily edit the cookies.

Loved your post. I mean, I really loved it. I had no idea of this vulnerability, and I came across your post just as I was searching for why HttpOnly cookies are even used. Thumbs up, mate. I will surely blog about this too, and certainly give credit to you.

Great post. So as I understand it, if HttpOnly cookies can’t be accessed by browser scripting, then they’re in fact safer than any web storage mechanism (localStorage, sessionStorage, etc.). So let’s say I have an authentication token which I want sent with each request. Many resources suggest saving it in localStorage and setting up jQuery’s ajax requests to add the token as an Authorization: Bearer <token> header on each request. But that actually makes the token susceptible to XSS, whereas if it were stored in an HttpOnly cookie it would be safer. Is this correct?
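
For reference, here is roughly what storing such a token in an HttpOnly (and Secure) cookie looks like in PHP; the cookie name and lifetime are hypothetical, and the options-array form requires PHP 7.3+:

```php
<?php
// Sketch: keep the auth token in an HttpOnly cookie rather than
// localStorage, so page scripts can never read it.
// 'auth_token' and the lifetime are hypothetical examples.
$token = bin2hex(random_bytes(32));

setcookie('auth_token', $token, [
    'expires'  => time() + 3600, // 1 hour
    'path'     => '/',
    'secure'   => true,          // only sent over HTTPS
    'httponly' => true,          // invisible to document.cookie and XHR
    'samesite' => 'Strict',      // limits CSRF exposure as a bonus
]);
```

The browser then attaches the cookie to every matching request automatically, so the server sees the token without any jQuery header setup.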

But what’s the solution when user A can copy user B’s cookies and use them?

I have a web app which sets the current user’s email ID in a cookie. I use this in subsequent requests. The cookie is encrypted with a secret symmetric key.

B visits my app, and my app sets the cookie in his browser. A gets access to B’s computer and copies the cookie value. He then logs in via his own account and changes the cookie value to that of B. That’s all; now A can access B’s account on my app.

How do I solve this?

So this would really stop the regular script kiddie, but not an experienced hacker. Anything you do client-side is essentially unsafe. One could download the Chromium sources and compile a build that not only ignores the HttpOnly flag but also has all the look of Google’s Chrome.

Also, if someone can inject a script into your page, it’s basically because either (one) the page isn’t served over SSL, or (two) you accepted an invalid certificate, or your computer is hacked and has a fake certificate installed as trusted. And with a script injected, it’s as easy as capturing the input on the login screen.

Web sites should always use SSL, and clients should have a good certificate-store inspector that checks for changes and makes sure it’s you accepting them. Then you are secure, with or without HttpOnly.
In summary, to me the dangerous part here seems to be the injected script, not its being able to read the cookie.

Uhm, so how about AJAX requests? With technologies like React and Angular, web development is gradually becoming AJAX-ish. Working with HttpOnly cookies can be a real pain in the ass and a quick road to baldness, don’t you think?

Yes, restricting cookies from access via JavaScript is definitely a lot trickier today than it was in 2008!