YSlow: Yahoo's Problems Are Not Your Problems

“I refuse to take Yahoo’s recommendations seriously until they stop embedding a 43.5KiB stylesheet into their main page.”

Well, duh. If you’d actually bothered to read their recommendations, you would have realized why they do that.

One of the websites I tested (not mine!) had 109 HTTP requests.

Assuming I have a 50ms ping to the server (which is fairly average), that’s roughly 5.4 seconds added to the page load in latency alone.
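The back-of-the-envelope math, assuming the requests go out one after another and ignoring DNS lookups, TCP handshakes, and transfer time (real browsers fetch a few resources in parallel, so treat this as a worst case):

```python
# Worst-case latency math: treat every request as one full round trip,
# made serially, ignoring DNS, TCP handshakes, and transfer time.
requests = 109
round_trip = 0.050  # 50 ms ping

print(f"~{requests * round_trip:.2f} s of pure latency")  # ~5.45 s
```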

I believe one of the biggest wins with Expires headers is that a fair number of ISPs use invisible caching proxies on their networks. This reduces their bandwidth bills as well as improving response time for their customers. For the site it means you send out one request and a few thousand people grab it from someone else’s proxy.
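For anyone wanting to try this, a minimal sketch of far-future caching headers in Python (header values only; how you attach them depends on your web server or framework, and the one-year lifetime is just an example):

```python
# Sketch: far-future caching headers, so browsers and intermediate
# ISP proxies can reuse a response instead of re-fetching it.
import time
from email.utils import formatdate

ONE_YEAR = 365 * 24 * 60 * 60

headers = {
    # Honored by HTTP/1.1 caches (browsers and most proxies).
    "Cache-Control": f"public, max-age={ONE_YEAR}",
    # HTTP/1.0 fallback; must be an HTTP-date in GMT.
    "Expires": formatdate(time.time() + ONE_YEAR, usegmt=True),
}

for name, value in headers.items():
    print(f"{name}: {value}")
```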

I tend to agree with most of what you said. Interestingly enough, I manage to get an A (90) on my site somehow despite not using a CDN or Expires headers. (Though I will probably add Expires headers to certain content when I get the chance to research it more.)

I would just like to point out that you can change these weights. By typing about:config into the address bar and then filtering by the value ‘yslow’, you can see all of YSlow’s options. If you are only concerned with the weighted point values, filter by ‘yslow.points’.

Like you said, Yahoo’s problems are not your problems. Perhaps someone should set up alternative sets of weights, e.g. if you are a small blogger use these values; if you get x number of hits, use those.

And as for your comment about inefficient web pages, I do feel there are a large number of inefficient web pages out there, even on high-profile sites, especially corporate sites. Of course, given my experience of where I work, I can completely understand how it happens.

I refuse to take Yahoo’s recommendations seriously until they stop embedding a 43.5KiB stylesheet into their main page.

(Side note: This figure was checked on June 5, 2007)

"unless you like changing filenames every time the content changes."

The default way to output links for resources like CSS in Rails adds a parameter with the modification time automatically. If you modify the file, the link is generated with a different parameter and the browser retrieves the file again. This lets you have the far-future Expires header without having to change the filename on each modification.
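Outside Rails, the same idea is easy to hand-roll. A rough sketch in Python, assuming your static files live under a local public/ directory (the file name here is just an example):

```python
# Sketch of Rails-style asset timestamping: append the file's
# modification time as a query string, so a changed file gets a new
# URL (and a fresh download) while the filename itself never changes.
import os

PUBLIC_ROOT = "public"  # assumed location of static assets

def asset_url(path):
    mtime = int(os.path.getmtime(os.path.join(PUBLIC_ROOT, path.lstrip("/"))))
    return f"{path}?{mtime}"

# Renders something like: /stylesheets/site.css?1181001600
print(asset_url("/stylesheets/site.css"))
```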

Good post. Kinda reminds me of back in '99 when everyone was trying to create architectures to handle Amazon- and eBay-like loads.

Nice writeup.

You know what would be nice…

If the tool could provide an estimate of savings for a specific page, before one goes through the motions of making real changes.

For instance, if you implement gzip compression… your estimated savings is XYZ.

… or better… allowing people to add checks beyond the original 13 items… to customize the entire plugin.

Krgrds,
E. David Zotter

Once again, great post Jeff.

quoth apeinago: “what we need is a bittorrent’esque webbrowser.”

Yes. And then a supernode gets infected, and someone finds a way to splice information into the packets going to other people (just add the right padding to end up with the same checksum). Result: you can hack the site without hacking the site, and hit-and-run your botnet together, because if someone else checks, the perp will be gone; or the packet will come from another source, so it won’t show.

I’d rather trust a BT-esque approach for content that is large (so the overhead is relatively small) and isn’t expected to change every minute (unlike a well-visited news site).

Great to see the posts back on development instead of hardware!

A lot of this advice is very good, but I believe a lot of you are missing what the advice is actually telling you. Yahoo! gave a talk at the Web 2.0 Expo this year in which they presented their 14 rules and explained briefly why each rule is there and what it accomplishes.

http://nate.koechley.com/talks/2007/atmedia-london/high-performance-web-sites.pdf

I’d recommend reading this to see where the true benefits come from. Some suggestions, such as the CDN, aren’t needed by most people, but things like far-future Expires headers and avoiding redirects are VERY helpful and can easily be applied to any site.

Using CSS sprites was also discussed during the talk and is another GREAT optimization to cut back on requests to the web servers, which I guarantee you is where most of the user’s time is spent.

"“I refuse to take Yahoo’s recommendations seriously until they stop embedding a 43.5KiB stylesheet into their main page.”

Well, duh. If you’d actually bothered to read their recommendations, you would have realized why they do that."

Their reasoning is that merging them into the same file results in faster end-user response times. I found their reasoning flawed. It only applies in two cases:

  1. It’s the first time I visited their site.
  2. I have caching turned off, or the external stylesheet has expired from the cache.

For a repeat visit, my browser would cache that stylesheet, so the next time I visited, it would send a conditional GET (If-None-Match if ETags are being used, If-Modified-Since if they aren’t), get back a 304 Not Modified response, then load the cached copy.

Skipping that 43.5KiB is going to speed up page loading on all but the fastest of connections.
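You can watch the conditional GET happen with a few lines of Python (the host and path are placeholders; whether you get an ETag, a Last-Modified, or both depends on the server):

```python
# Sketch: a conditional GET. The first request grabs the validators
# (ETag / Last-Modified); the second sends them back and should get a
# 304 Not Modified with no body if the cached copy is still good.
import http.client

HOST, PATH = "www.example.com", "/style.css"  # placeholders

conn = http.client.HTTPConnection(HOST)
conn.request("GET", PATH)
first = conn.getresponse()
etag = first.getheader("ETag")
last_modified = first.getheader("Last-Modified")
first.read()  # drain the body so the connection can be reused

validators = {}
if etag:
    validators["If-None-Match"] = etag
if last_modified:
    validators["If-Modified-Since"] = last_modified

conn.request("GET", PATH, headers=validators)
second = conn.getresponse()
print(second.status)  # 304 if the cached copy is still valid
```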

Powerlord, Yahoo found that 40-60% of visits were NOT repeat visits based on their data.

http://yuiblog.com/blog/2007/01/04/performance-research-part-2/


40-60% of Yahoo!’s users have an empty cache experience and ~20% of all page views are done with an empty cache. To my knowledge, there’s no other research that shows this kind of information. And I don’t know about you, but these results came to us as a big surprise. It says that even if your assets are optimized for maximum caching, there are a significant number of users that will always have an empty cache. This goes back to the earlier point that reducing the number of HTTP requests has the biggest impact on reducing response time. The percentage of users with an empty cache for different web pages may vary, especially for pages with a high number of active (daily) users. However, we found in our study that regardless of usage patterns, the percentage of page views with an empty cache is always ~20%.

Doesn’t mean we should all do it, but food for thought.

Jeff,

Great article. As a webdev at Yahoo, I can speak from experience that these rules definitely can make a huge impact on our performance; also, depending on what site you happen to be working on, a rule might be more or less relevant. For example, if you are expecting 1-2 page views per session, then externalizing your CSS and JS might not be worth the extra HTTP requests, even to a CDN. If you get 10-20, then it’s a huge win, and you should absolutely lean on the browser cache heavily. Others, such as concatenating and minifying scripts and CSS, are almost always beneficial. You’re right to point out that there’s no substitute for a thinking developer.

As a huge web company, we have a lot of different kinds of sites, and they have different requirements. YSlow is not intended to be anything but a lint checker that summarizes our Exceptional Performance Team’s findings. And it is very useful in that regard. It was an internal tool back before it was a Firebug plugin, and I believe it was only recently shared with the public.

The tool doesn’t include a configuration screen, but if you enter about:config into the address bar, and then filter for “yslow”, you can adjust the weights that are assigned to each rule. Handy when you know that the tool is wrong about your particular situation.

"I agree that using a CDN is overkill for all but the largest websites, and shouldn’t really be on Yahoo’s YSlow…"

This speaks to the title of this post, but, on a high-volume website, not using a CDN for static content is a recipe for disaster. Serving static content from a good CDN can be much faster, since they’re optimized for speed and cacheability rather than hosting an application. When you multiply the number of requests by 3 (html, css, js), you can bring down a bank of servers in the first few million impressions. (Of course, I’ll probably have grandkids before my little podunk blog gets a million impressions, so what matters for Yahoo might not matter for you.) Since Yahoo routinely plugs its own content on The Most Trafficked Site On The Web, we have to build pages that can scale to support massive spikes in traffic. I’ve seen a 2% click-through rate from the home page cause servers to die and flop around with rigor mortis. It’s a MASSIVE firehose of traffic that we deal with.

In typical “open-source good guys who know where their bread is buttered” fashion, Yahoo is simply sharing what we do to optimize pages in situations where optimization counts. They want the internet to be faster. A faster internet means more people will use it more of the time, and that means more people using Yahoo.

Btw, 43.5 KB of CSS embedded in a GZipped HTML document is only about 14.5KB on the wire. It’s absolutely worthwhile for the homepage to embed this information in an inline style tag.
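That ratio is easy to sanity-check yourself; a rough measurement with Python’s gzip module (point it at any local stylesheet; exact numbers vary with the CSS, but roughly 3:1 is typical for text):

```python
# Rough check of how much gzip shrinks a stylesheet on the wire.
import gzip

with open("site.css", "rb") as f:  # any local CSS file
    raw = f.read()

gzipped = gzip.compress(raw)
print(f"raw: {len(raw) / 1024:.1f} KiB, gzipped: {len(gzipped) / 1024:.1f} KiB")
```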

The comment about maintainability only highlights the need for a good build process. Develop with many files, and then concatenate and minify them all as part of the build-and-deploy process.
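A minimal sketch of that build step in Python; the file names are made up, and the “minification” here is just naive comment/whitespace stripping, where a real build would use a proper minifier:

```python
# Sketch: develop with many small CSS files, then concatenate and
# (very naively) minify them into one file at build/deploy time.
import re

SOURCE_FILES = ["reset.css", "layout.css", "typography.css"]  # placeholders
OUTPUT_FILE = "site.min.css"

def naive_minify(css):
    css = re.sub(r"/\*.*?\*/", "", css, flags=re.S)  # strip comments
    css = re.sub(r"\s+", " ", css)                   # collapse whitespace
    return css.strip()

combined = "\n".join(open(name, encoding="utf-8").read() for name in SOURCE_FILES)

with open(OUTPUT_FILE, "w", encoding="utf-8") as out:
    out.write(naive_minify(combined))
```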

Hmmm… now I see why it looked like you hadn’t posted for several days. You’ve been messing around with caching, evidently. I figured that you were just on vacation or something. I took a chance and clicked my refresh button and found that you had actually been posting, but I hadn’t been getting the changes.

If you’ve been messing around with caching your page content, then you need to do some more testing. Some of us thought that you had given up posting or gone on vacation!

This is Steve Souders, YSlow author and Chief Performance Yahoo!.

Jeff - This article fills a real need. I just finished reviewing an article written by a member of the performance team on how these rules apply (or don’t apply) to smaller sites. But you’ve addressed a lot of that here.

Alexander Kirk and I emailed about this blog. He suggested that Rule 13 (ETags) is also not applicable to smaller sites, since they primarily run on one server. He thought Rule 3 (Expires header) was not applicable to smaller sites, because the cost of revving filenames could be a challenge to smaller sites. I think the benefit of browser caching is huge, and feel that the development burden isn’t bad and could be lessened. I show some PHP code in the book to make this easier, and hope that could be published some day.

Rule 2 (Use a CDN) is hard for smaller sites to adopt. I recommend several free CDN services, but don’t have any data on how good/bad they are. Feedback there would be great, such as the info on CoralCDN above. I knew this rule would cause smaller sites to lose points, so I added the config option to add your own CDN hostnames, basically disabling the rule (http://developer.yahoo.com/yslow/faq.html#faq_cdn).

Powerlord - One cool technique described in the book is “dynamic inlining”. The first time a user arrives on your page the server inlines the CSS. In the onload event the page downloads the external .css file and sets a cookie. The next time the user goes to the page, the server sees the cookie and inserts a LINK tag to the external file, instead of inlining the CSS. This works for JS, too. This is the best of both worlds - a faster page load on the first (empty cache) page view, and a smaller, faster HTML document on subsequent (primed cache) page views. The cookie doesn’t reflect the state of the cache 100% accurately, but it’s pretty close and can be tightened/loosened by tweaking the expiration date of the cookie. I thought FP was doing this. I’ll go back and ask them.
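A rough sketch of that decision (not the book’s code; the cookie name, file name, and one-week cookie lifetime are made up):

```python
# Sketch of "dynamic inlining": no cookie means probably an empty cache,
# so inline the CSS and let an onload handler fetch the external file
# and set a cookie; a cookie means probably a primed cache, so emit a
# plain <link> and keep the HTML small.
INLINE_CSS = open("site.css", encoding="utf-8").read()  # placeholder file

ONLOAD_SNIPPET = """
<script>
window.onload = function () {
  var link = document.createElement('link');
  link.rel = 'stylesheet';
  link.href = '/site.css';
  document.getElementsByTagName('head')[0].appendChild(link);
  document.cookie = 'css_cached=1; max-age=604800; path=/';
};
</script>
"""

def css_markup(cookies):
    if cookies.get("css_cached"):
        # Primed cache (probably): small HTML, external stylesheet.
        return '<link rel="stylesheet" href="/site.css">'
    # Empty cache (probably): inline now, prime the cache after onload.
    return f"<style>{INLINE_CSS}</style>{ONLOAD_SNIPPET}"

print(css_markup({}))                    # first visit: inlined CSS + onload script
print(css_markup({"css_cached": "1"}))   # repeat visit: LINK tag
```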

E. David Zotter - All great ideas. I’m putting them on the list!

I’d prefer if people not alter the YSlow grading system using the about:config settings (other than the one for CDN hosts). This will create a situation, even within the same company, of YSlow grades being apples and oranges. We’ll work on ideas for making YSlow more applicable across different types of web sites.

I agree that the web is full of terribly inefficient web pages. I’m working on a paper that translates these performance improvements to power savings; think of the number of MW that would be saved if everyone used a future Expires header and avoided those unnecessary 304 Not Modified validation requests. Each web site has to weigh the costs/benefits before deciding to address these rules, but for the most part the fixes are fairly easy and the benefits are noticeable. As all of us improve our development practices, all our users reap the benefits. I appreciate the huge amount of discussion around this topic and look forward to a faster and more efficient Web.

Hi Steve-- thanks for your comments; it’s always great to have the source of the article stop by! YSlow is a fantastic tool, and it reflects very well on Yahoo! to release something so helpful to the community. Anything that gets this many people talking about ways to improve web performance is a net public good.

However, as noted by Matt’s comment directly above yours, caching is something that you have to be very careful with. I accidentally turned on the Expires/Cache-Control header for ALL my content for about an hour (whoops!) before I realized what I had done. Thus, everyone who was unfortunate enough to visit in that window of time won’t see any changes on the homepage until the cache expires, 7 days from now.

Totally my fault, of course, but I do think this is exactly why IIS defaults to never setting an Expires/Cache-Control field on any content.

Re: The ETags rule and how it’s applicable to smaller sites.

I believe the rule applies to smaller sites. Although they are served from one server, they often move.

A smaller site is usually hosted on a low-cost shared server. I can tell you from bad experience that you’re often wrong about your choice of hosting provider, either tricked by a “review” from the provider’s affiliate or by the promise of unlimited something (bandwidth, space). So after a while, maybe not even a full year, you move your site to a different provider, meaning a different server, meaning different ETags. And even if you stick with the same provider, sometimes they decide to move you to a new server because of internal restructuring or some other reason.

Given how extremely simple it is to configure ETags, why skip this rule?
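The fix can be as simple as turning ETags off, or deriving them from the file’s content instead of server-specific details (older Apache defaults mixed in the inode, which changes whenever the file moves to a new box). A sketch of a content-based ETag in Python, as one possible approach:

```python
# Sketch: a content-based ETag, so the same file yields the same ETag
# no matter which physical server (or hosting provider) serves it.
import hashlib

def content_etag(path):
    with open(path, "rb") as f:
        digest = hashlib.md5(f.read()).hexdigest()
    return f'"{digest}"'  # ETag values are quoted strings

# Compare against the request's If-None-Match header and answer
# 304 Not Modified on a match.
print(content_etag("site.css"))  # placeholder file name
```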

I’ve decided to tackle the speed problem in a “social” way (oh no, another web 2.0 hyper here!).

Here’s my reasoning:

Many websites use the same JS libraries (say Prototype, jQuery…).

But when every website stores its own copy on its own server, there’s a huge loss of caching benefit. Furthermore, some website owners are unaware of how, or unable, to tweak their server for full performance.

Imagine a JS distribution site: all versions of popular JS libraries and scripts, remotely hosted on a fast, well-tuned server (a JS-specific CDN, you could say).

This could be a win-win situation:

  1. Regular surfers will just surf faster. Not noticeable at first, but as a mass of websites use the aforementioned service, the chance of already having a cached copy of a JS library will rise significantly.

  2. Website owners will save money on bandwidth costs by serving JS files from a remote, fast, optimized server. Their website’s usability will get better due to the speed improvement.

(*this does not apply to YUI of course.)

Of course, there are drawbacks: control, reliability, and security.
I’m doing everything I can to make the service as reliable as possible, including backup servers on different hosts, etc.

Well, there it is. The service should be launched in less than a week. I believe ads and sponsoring will cover the hosting costs, and everybody should be happy.

Click on the name to be directed to a sign-up page so you can be notified when we launch (or to flame me for self-promotion on a respectable blog).