Got Infinite Scalability?

I was perusing the usual RSS feeds today when I ran across a link to a New Yorker article written by Malcolm Gladwell. The article is apparently a pretty lashing review of a book titled Free by an author named Chris Anderson. Since I'm a pretty big fan of Gladwell's work I thought I would head on over and take a gander. Only one problem: the website has been timing out for the past 30 minutes. That's right, the New Yorker is likely getting more traffic right now than it received all of last month, yet its servers are seemingly balled up in the corner weeping and asking it to stop. Considering that the article is currently linked to by Slashdot, Techmeme, and Silicon Valley Insider (and probably a ton of literature-focused blogs) I can't say I'm surprised that they can't handle the traffic; that's a flash crowd and a half.

It's situations like these where having the ability to scale out your website is quite critical. It's not like they didn't know it was coming...they must have seen a decent uptick in traffic when the Silicon Valley Insider article and Techmeme link came out yesterday morning. I'd be very surprised if there wasn't some indication that things needed to be ramped up.

The cynic would note that flash crowds from Slashdot are notoriously difficult to deal with and that one gets almost no warning. While this is true, the Slashdot article came out over six hours ago. By now they could have gone down to the local Fry's, bought some cheap servers, formatted the hard drives, and had their entire stack up and running if they wanted to. From the looks of a quick whois on their domain it seems like they are hosting their own servers, whoops.

Since it is apparently my new favorite website, I ran a Pagetest on their site (sorry, servers) and the results go to show that what I was talking about in my last post is pretty damn important. If you take a look at those results you will see a waterfall so big you'll want to swim in it. After 11 seconds and an impressive 123 requests, the website should be nice and loaded for everyone to enjoy. Alright, first-load performance is hard, but at least they are using expire headers, right? Oops. Yep, that's right, the second page load takes a whopping 116 requests, and just about all of them are 304s (Not Modified). The fact that it's not actually fetching the data makes the second load a little faster (~7 seconds), but it's still pretty slow, and when you are getting tons of traffic, having to serve over 100 requests per page load makes a server quit on you much faster than it should. It should be clear that they aren't using multiple asset hosts or anything fancy like that, but what is the kicker of the whole thing? Not only are they loading a zillion objects, but take a guess where their server is located. Dallas? San Fran? Virginia? Nope, how about the Netherlands. And there is my cue to stop trying to explain what on earth they are thinking...
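If you're curious how a page racks up a request count like that, here is a rough Python sketch (the URL is just a placeholder, not the New Yorker's) that counts the images, scripts, and stylesheets referenced in a page's HTML. It misses things like CSS background images and anything added by JavaScript, so treat it as a ballpark for what a cold load will cost, not an exact number.

```python
# Rough sketch: count the sub-resources a page references as a first-pass
# proxy for how many requests a browser will make on a cold load.
# The URL below is a placeholder, not a real target.
from html.parser import HTMLParser
from urllib.request import urlopen


class ResourceCounter(HTMLParser):
    def __init__(self):
        super().__init__()
        self.resources = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # Images and scripts load via "src"; stylesheets via <link href>.
        if tag in ("img", "script") and attrs.get("src"):
            self.resources.append(attrs["src"])
        elif tag == "link" and attrs.get("rel") == "stylesheet" and attrs.get("href"):
            self.resources.append(attrs["href"])


html = urlopen("http://www.example.com/").read().decode("utf-8", errors="replace")
counter = ResourceCounter()
counter.feed(html)
print(f"{len(counter.resources)} sub-resources referenced")
```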

Optimizing Blog Performance For Fun and Profit

At some point over the last year I have developed into a speed freak. No, not the kind who hangs out in back alleys doing questionable things, the kind that wants to make websites fast. Just in the past year or so I have picked up a ton of relatively simple techniques for 1) figuring out why a website is slow and 2) improving its performance. I was taking a look at my blog this evening, and while I did not feel like it was particularly slow, I knew that there were a lot of things which could be improved, so I thought it would be a good exercise. Since it is a relatively simple and closed experiment, it seemed worthwhile to document here so that others can use it as a guide, so enjoy :)

Let's start by determining why my blog is slow so that we can focus on what could yield the most improvement. If you have not heard of or read Yahoo!'s "Best Practices For Optimizing Websites", now would be the opportune time to do that. That document has become a Bible of sorts for website optimization, and it does a very good job of thoroughly covering all the different aspects. One thing to note is that some of the practices simply don't apply or are not an option in certain contexts (e.g. use a CDN); don't worry, that's fine. It's definitely not an all-or-nothing system, so do as many as are feasible.

Now that we've read all of these best practices, we want to see where our site does not follow them. For the rookie optimizer, you may be surprised to learn that there are actually quite a few utilities available for studying website performance. Not surprisingly, the most popular is a Firefox plugin named YSlow, which was developed by the folks at Yahoo!. Not to be outdone, Google has created its own Firefox plugin called Page Speed; while the interface and rules are slightly different, it is similar to YSlow in more ways than not. Both of these plugins are great for quick checks, but what you will often notice is that performance can vary greatly from refresh to refresh, which can be quite frustrating when you want to find out if change X actually helped or hurt your website. As a result, I use the terribly-named Pagetest; it follows the same rules as YSlow but provides additional useful functionality like running trials multiple times, visiting multiple URLs, and my favorite, the waterfall chart.

So I went ahead and put my website on Pagetest, had it do three runs, and then waited for the results. Instead of putting images of the test results on my site, here is a link to them so you can explore along with me; I'll be discussing run #2 since it's the closest to the average. To start, let's look at the waterfall view. On top it provides a nice little summary of the most important information, so from there we see that the load time is 4.5 seconds. Just for a point of reference, we check Alexa to see that Google has an average load time of 0.7 seconds and Yahoo!'s average is 2.6. While my site is quite a bit slower, that is somewhat expected when you consider the amount of time and manpower both companies are able to dedicate to optimizing even the smallest of things.

In looking at the waterfall view there are a few important things to point out: 28 requests are made, most of which are images from cs.ucsb.edu (which is where I was hosting the images for my blog). Since this was the first time the page was being loaded, all of the images/css/scripts needed to be fetched, so this is in effect paying the full penalty since you have to get everything. Ideally, since most of that content is static, we could utilize the browser cache so that those objects don't need to be fetched again.

We can then take a look at the Repeat View, which is the second time the page is loaded; if caching is properly utilized we should see that a lot of the static content is not being reloaded since it is in the browser's cache. In the repeat view we notice that the load time has gone down to 2.4 seconds, which is a nice improvement, but that there are yellow lines on most of the requests to cs.ucsb.edu for images. Below in the request details it shows that all of those requests are returning a 304 status, which is basically the web server saying that the version we already have is up to date, so just use it. While that is definitely faster than before, it is important to notice that over half of the load time is spent making those requests. If you head over to the Performance Overview you notice that all of those images have little x's marked under Cache Static. What this means is that the server is not setting expire headers on those images. Without an expire header the browser can't tell if the images are still valid when reloading the page, so it still makes those requests just to check if they are up to date, only to get 304'd.
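To make that 304 round trip concrete, here is a small Python sketch of what the browser is doing behind the scenes on the repeat view; the host and path are placeholders, but the two requests mirror the first load and the revalidation.

```python
# A minimal sketch of the conditional-request dance described above: fetch an
# object once, then re-request it with If-Modified-Since and (usually) get a
# 304 back -- a full round trip spent confirming we already had the right bytes.
# Host and path are placeholders.
import http.client

HOST, PATH = "www.example.com", "/images/header.png"

conn = http.client.HTTPConnection(HOST)
conn.request("GET", PATH)
first = conn.getresponse()
first.read()  # drain the body so the connection can be reused
last_modified = first.getheader("Last-Modified")
print("first load:", first.status, "Last-Modified:", last_modified)

if last_modified:  # only revalidate if the server gave us a validator
    conn.request("GET", PATH, headers={"If-Modified-Since": last_modified})
    second = conn.getresponse()
    second.read()
    print("repeat load:", second.status)  # 304 Not Modified if unchanged
conn.close()
```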

Another observation is that even though there are a lot of requests being made, they are not all happening in parallel. Ideally, once the CSS was fetched the browser would know that there are 10 more images to get and start downloading them in parallel. Unfortunately that doesn't happen. The reason is that most browsers are configured to only allow two parallel connections to a given hostname, so even after the CSS has been fetched and the browser knows it needs to fetch 10 more images from cs.ucsb.edu, it can only grab two at a time. That is a bummer since the images have already been optimized (using smush.it) so they can't get much smaller. There are solutions for this problem which we will discuss below.
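Here is a toy Python sketch of that per-hostname cap (the URLs are made up): a downloader that allows at most two in-flight requests per host, roughly what browsers of that era did, so ten images on a single host end up queuing behind two slots no matter how many worker threads you throw at them.

```python
# Toy illustration of the two-connections-per-hostname limit. Every URL lives
# on one host, so even with ten workers only two downloads run at a time.
# The URLs are placeholders.
import threading
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlparse
from urllib.request import urlopen

MAX_PER_HOST = 2
_host_slots = {}  # hostname -> Semaphore capping parallel connections


def fetch(url):
    host = urlparse(url).hostname
    slot = _host_slots.setdefault(host, threading.Semaphore(MAX_PER_HOST))
    with slot:  # wait for one of this host's two slots to free up
        return url, len(urlopen(url).read())


urls = [f"http://www.example.com/images/{i}.png" for i in range(10)]

with ThreadPoolExecutor(max_workers=10) as pool:
    for url, size in pool.map(fetch, urls):
        print(f"{size:>8} bytes  {url}")
```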

Now that we know all of this, it is time to move to phase two, which is actually improving the performance. Since I have no control over the server which was hosting the images, I could not configure it to use expire headers, so a different solution was needed. While I could host the images on a CDN, those are generally pricey and not something I'm interested in paying for on a relatively low-traffic blog. Instead, I chose to use Picasa, Google's image hosting service. A few of the reasons for choosing Picasa: 1) it's free, 2) it's fast, 3) I could just log in with my Google account instead of needing to create another account somewhere (e.g. Flickr). I grabbed all of the images my site used and uploaded them to Picasa in a snap. After that I just had to switch all of the image references from the old URLs to the new Picasa URLs. After doing that I reran the performance test and here are the results.

Right off the bat we see the load time has gone down to 3.35 seconds on the first load, which is over a one-second improvement from the original 4.5-second load time. It's important to note that this is not a result of expire headers, since those only come into play on the second page load; something else is happening here. Unlike in the previous case where images were being downloaded from cs.ucsb.edu two at a time, we now see that many more images are being downloaded in parallel, as many as 10 at once. If you read Yahoo!'s Best Practices site you will remember the "split components across domains" recommendation, which is what is improving the performance here. Previously we found that only two connections can be made to a hostname at any given time, but it's important to notice that host1.website.com is a different hostname than host2.website.com. That fact can be exploited to allow more than two parallel downloads to occur for a given site. Taking a look at the waterfall chart we see that the images are now spread across lh3.ggpht.com, lh4.ggpht.com, and lh5.ggpht.com (the hostnames of different Picasa servers). Since the images are spread across these servers, the browser can make two connections to each of the three hostnames (six total) at any point in time. A relatively simple and harmless change, but it shaves a whole second off the page load time.
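If you host your own static content and want the same effect without Picasa, one way to do the splitting is to hash each image path onto one of a few shard hostnames; the hostnames below are invented for illustration, and the hashing just keeps a given image pinned to the same host so the browser cache still works.

```python
# Sketch of the "split components across domains" trick: map each image path
# to one of a few shard hostnames. The shard hosts are made up; in practice
# they would all point at the same server or bucket.
import zlib

SHARD_HOSTS = ["img1.example.com", "img2.example.com", "img3.example.com"]


def shard_url(path):
    # crc32 is stable across runs (unlike Python's hash()), so a given image
    # always ends up on the same host and stays cacheable.
    index = zlib.crc32(path.encode("utf-8")) % len(SHARD_HOSTS)
    return f"http://{SHARD_HOSTS[index]}{path}"


for p in ["/images/header.png", "/images/footer.png", "/images/sidebar.jpg"]:
    print(shard_url(p))
```

The browser sees three different hostnames and happily opens two connections to each, which is exactly the parallelism the Picasa servers gave me for free.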

Now we can shift over to the results of the Repeat View to see how well the page performs with a hot cache. The summary shows that the load time dropped to 1.46 seconds, down from 2.4 seconds previously. This is where the expire headers kick in. Looking at the waterfall chart we see that only 4 requests are made, because the rest of the objects, in particular the images, have been cached. Since the expire header is set, there is no longer any need to make the image requests (two at a time) only to get a 304 back from the server, which wastes both time and resources. Instead those images are taken from the browser cache, which is much faster, and as a result the load time is cut down quite a bit.
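For completeness, here is roughly what the server side of that looks like; my images live on Picasa so I didn't have to touch a server, but if you run your own static host, a far-future Expires/Cache-Control header is all it takes. Below is a tiny Python static file server (purely illustrative, not my actual setup) that stamps those headers on everything it serves.

```python
# Minimal sketch of a static file server that sets far-future caching headers,
# so repeat views pull objects straight from the browser cache instead of
# revalidating and getting 304s. Illustrative only.
import time
from email.utils import formatdate
from http.server import HTTPServer, SimpleHTTPRequestHandler

ONE_YEAR = 365 * 24 * 60 * 60


class CachingHandler(SimpleHTTPRequestHandler):
    def end_headers(self):
        # Tell the browser this object is good for a year; no need to ask again.
        self.send_header("Cache-Control", f"public, max-age={ONE_YEAR}")
        self.send_header("Expires", formatdate(time.time() + ONE_YEAR, usegmt=True))
        super().end_headers()


if __name__ == "__main__":
    HTTPServer(("", 8000), CachingHandler).serve_forever()
```

The usual caveat applies: once you hand out a year-long expiry, you need to rename a file (or version its URL) whenever its contents change, otherwise visitors will keep the stale copy.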

While the above description was quite long, it came down to two simple changes that cut page load time by roughly 25% and 40% in the first- and second-load cases. I didn't have to shell out a bunch of money on a CDN or pay some expensive consultant to do it for me; it was all free and didn't take much time at all. While these two optimizations are by no means the only ones available, they are usually easy to implement and can give you big gains. I could continue to improve things via more advanced or cumbersome changes, but there are likely to be diminishing returns in terms of performance, maintainability, or cost, which makes me less inclined to do so.

What's the moral of the story? There are easy ways to make your website faster, so do it and make your users/viewers happier (one second at a time).


Browser Knowledge

Just when I start to think that our culture has finally become as tech-obsessed as I am, I get a reminder that it hasn't, and I can breathe a little easier. What the hell am I talking about? This interesting little video put together by a Google intern asking people what a "browser" is.

Alright, so clearly some of them are confusing a web browser (e.g. IE/Firefox) with a file browser (e.g. Explorer on Windows, Nautilus on Linux), an admirable mistake. The fact that on Windows there is Explorer and Internet Explorer, both of which do very different things, doesn't help the cause very much either. As for why so many people think that Google/Yahoo is their browser, I don't really know what to make of that. I guess since that is the first thing they see when they open their browser, they assume that is what their browser is, as opposed to the icon/name that they clicked on to open it... maybe... who knows.

Perhaps my favorite part of the video is the guy who proudly says he uses Firefox, and then, when asked why, admits that a friend removed (probably from the desktop) all of his other search engines and told him to use Firefox. What a good friend.