The need for speed

Posted: May 9, 2012

For years, many developers paid little or no attention to page load times and optimization. Network speeds kept increasing, and sites became more complex, which meant more code was loaded with every page. This trend finally started to change a couple of years ago. One reason is the rise of the mobile internet, which is still not as fast as cable connections and comes with a different pricing model (e.g. data plans that charge for transmitted data). It suddenly became important to cut unnecessary, bloated code and become more efficient. Moreover, speed is now seen as an important usability factor and has implications for conversions and, to some extent, for Search Engine Optimization.

Reason enough to find out how UBC’s websites are performing. There are plenty of tools out there to check performance:

  • Which loads faster: a site that directly compares two websites in terms of page load speed
  • YSlow: a tool providing best-practice tips from Yahoo
  • PageSpeed: a similar tool from Google that also offers an API and an Apache module
  • ShowSlow: an open-source tool that monitors several performance metrics over time
  • WebPageTest: a tool that runs detailed page load tests
  • Speed of the Web

With all these tools at hand, I just needed a full list of UBC websites. This turned out to be more difficult than I thought. Without a ready source for this data, I had to find another solution. My first approach was to build my own crawler, which I abandoned in favour of an existing tool: Xenu Link Sleuth. After a few hours of random crawling, I had collected more than 600 subdomains in the UBC domain space. For now, this will have to do as a sample…
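For what it's worth, here is a minimal sketch of the kind of crawler I had in mind before handing the job over to Xenu (this is not the code I actually wrote): it simply walks links starting from www.ubc.ca and records every *.ubc.ca hostname it encounters. The start URL, page limit and regex-based link extraction are all arbitrary choices for illustration.

    import re
    from collections import deque
    from urllib.parse import urljoin, urlparse
    from urllib.request import urlopen

    HREF_RE = re.compile(r'href=["\'](.*?)["\']', re.IGNORECASE)

    def collect_subdomains(start="https://www.ubc.ca/", max_pages=200):
        seen, subdomains = set(), set()
        queue = deque([start])
        while queue and len(seen) < max_pages:
            url = queue.popleft()
            if url in seen:
                continue
            seen.add(url)
            try:
                html = urlopen(url, timeout=10).read().decode("utf-8", "ignore")
            except Exception:
                continue  # skip pages that fail to load or decode
            for href in HREF_RE.findall(html):
                absolute = urljoin(url, href)
                host = urlparse(absolute).hostname or ""
                if host == "ubc.ca" or host.endswith(".ubc.ca"):
                    subdomains.add(host)   # remember the subdomain
                    queue.append(absolute) # and keep crawling from it
        return sorted(subdomains)

    for host in collect_subdomains():
        print(host)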

The next step was to run a couple of tests. I first planned to use Google Apps Script, based on this article, but that approach was very slow and did not meet my needs. I then experimented with running my own ShowSlow instance, which did not work either due to a lack of customization options. So I went back to the actual APIs for PageSpeed and WebPageTest and quickly coded the tests myself. This gives me two metrics: one based on best-practice performance tips and one on actual page speed.
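The script itself boils down to two small functions, one per API. The sketch below uses the endpoints and parameter names as they are documented today (PageSpeed Insights v5 and WebPageTest's runtest.php), which are not identical to the 2012-era APIs I used; the API keys and the example URL are placeholders.

    import json
    from urllib.parse import urlencode
    from urllib.request import urlopen

    PSI_KEY = "YOUR_GOOGLE_API_KEY"   # placeholder
    WPT_KEY = "YOUR_WEBPAGETEST_KEY"  # placeholder

    def pagespeed_score(url):
        """Best-practice score (0-100) from the PageSpeed Insights API."""
        query = urlencode({"url": url, "key": PSI_KEY, "strategy": "desktop"})
        endpoint = "https://www.googleapis.com/pagespeedonline/v5/runPagespeed?" + query
        result = json.load(urlopen(endpoint))
        return round(result["lighthouseResult"]["categories"]["performance"]["score"] * 100)

    def submit_webpagetest(url):
        """Queue a WebPageTest run and return the URL of its JSON result."""
        query = urlencode({"url": url, "k": WPT_KEY, "f": "json"})
        response = json.load(urlopen("https://www.webpagetest.org/runtest.php?" + query))
        return response["data"]["jsonUrl"]  # poll this until the run has finished

    site = "https://www.cs.ubc.ca/"
    print("PageSpeed score:", pagespeed_score(site))
    print("WebPageTest result:", submit_webpagetest(site))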

This got me thinking: if I can measure performance, the next logical step would be to rank sites by it. This is where it got tricky. The initial results from my script showed some interesting insights. The best performers according to PageSpeed did not necessarily load very fast. That sounds counter-intuitive at first, but it is easily explained. Performance depends on the whole software stack and can be optimized on multiple levels, e.g. database query caching, APC or compression at the server level, clean code at the code level, and so on. Tools such as PageSpeed can only measure a subset of these factors. It is therefore entirely possible to have a well-optimized code structure that still does not deliver great results because the rest of the stack is poorly tuned. Moreover, I realized that PageSpeed results vary significantly. The rules that get checked differ a lot between versions, and the results are influenced by whether you use the browser extension or the online tool, and by which browser you use. For example, the online PageSpeed result for the Computer Science website reports a score of 90, but my browser plugin reports 70. For IRES, the online version reported a score of 7, but the browser version reported 52. According to Google, this can happen if the online version does not get all the data it needs and has to skip certain rules, or if different content is served to Googlebot (which would be a really bad idea) than to a browser. However, I even noticed differences between the JSON result set and the HTML results page of the online version, which I simply cannot explain.
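One way to make sense of such discrepancies is to look past the headline score at the individual rules (now called audits) and see which ones were actually scored and which were skipped. The sketch below assumes the current v5 response layout (lighthouseResult.audits); the 2012-era JSON was structured differently, and the key and URL are again placeholders.

    import json
    from urllib.parse import urlencode
    from urllib.request import urlopen

    def audit_breakdown(url, key="YOUR_GOOGLE_API_KEY"):
        query = urlencode({"url": url, "key": key, "strategy": "desktop"})
        endpoint = "https://www.googleapis.com/pagespeedonline/v5/runPagespeed?" + query
        audits = json.load(urlopen(endpoint))["lighthouseResult"]["audits"]
        for audit_id, audit in sorted(audits.items()):
            if audit.get("score") is None:
                status = "not scored (skipped or informational)"
            else:
                status = "score {:.2f}".format(audit["score"])
            print("{:45s} {}".format(audit_id, status))

    audit_breakdown("https://www.cs.ubc.ca/")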

If PageSpeed won’t work for ranking pages, how about comparing actual load times (based on WebPageTest) and ranking sites accordingly? Well, page load times depend heavily on the complexity and purpose of a site. Would it make sense to compare a page such as Google’s search interface with an image gallery page? Not really. To come up with a meaningful comparison, we would need some way to measure the complexity of a site. A first thought might be to use the amount of data transferred and produce a ratio of load time per 10 kB or similar. But even this has problems, because load time is not a linear function of data size. Other factors, such as the number of HTTP requests, play an important role: it takes longer to load 100 images of 1 kB each than 1 image of 100 kB. Ideally, developers and designers should use sprites and minify and aggregate their CSS and JS files anyway; hypothetically, one could argue that each site only needs one CSS file, one JS file and one sprited image file (neglecting any external libraries loaded from sites such as Google or Twitter). And even if we were able to come up with such a metric, it would still depend on the users and what they are prepared to accept. Users will most likely not wait long for a Google search interface to load, but in other use cases they may well be prepared to wait longer because their expectations differ, e.g. high-quality images on a photo-sharing site. It is interesting to note that a value of 3 seconds gets mentioned often. I am not sure where this particular value comes from, but keep it in mind as a target and compare it to the actual results we will see later.
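To make the problem concrete, here is the "load time per 10 kB" idea written out as a toy metric, together with a variant that subtracts a rough per-request overhead. All the numbers (the 50 ms per request, the example load times and sizes) are invented for illustration; the point is only that the naive ratio treats 100 small images the same as one large one.

    def naive_ratio(load_time_ms, bytes_in):
        """Seconds of load time per 10 kB transferred."""
        return (load_time_ms / 1000.0) / (bytes_in / 10240.0)

    def request_aware_ratio(load_time_ms, bytes_in, requests, ms_per_request=50):
        """Same ratio after subtracting an assumed 50 ms of overhead per HTTP request."""
        adjusted = max(load_time_ms - requests * ms_per_request, 0)
        return (adjusted / 1000.0) / (bytes_in / 10240.0)

    # 100 images of 1 kB vs. 1 image of 100 kB: same bytes, very different request counts.
    print(naive_ratio(6000, 102400), request_aware_ratio(6000, 102400, 100))
    print(naive_ratio(1500, 102400), request_aware_ratio(1500, 102400, 1))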

Although I will not be able to produce such a performance ranking, I believe it will be helpful to see the broad range of how sites perform, and hopefully this will trigger a discussion about what could be considered acceptable for a given complexity and what we should improve. Tools such as PageSpeed and YSlow offer easy ways to optimize pages without much effort, and they quickly highlight common issues. For example, one page took more than 39 seconds to load; a quick look at the detailed WebPageTest result revealed that a single image caused the delay because it had not been properly optimized for the web. Many sites also have basic errors on the page, e.g. references to non-existent content, in particular images. This is simple to fix and can eliminate unnecessary HTTP requests. All these quick fixes can help make the UBC web experience better.
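As an example of how cheap these quick fixes are to find, a small script can flag image references that return errors; anything it reports is a wasted HTTP request. This is only a standard-library sketch (it looks at <img> tags only and uses HEAD requests, which some servers refuse), and the URL at the bottom is just a placeholder.

    import re
    from urllib.error import HTTPError, URLError
    from urllib.parse import urljoin
    from urllib.request import Request, urlopen

    IMG_RE = re.compile(r'<img[^>]+src=["\'](.*?)["\']', re.IGNORECASE)

    def broken_images(page_url):
        html = urlopen(page_url, timeout=10).read().decode("utf-8", "ignore")
        broken = []
        for src in sorted(set(IMG_RE.findall(html))):
            img_url = urljoin(page_url, src)
            try:
                urlopen(Request(img_url, method="HEAD"), timeout=10)
            except (HTTPError, URLError):
                broken.append(img_url)  # missing image = an unnecessary HTTP request
        return broken

    print(broken_images("https://www.ubc.ca/"))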

Results

All tests were run with these settings: USA – Dulles, VA – IE8 – DSL
See all the test results

