
Students are definitely back on campus! The SUB is packed, there are lineups for coffee at all the Starbucks, and some of our web servers have seen a 10x increase in traffic!

10x? This is related to the WordPress CMS install we help maintain and develop with Public Affairs. One site on that service is the reason for the traffic spike: basically all logins/logouts to the campus LMS are directed through that site. Otherwise it would have been more or less business as usual for that service.

We did plan for this, and we load tested, and we thought it could handle it. But when we were hit with a huge flood of traffic because of a Vista outage, we were crushed: CPU was maxed out. *This is why I love UBC IT VMs. Imagine being in this scenario with a locked-in piece of hardware; you would be hooped big time. We were able to contact UBC IT Infrastructure and they added a CPU literally on the fly! Do you know how cool that is? It's very cool. Chris dropped the CPU in and the load came down; the site was still sluggish, but functional. VMs rock!

BUT… it was still very surprising to me to see it crumble on Tuesday. Back in 2006 I had a site hit by front-page Digg and Reddit traffic (considerably more traffic than this) on a 7-dollar-per-month web host account, and it did not crash! I was simply using WP-Cache 2 (before Super Cache), a Cutline theme, and one plugin. No APC, no Super Cache, no MySQL query cache! It held up on a shared host with hundreds of other users. I was, for lack of a better word, PO'd at our fail whale. To remedy this, we looked at all our code, disabled all plugins that were not being used, cleaned up some code, and made some tweaks to PHP, and it seemed to help. The following night, during the next big spike (which had even more users), it responded OK, but still not as fast as I would like it to be.

Why did this fail? We did load testing, right? I believe it is because our CLF theme/plugins are considerably heavier than when we tested (there were a lot of code changes over the month). During the post-crash testing we tried disabling and enabling widgets and showed you can double the number of page requests the site can handle, which is significant (I am not sure which widgets were enabled during the initial test; the site was still being actively developed). Another culprit is Domain Mapping, which is used on most of the websites on the service. Domain Mapping runs each time a page is hit, even when the page is cached via Super Cache. Luckily we use the MySQL query cache, so most of those queries would have been served from memory, so it doesn't change page requests/second by that much. But this really shows how much of a resource hog WordPress can be if you let it, and why you should do all the little things:

mysql> SHOW STATUS LIKE 'Qcache%';
+-------------------------+----------+
| Variable_name           | Value    |
+-------------------------+----------+
| Qcache_free_blocks      | 11524    |
| Qcache_free_memory      | 36261544 |
| Qcache_hits             | 50840927 |
| Qcache_inserts          | 2040382  |
| Qcache_lowmem_prunes    | 113918   |
| Qcache_not_cached       | 1857166  |
| Qcache_queries_in_cache | 24673    |
| Qcache_total_blocks     | 64344    |
+-------------------------+----------+
8 rows in set (0.00 sec)
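For what it's worth, those Qcache numbers suggest the query cache is earning its keep. A rough back-of-the-envelope hit rate (one common approximation, hits over everything that reached the cache layer, not an official MySQL formula):

```python
# Rough query cache efficiency from the SHOW STATUS LIKE 'Qcache%' output above.
qcache = {
    "Qcache_hits": 50840927,       # SELECTs answered straight from the cache
    "Qcache_inserts": 2040382,     # results newly added to the cache
    "Qcache_not_cached": 1857166,  # queries that were not cacheable
    "Qcache_lowmem_prunes": 113918,  # entries evicted because memory ran low
}

# Approximation: hits vs. every SELECT that touched the cache layer.
attempts = (qcache["Qcache_hits"]
            + qcache["Qcache_inserts"]
            + qcache["Qcache_not_cached"])
hit_rate = qcache["Qcache_hits"] / attempts
print(f"approx. query cache hit rate: {hit_rate:.1%}")  # roughly 93% here
```

The 113,918 lowmem prunes mean some cached results are getting evicted for lack of memory, so there may be a little more to squeeze out by bumping query_cache_size.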

I hope we think more critically about features we add to higher-traffic pages in the future, and think more new school in web development. New school as in: performance matters, and little changes at the app level, in the front-end and back-end code, can help more than the old-school approach of adding a whack of hardware (ironic, because we are doing/did this, but I would argue that is so we can do upgrades with less downtime, not so much for performance).
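As a concrete example of a "little thing": an opcode cache like APC is just a few lines of config, yet it stops PHP from re-parsing all of WordPress on every request. A sketch using the APC-era php.ini directive names; the values here are illustrative assumptions, not our production settings:

```ini
; php.ini - illustrative APC settings (adjust to taste)
extension    = apc.so
apc.enabled  = 1
apc.shm_size = 64   ; MB of shared memory for compiled opcodes (APC-era syntax)
```

Compare that effort to racking another server.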

*I think FB, Google, and Yahoo! really promote the new-school style well; I would say they started it. FB basically re-tooled PHP and contributed to APC, which probably saves them tons of cash by not adding a boatload of new servers, and makes their users happy with faster response times. Yahoo! does a great job promoting front-end performance, and the same goes for Google's Let's Make the Web Faster. I like the way they think.