System Design, Efficiency & Caching
At my new job we have a web framework. This framework is an MVC amalgam of Smarty and proprietary models and controllers. Aside from my belief that Smarty is basically garbage, this framework actually does some cool stuff. For one thing, the database abstraction layer supports results paging pretty cleanly, and there is a nice URL generator for views. The best part is the admin module, which uses two meta tables to automatically create an administrator interface for the client.
The purpose of the framework is not to create custom applications quickly, but rather to provide a set of reusable modules that can be plugged in. The existing modules offer a wide range of functionality that meets most basic needs of clients without any serious coding at all. Having a system like this provides a lot of bang for the buck.
Unfortunately the system is high overhead. Not only does every page have to load 5000 lines of code, but certain modules hit the database very hard. The result is that page load times range from snappy to 5 or more seconds on my local development box (1.67 GHz G4 at 80% idle). The problem is threefold:
- Excessive database chattiness
- An all-or-nothing approach to data fetching
- A generalized admin where all tables are treated equally
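To make the "chattiness" concrete: the classic shape of this problem is the N+1 query pattern, where a module issues one query for a list and then one more query per row. The sketch below is purely illustrative (the class and method names are my own, not the framework's); a tiny in-memory stand-in counts round trips so the two fetch strategies can be compared.

```python
class FakeDB:
    """Tiny in-memory stand-in for a database connection (illustrative only)."""

    def __init__(self):
        self.round_trips = 0
        self.authors = {1: "Alice", 2: "Bob"}
        self.articles = [
            {"id": 10, "title": "Caching", "author_id": 1},
            {"id": 11, "title": "Paging", "author_id": 2},
            {"id": 12, "title": "Smarty", "author_id": 1},
        ]

    def all_articles(self):
        self.round_trips += 1
        return [dict(a) for a in self.articles]

    def author_name(self, author_id):
        self.round_trips += 1
        return self.authors[author_id]

    def articles_with_authors(self):
        # Simulates a single JOIN: one round trip fetches everything.
        self.round_trips += 1
        return [dict(a, author=self.authors[a["author_id"]])
                for a in self.articles]


def fetch_chatty(db):
    # N+1 pattern: one query for the list, then one extra query per row.
    articles = db.all_articles()
    for article in articles:
        article["author"] = db.author_name(article["author_id"])
    return articles


def fetch_batched(db):
    # Same data, one round trip.
    return db.articles_with_authors()
```

With three rows the chatty version costs four round trips to the batched version's one; at real page sizes that gap is most of the difference between "snappy" and five seconds.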
As long as everything works, we are in good shape. But if the client complains about speed, there is not much we can do to optimize a module short of a wholesale rewrite and retesting. I used to believe in completing functionality first and then optimizing as necessary. But as soon as you introduce an algorithm with exponential complexity, you are asking for immediate and intractable performance problems.
Once a rewrite is required we have basically lost all benefit of using the pre-written module in the first place. The client has been promised a price based on the assumption that we would have to write 100 lines of code, not 5000. If our optimizations require certain database changes then we are truly screwed, because the generalized admin tool is complex and not friendly for special cases.
The horror is that reasonable website performance is implicit in any contract. You can’t in good faith deliver a website where pages take 5-10 seconds to start downloading on a relatively unloaded server. So now we have 5 times the planned amount of work just to meet the basic agreement.
In this scenario caching is a critical tool to have in one's toolbox. I was able to split the difference and get a reasonable product out the door. Thank god there was no dynamic sidebar (like a shopping cart) or other hurdle that could have required deeper changes.
My experience with caching in Templation has served me well in this regard. Caching is not to be taken lightly. One false move and the website could be displaying outdated information. Even worse, you could be serving incorrect information if your cache trigger is not fine-grained enough. Clients generally have no patience for this kind of problem, so testing is paramount.
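The "fine-grained trigger" idea can be sketched simply: tag each cached page with the tables it reads from, and on a write invalidate only the pages that depend on the touched table. This is a minimal illustration in Python, not Templation's or the framework's actual API; all names are hypothetical.

```python
class PageCache:
    """Minimal page cache with per-table invalidation (illustrative sketch).

    Each cached page records the tables it depends on; a write to a table
    evicts only the pages that read from it, leaving the rest warm.
    """

    def __init__(self):
        self._pages = {}   # url -> rendered html
        self._deps = {}    # url -> set of table names the page reads

    def get(self, url):
        # Returns cached html, or None on a miss.
        return self._pages.get(url)

    def put(self, url, html, tables):
        self._pages[url] = html
        self._deps[url] = set(tables)

    def invalidate(self, table):
        # The fine-grained trigger: evict only dependent pages.
        stale = [u for u, deps in self._deps.items() if table in deps]
        for url in stale:
            del self._pages[url]
            del self._deps[url]
        return stale
```

The coarse alternative, flushing the whole cache on any write, is safer but throws away most of the benefit; the fine-grained version is exactly where a missing dependency tag turns into the stale-data bug clients have no patience for, hence the emphasis on testing.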
Rails to the Rescue
Looking at the framework we’re using has really made me appreciate Rails all the more. Rails is high-overhead itself, but the overhead is managed by FastCGI: once the dispatchers are started, the application code is all resident in memory. Beyond that, Rails provides a very well thought out, and extensible, domain-specific language. Our framework is nice, but it really can’t compete with the kind of vision that went into Rails. Even worse, it’s written in PHP, which simply lacks Ruby’s dynamic, object-oriented flexibility.
The real value in our system is the modules, but I think each of those could be ported to Rails in 2-10 hours. In fact I would recommend doing this for new projects except for two reasons:
- It’s much harder to hire Rails talent than PHP talent, especially in New Mexico.
- Rails apps are resource intensive, and are difficult to deploy in large numbers in a shared hosting environment.
Converting these old crusty behemoths into light Rails code would really be fun though…