This isn't really about webservices, but it is about building cool websites. Sometimes developers face that classic "good problem" that their site is more popular than their server(s) can handle. Some of you might have noticed that we ran into this recently issue recently and I'd like to share with you what we did about it.
First, a little background on our server architecture. www.rhapsody.com points to a load balancer that distributes page requests amongst a set of identical app servers. The app servers try to serve content out of their local caches and if the content they need isn't there, they look it up in a shared database. All very standard.
Our particular issues start with the fact that the entire rhapsody catalog is too big to fit into the cache space that our app servers can use. It takes an order of magnitude or two longer to serve pages out of the database than from one of the local caching layers. The result has been that popular content which is in cache gets served pretty fast and less popular content gets served much slower. Adding more app servers to the farm helps to a certain extent, but at some point the database gets maxed out and adding more app servers doesn't help at all or even makes things worse. We could add more read-only database instances and replicate, but there's a better solution.
Load balancers often are configured to use "session affinity" whereby a particular user/browser-session always gets routed to the same server so that server-side session information is preserved. In our case, there is very little user-specific information on our pages -- the catalog content is the vast majority of the pages and also the performance bottleneck. So instead of session affinity we enabled a kind of load balancing that seems obvious, but based on my reading of the literature I don't think is very common. You could call it "content affinity" or I think a more descriptive term is "URL affinity." Instead of binding a session to a particular server, we bind URLs to servers.
This means that every time somebody asks for the hoobastank page, it goes to the same server. So now instead of every server on the farm trying to cache hoobastank's data, we have just one place. (Actually a couple -- I'm simplifying things a bit.) Even better, because of our friendly URL mechanism, it's easy for the load balancer to send all hoobastank traffic to the same server -- their artist pages, album pages, etc. So when we cache the hoobastank object on the app server, we get to use that for all of their pages. Thus each of our app-server caches is responsible for less content, and gets more efficient for the content it is responsible for. Moreover, as we add more servers to the farm, the cache efficiency increases. As our caches get more efficient, database load goes down, as does page load time. Ahh, scalability. It's beautiful.
I'm posting this primarily because I'm hoping some of you will find it useful. I've read a number of books on the subject, and none of the ones I've found talk about this kind of content caching problem. I'm looking forward to reading Cal Henderon's new book on scalable web sites as I bet he's had to address these issues on flickr.
I'm not posting asking for advice on how we should have done things or what's wrong with this system. I understand full-well that this isn't ideal for a lot of reasons. But it works pretty well for what we need right now. A more sophisticated architecture would do essentially the same thing but with a dedicated data caching tier behind the web application tier. That gives you the ability to scale multiple types of content, e.g. catalog content and user-specific content. But we're actually going to move that functionality down to the browser. Stick around and you'll see what I mean.
-Leo Parker Dirac
Recent Comments