When it comes to Twitter, everyone’s a critic.
The irony is, most of the technical criticism written about Twitter reveals more about the author’s lack of understanding than anything about Twitter. Building Nouncer – a developer platform for building microblogs and similar services – provides a firsthand understanding of the inner workings and challenges of microblogging sites. This post is not specifically about Twitter, but as the leading microblogging service, it is the best working example of the scaling challenges in the space.
People don’t seem to get what is so hard about scaling the Twitter service. Some think it has to do with Rails, while others look at the hosting company or other operating factors. Then there are those who incorrectly think a distributed or federated service is the solution (it would only make things worse). The idea that building a large-scale web application is trivial or a solved problem is simply ridiculous. In a way it is surprising that there are so few companies out there working on commoditizing web development scaling.
There are two main subsystems from an infrastructure standpoint (regardless of the actual implementation). The first is the distribution (or delivery) system, which takes every incoming message and delivers it to the right list of followers via instant messaging, SMS, or email. The second is the retrieval (or query) system, which allows users to query what the people they are following and the general population (the public timeline) have been up to, a few messages at a time.
Building the delivery system isn’t trivial, but it is much easier to scale. Subscription lists are generally static (who is following whom) and tend to be small (even thousands of followers is still a small list). Twitter only allows following all updates of a certain user or tracking all updates containing a certain list of words, which means there isn’t any heavy calculation being done at the distribution point as to who should receive which message.
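To make that concrete, here is a minimal sketch of why the distribution point is cheap: deciding who receives a message takes only a follower-list lookup plus exact-word matching for track subscriptions. The data structures and the `recipients` function are my own illustration, not Twitter’s actual implementation.

```python
from collections import defaultdict

# Illustrative sketch: recipient selection needs only cheap lookups --
# a static follower list plus exact-word "track" subscriptions.
followers = defaultdict(set)   # author -> set of follower ids
tracks = defaultdict(set)      # tracked word -> set of subscriber ids

def recipients(author, text):
    """Return everyone who should receive this message."""
    r = set(followers[author])
    for word in text.lower().split():
        r |= tracks[word]      # track matches are simple word lookups
    return r

followers["bob"] = {"alice", "carol"}
tracks["python"] = {"dave"}
recipients("bob", "shipping Python today")  # {"alice", "carol", "dave"}
```

Because no per-message computation beyond these lookups is needed, the delivery side partitions cleanly and scales almost linearly with message volume.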
In its current state, pushing updates out isn’t a challenge even at large scale; it has generally scaled well and shown very solid performance and stability over the past year. If Twitter were a push-only service, it would have no scaling issues at all – but it would also have a much harder time competing, with little to no technology barrier to slow competitors.
The retrieval system is where things are not as simple. Unlike webmail services where refreshing a user’s inbox only queries a very simple data set (is there anything new in MY inbox?), refreshing a user’s home page on Twitter queries a much more complex data set (are there any new updates in ALL my friends’ pages?) and the nature of the service means that the ratio of reads to writes is significantly different from most other web services.
It is these constant queries that bring Twitter down during popular events. The fact that a single request for a user’s home page or an API request for a user’s friends timeline must perform a complex iterative query is what causes some requests to take too long – timing out at best, and at worst causing the entire system to lag behind. These custom views are expensive and mean that it is much more difficult to cache the final result of a request.
Going through a timeline request, the server first looks up the list of people the user is following. Then, for each person, it checks whether their timeline is public or private, and if private, whether the requesting user has the rights to view it. If so, the last few messages are retrieved, and the process is repeated for each person being followed. When done, all the messages are collected, sorted, and the latest ones are converted from their internal representation to the requested format (JSON, XML, RSS, or Atom).
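The steps above can be sketched as a toy, in-memory version of the iterative query. Everything here – the data model, the field names, `home_timeline` – is my own illustration of the described flow, not Twitter’s code.

```python
import heapq
import json

# Toy data model for the iterative timeline query (illustrative names).
users = {
    "bob":   {"private": False, "allowed": set()},
    "carol": {"private": True,  "allowed": {"alice"}},
    "dave":  {"private": True,  "allowed": set()},
}
following = {"alice": ["bob", "carol", "dave"]}
messages = {                    # author -> [(timestamp, text)]
    "bob":   [(1, "first"), (4, "fourth")],
    "carol": [(2, "second")],
    "dave":  [(3, "hidden")],
}

def home_timeline(user, limit=10, fmt="json"):
    visible = []
    for author in following.get(user, []):       # one iteration per friend
        u = users[author]
        if u["private"] and user not in u["allowed"]:
            continue                             # no right to view this timeline
        visible.extend((ts, author, text) for ts, text in messages[author])
    latest = heapq.nlargest(limit, visible)      # collect, sort, keep the newest
    if fmt == "json":                            # final presentation conversion
        return json.dumps([{"user": a, "text": t} for _, a, t in latest])
    return latest

home_timeline("alice")  # dave's private timeline is filtered out
```

Even in this toy form, the cost structure is visible: the work grows with the number of people followed, and the sort and format conversion must be redone for every request.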
As long as all the messages in the system can fit into a single database table, this can be done with a pretty straightforward query, leaving the heavy lifting to the database. But once a single table or even a single server isn’t enough to hold the entire message base (or support the load), the application has to perform multiple database requests to gather the data. Partitioning the database, which for many applications is enough to offer scalability, solves the problem of a single large and inefficient table, but it is also the reason the system slows down. Sharding takes away the ability to leverage the database’s indexing services to optimize users’ views. If the people you are following are not all on the same partition, multiple data queries are required.
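A small sketch shows how sharding turns one indexed query into a scatter-gather done in application code. The hash-based partitioning scheme and all names here are assumptions for illustration only.

```python
# Illustrative scatter-gather: once data is partitioned, the application,
# not the database, must query each shard and merge the results.
NUM_SHARDS = 4
shards = [dict() for _ in range(NUM_SHARDS)]   # shard -> {author: [(ts, text)]}

def shard_for(author):
    # Hypothetical partitioning scheme: hash the author name.
    return hash(author) % NUM_SHARDS

def store(author, ts, text):
    shards[shard_for(author)].setdefault(author, []).append((ts, text))

def home_timeline(followed, limit=10):
    # Followed users rarely all land on one shard, so this fans out into
    # multiple lookups and a merge the database index can no longer do for us.
    rows = []
    for author in followed:
        rows.extend((ts, author, text)
                    for ts, text in shards[shard_for(author)].get(author, []))
    return sorted(rows, reverse=True)[:limit]

store("bob", 1, "first")
store("carol", 2, "second")
home_timeline(["bob", "carol"])  # [(2, 'carol', 'second'), (1, 'bob', 'first')]
```

In a real deployment each shard lookup is a network round trip, so the latency of a timeline request is gated by the slowest partition touched.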
Caching helps but doesn’t solve the problem because each user has a different view, so even if the latest messages of each user are cached, the aggregated reply has to be constructed for each user. Caching the final reply is wasteful since users don’t really share their views. At the same time, heavy reliance on caching for performance means that it can take many hours for a cache to warm up after a crash or scheduled downtime before the system is back to its faster state.
Caching also raises some questions about the format in which to cache data. Should the data in the cache be in the raw database format or a processed text-based format (such as XML or JSON)? Since each application requests a different representation, caching the conversion makes sense, as formatting is an expensive task, but at the same time it duplicates data and thereby makes caching less efficient (bigger lookup tables). The site can always partition the cache, keeping a separate cache for raw data, XML data, HTML data, JSON data, RSS data, and Atom data, but besides the huge waste of servers and memory and the complexity of administration, it makes consistency much less attainable and keeping all those individual caches in sync a problem.
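The trade-off can be sketched in a few lines: keying a cache by (message, format) skips repeated conversion work, but every extra format multiplies the number of entries for the same underlying data. The cache layout and `render` function are hypothetical.

```python
import json

# Illustrative per-format cache: one entry per (message id, format) pair.
cache = {}
render_count = {"n": 0}

def render(msg, fmt):
    render_count["n"] += 1          # stand-in for the expensive formatting step
    if fmt == "json":
        return json.dumps(msg)
    if fmt == "xml":
        return "<status>%s</status>" % msg["text"]
    raise ValueError("unknown format: %s" % fmt)

def get_rendered(msg_id, msg, fmt):
    key = (msg_id, fmt)             # the format is part of the cache key
    if key not in cache:
        cache[key] = render(msg, fmt)
    return cache[key]

msg = {"user": "bob", "text": "hello"}
get_rendered(1, msg, "json")
get_rendered(1, msg, "json")        # cache hit: render() runs only once
get_rendered(1, msg, "xml")         # same message, second cache entry
```

Invalidation is the hidden cost: deleting or editing one message now means evicting every formatted copy of it, across every cache partition that holds one.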
One simple but painfully restrictive solution is to duplicate the data for each user. Basically, this means turning the service into an email system. Each user is given a mailbox, and whenever someone they are following publishes a status, it is copied into their inbox, as well as into all the other followers’ inboxes. This brings the solution in line with existing systems such as webmail services, where data partitioning alone can accomplish great scaling. But this comes at a price. First, data storage grows significantly faster and requires much more disk space, along with increased backup cost. Second, it makes other actions, such as erasing a message once sent, much more complex.
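A minimal sketch of this “inbox” approach, with illustrative names: the write path copies the status into every follower’s mailbox, so reads become a trivial per-user lookup – and the erase problem mentioned above becomes visible as a loop over all those copies.

```python
from collections import defaultdict

# Illustrative inbox duplication: each follower gets a private copy.
followers = defaultdict(set)   # author -> follower ids
inbox = defaultdict(list)      # user -> their private copy of messages

def publish(author, text):
    for f in followers[author]:
        inbox[f].append((author, text))   # full message duplicated per follower

def erase(author, text):
    # The price of duplication: removing a sent message touches N inboxes.
    for f in followers[author]:
        inbox[f] = [m for m in inbox[f] if m != (author, text)]

followers["bob"] = {"alice", "carol"}
publish("bob", "oops")
erase("bob", "oops")
```

Note how the read path never appears here at all: with a materialized inbox, retrieval is just “read MY mailbox”, which is exactly the simple query shape that webmail systems already scale well.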
This solution can be optimized to duplicate only the internal identifier of each message, which means that instead of copying the entire 140 characters (or any other allowed size), only the database key is copied. This reduces the amount of actual content being duplicated, but adds another step of querying the actual content once the list of message identifiers has been retrieved. It also makes smarter filtering more complex, such as requesting all messages from a certain date (since that data isn’t necessarily part of the key). And of course it doesn’t solve the presentation transformation into XML, JSON, etc.
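Here is how the id-only variant might look, again with hypothetical names: inboxes hold only database keys, and a second lookup hydrates the content at read time.

```python
# Illustrative id-only duplication: copy keys, not content.
messages = {}      # message id -> (author, ts, text): the single canonical copy
inbox_ids = {}     # user -> list of duplicated message ids (keys only)

def publish(msg_id, author, ts, text, followers):
    messages[msg_id] = (author, ts, text)
    for f in followers:
        inbox_ids.setdefault(f, []).append(msg_id)   # copy the key, not the text

def read_inbox(user, limit=10):
    ids = inbox_ids.get(user, [])[-limit:]
    # Extra step: resolve ids to content. Filtering by date or author also
    # requires this hydration, since the key alone carries no metadata.
    return [messages[i] for i in reversed(ids)]

publish(1, "bob", 100, "hello", ["alice"])
publish(2, "carol", 200, "world", ["alice"])
read_inbox("alice")  # [('carol', 200, 'world'), ('bob', 100, 'hello')]
```

The hydration step is the trade: storage per follower shrinks to a handful of bytes, but every read now pays an extra round of lookups against the canonical message store.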
A big problem with this solution isn’t technical but business-wise. If building a scalable microblogging service simply means a heavy investment in servers and storage, and little actual hardcore technology, the only barrier left for competitors is user acquisition. Of course the fine details of the service are important, such as support for images as offered by Pownce or mobile location as supported by Jaiku. But those (unless patented) are generally easy to copy. I’m not trivializing the work required in building the glue even with a duplication-based solution, but that particular challenge has already been solved by many companies.
Eventually a new solution will be needed as the space evolves. Duplication alone doesn’t offer an easy way to get a my-friends’-friends view, which is an interesting feature. Being able to delegate my list of extended friends to my close friends is something I personally find very useful. The same applies to any kind of smart filtering. Creating multiple views of timelines based on keywords and other metadata is going to hit a scaling wall. Imagine trying to apply the duplication solution to a web version of the Twitter track feature.
Facebook offers an interesting approach with its friends feed, which is in many ways a very similar challenge. Facebook gives users abstract controls over what kind of content to show and then provides a feed that is an approximation of the actual aggregated status. This means Facebook shows a timeline that is not fully reliable, which in the context of a friends feed is good enough. The same cannot really apply to Twitter or other messaging platforms, where reliable, in-order delivery of messages is critical. A view of what my friends are generally up to only needs to be generally accurate.
A similar challenge with different properties is scaling social graph data. Facebook currently has more than 7 terabytes of social graph data, and other services such as MySpace probably have more. This is the exact problem that caused the early performance problems for Friendster: trying to update the social graph whenever someone added or removed a friend. Calculating the distance between friends is a highly desired feature, as it allows you to find out whether you and someone else have mutual friends. It also drives many of the popular features in most social networks. On LinkedIn it allows you to ask someone you know to introduce you to a potential business contact. But as LinkedIn grew, it reduced the distance it pre-calculates between people from 6 degrees to 2–3.
The social web is creating demand for new scaling tools and technologies. Current databases and caching solutions are simply unable to handle a complex network of multiple relationships between objects. While databases are still a good solution for persistent storage of social data, each retrieval requires heavy calculation. These are the exact challenges Nouncer is attempting to solve and commoditize in the microblogging space. In the short run, an “inbox” approach to scaling microblogs might be the logical approach, but in the long run we have some exciting problems to solve.
This post is continued in Part II.