Scaling a Microblogging Service – Part I

When it comes to Twitter, everyone’s a critic.

The irony is, the majority of the technical criticism written about Twitter reveals more about the lack of understanding of the author than anything about Twitter. Creating Nouncer – a developer platform for building microblogs and similar services – provides a firsthand understanding of the inner-working and challenges of microblogging sites. This post is not specifically about Twitter but as the leading microblogging service, it is the best working example of the challenges of scaling in the space.

People don’t seem to get what is so hard about scaling the Twitter service. Some think it has to do with Rails, while others look at the hosting company or other operating factors. Then there are those who incorrectly think a distributed or federated service is the solution (which will only make things worse). The idea that building a large scale web application is trivial or a solved problem is simply ridiculous. In a way it is surprising that there are so few companies out there dealing with commoditizing web developing scaling.

There are two main subsystems from an infrastructure standpoint (regardless of the actual implementation). The first is the distribution (or delivery) system which takes every incoming message and delivers it to the right list of followers via Instant Messaging, SMS, or email. The second is the retrieval (or query) system which allows users to query what the people they are following and the general population (the public timeline) have been up to (few messages at a time).

Building the delivery system isn’t trivial, but much easier to scale. Subscription lists are generally static (who is following who) and tend to be small (even thousands of followers is still a small list). Twitter only allows following all updates of a certain user or tracking all updates of a certain list of words, which means there isn’t any heavy calculation being done at the distribution point as to who should receive which message.

In its current state, pushing updates out isn’t a challenge even at a large scale, and something that has generally scaled well and has showed very solid performance and stability in the past year. If Twitter was a push-only service, they would have no scaling issues at all – but in a way a much more difficult time competing with little-to-none technology barriers to slow competitors.

The retrieval system is where things are not as simple. Unlike webmail services where refreshing a user’s inbox only queries a very simple data set (is there anything new in MY inbox?), refreshing a user’s home page on Twitter queries a much more complex data set (are there any new updates in ALL my friends’ pages?) and the nature of the service means that the ratio of reads to writes is significantly different from most other web services.

It is these constant queries that bring Twitter down during popular events. The fact that a single request for a user’s home page or an API request for a user’s friends timeline must perform a complex iterative query is what causing some requests to take too long, at best timeout and at worst cause the entire system to lag behind. These custom views are expensive and mean that it is much more difficult to cache the final result of a request.

Going through a timeline request, the server first looks up the list of people the user is following. Then for each person, checks if their timeline is public or private, and if private, if the requesting user has the rights to view it. If the user has rights, the last few messages are retrieved, and the same is repeated for each person being followed. When done, all the messages are collected, sorted, and the latest messages are converted from their internal representation to the requested format (JASON, XML, RSS, or ATOM).

As long as all the messages in the system can fit into a single database table, this can be done with a pretty straight-forward query leaving the heavy lifting to the database. But once a single table or even a single server isn’t enough to hold the entire message base (or support the load), an application has to perform multiple database requests to gather the data. Partitioning the database which for many application is enough to offer scalability, solves the issue of a large and inefficient single table scenario, but is also the reason why the system slows down. Sharding takes away the ability to leverage the database indexing services to optimize users’ views. If the people you are following are not all on the same partition, multiple data queries are required.

Caching helps but doesn’t solve the problem because each user has a different view, so even if the latest messages of each user are cached, the aggregated reply has to be constructed for each user. Caching the final reply is wasteful since users don’t really share their views. At the same time, heavy reliance on caching for performance means that it can take many hours for a cache to warm up after a crash or scheduled downtime before the system is back to its faster state.

Caching also raises some questions about the format in which to cache data. Should the data in the cache be in the raw database format or the processed text-based format (such as XML or JSON). Since each application is requesting a different representation, caching the conversion makes sense as formatting is an expensive task, but at the same time duplicates data and by that makes caching less efficient (bigger lookup tables). The site can always partition the cache, keeping a separate cache for raw data, XML data, HTML data, JSON data, RSS data, and Atom data, but beside the huge waste of servers and memory and the complexity in administration, it makes consistency much less attainable and keeping all those individual caches in sync a problem.

One simple but painfully restrictive solution is to duplicate the data for each user. Basically what this means is turning the service into an email system. Each user is given a mailbox and whenever someone they are following publishes a status, it is copied into their inbox, as well as into all the other followers’ inboxes. This brings the solution in line with existing systems such as webmail services where data partitioning alone can accomplish great scaling. But this comes at a price. First the data storage grows significantly faster and requires much more disk space, together with increased backup cost. Second, it makes other actions much more complex such as erasing a message once sent.

This solution can be optimized to duplicate only the internal identifiers of each messages, which means that instead of copying the entire 140 characters (or any other allowed size), only the database key is copied. This reduces the amount of actual content being duplicated, but adds another step of querying the actual content once the list of message identifiers has been received. And it makes smarter filtering more complex, such as requesting all messages from a certain date (since that data isn’t necessarily part of the key). Of course this doesn’t solve the presentation transformation into XML, JSON, etc.

A big problem with this solution isn’t technical but business-wise. If building a scalable microblogging service simply means a heavy investment in servers and storage, and little actual hardcore technology, the only barrier left for competitors is user acquisition. Of course the fine details of the service are important, such as support for images as offered by Pownce or mobile location supported by Jaiku. But those (unless patented) are generally easy to copy. I’m not trivializing the worked required in building the glue even with a duplication-based solution, but that particular challenge has already been solved by many companies.

Eventually a new solution will be needed as the space evolves. Duplication alone doesn’t offer an easy way to get my-friends’-friends view which is an interesting feature. Being able to delegate my list of extended friends to my close friends is something I personally find very useful. The same applies to any kind of smart filtering desired. Being able to create multiple views of timelines based on keywords and other metadata is going to hit a scaling wall. Imagine trying to apply the duplication solution to a web version of the Twitter track feature.

Facebook offers an interesting approach to their friends feed which in a way is a very similar challenge. Facebook offers users abstract controls over what kind of content to show and then provides a feed that is an approximation of what the actual accurate aggregated status really is. What this means is that Facebook is showing a timeline that is good enough but not fully reliable and in the context of a friends feed is good enough. The same cannot really apply to Twitter or other messaging platforms where reliable delivery of messages in order is critical. What my friends are generally up to can be generally accurate.

A similar challenge with different properties is the scaling of social graph data. Facebook currently has more than 7 terabytes of social graph data and other services such as MySpace probably have more. This is the exact problem that caused the early performance problems for Friendster, trying to update the social graph whenever someone added or removed a friend. Calculating distance between friends is a highly desired feature as it allows you to find out if you and someone else have mutual friends. It also drives many of the popular features in most social networks. On LinkedIn it allows you to ask someone you know to introduce you to a potential business contact. But as LinkedIn grew, it started to reduce the distance it pre-calculated between people from 6 degrees to 2-3.

The social web is creating demand for new scaling tools and technologies. Current databases and caching solutions are simply unable to handle a complex network of multiple relationship between objects. While databases are still a good solution for persistent storage of social data, each retrieval requires heavy calculation. These are the exact challenges Nouncer is attempting to solve and commoditize in the microblogging space. In the short run, an “inbox” approach to scaling microblogs might be the logical approach, but in the long run we have some exciting problems to solve.

This post is continued in Part II.

22 thoughts on “Scaling a Microblogging Service – Part I

  1. Thanks for digging in and giving details. I particularly liked your coverage of the pressure point on Twitter and similar services.

  2. but… why not a push only service? what’s the problem with this simple design? :
    1 – maintain update histories as static content (including “me & my friends” histories and any kind of history)
    2 – push any update to the concerned static files.
    why the hell would the user want to see updates from his friend made *before* he actually followed him in *his own* timeline?
    can you please go a bit deeper and explain?

  3. sorry, my apologies, I reacted too quickly on : “If Twitter was a push-only service, they would have no scaling issues at all – but in a way a much more difficult time competing with little-to-none technology barriers to slow competitors”
    I see you responded to my question in the inbox example.

  4. Twitter is likely falling into the trap of optimizing for write speed, rather than optimizing for read speed.
    Let’s break down this paragraph:
    “Going through a timeline request, the server first looks up the list of people the user is following. Then for each person, checks if their timeline is public or private, and if private, if the requesting user has the rights to view it. If the user has rights, the last few messages are retrieved, and the same is repeated for each person being followed. When done, all the messages are collected, sorted, and the latest messages are converted from their internal representation to the requested format (JASON, XML, RSS, or ATOM).”
    First of all, dealing with a push service or not, the server does not need to look up the list of people the user is following. In fact, it doesn’t need to do any of those queries at all. It just needs to know the past X messages sent to me (where X is set by the business; my guess is a value of 15 is perfectly valid for 99% of all Twitter access).
    Take the following scenario: I follow you, you write a message, then I stop following you.
    Push service or no, I was “sent” that message during that time period, and so that message gets put onto my queue. If you block me, or leave the service, or what have you, that message *at that time period* was still in my queue, and so long as that’s within the past X messages, that’s going to show up every time I get my most recent messages.
    This is how more traditional messaging systems work, and why they’re so A) reliable, and B) fast. Instead of writing a query with so many joins against an author table, a permissions table, a “following” table, etc etc, all you need to do is one query against one table, with descending ordering on the primary key.
    The penalty should be on message write. Run your calculations there, don’t run them on message read. Have a transaction queue, submit jobs to the queue and return control to the user quickly. “Thanks for changing your real name!” becomes a large transaction (update all previous messages to set the name differently) but with a queueing system, people will still find the service to be fast.
    This isn’t really a “push” system I’m describing; it’s more of a hybrid system that is, again, optimized for reading the same data over and over and over again.

  5. The problem with this analysis is that it continues to view the entire problem as something that needs to be solved with the traditional database-centric approach used by regular web apps. In this case this is entirely the wrong approach; it’s just that databases are the hammer that makes every problem look like a nail.
    You almost figured it out with your “webmail” idea. Your complaint about duplicating the data is that it uses too much storage. Really? Consider 140 bytes per message x 100 messages to leave in the inbox x 10 million users. That works out to … 140 gigabytes. I have that much storage sitting on my lap as I type this. My $7.95-a-month hosted web account has almost that much storage. That’s hardly a problem. If it runs out, you just go to Fry’s and spend $100 on another hard disk. You can argue about extra overhead for message metadata, and there are probably more than 1e7 users by now, but it’s clear that capacity is not a problem … especially since this solution is extremely scalable, merely by partitioning the user “inboxes” across multiple disks and servers in a cluster. You then route requests for a particular user to the server hosting that user. The servers just need a simple protocol to pass new messages to each other.
    And then of course you can start to pull the cluster apart, running that inter-server protocol across the Internet itself … voila, a distributed solution (one that looks rather a lot like Jabber, actually.)

  6. I don’t get what the difficulty is either. Why don’t caching and SMTP/email techniques make this quite doable?
    I have to think the RoR could very well be a culprit here. There’s nothing about Twitter that makes Rails a suitable platform.

  7. Re-scaling Twitter – some lessons for startups

    I’ve been reading this post on scaling Twitter on the Twitter blog with a growing sense of Deja Vu. They note that:
    Twitter is, fundamentally, a messaging system. Twitter was not architected as a messaging system, however. For expediency’s sake, Twitte

  8. Microblogging Skalierungsprobleme

    In letzter Zeit wird Twitter recht viel abgewatscht, weil der Dienst ziemlich instabil läuft. Das ist zwar weniger wichtig, solange man “nur” private Statements wie “Bin verkatert – erstmal’n Käffchen” schreibt, aber der Dienst wird zunehmend auch für

  9. Great post. I’ve advocated an inbox model for Twitter before, but I’m having second thoughts now. Thanks for the details!
    Jens does have an interesting back of the envelope calculation for storage requirements in an inbox model for Twitter’s architecture. However, leaving only 100 messages in the inbox isn’t a solution. That’s just as bad as the jabber advocates who think that doing the logging locally is just as good as Twitter. Some people have way more than 100 friends on Twitter, and I bet if we could look at the logs we’d see that people page way further back than than 100 messages. Especially if you want to catch up with a couple of days of tweets!
    Let’s put aside the fact that reliable redundant storage for that much data isn’t $100 at Fry’s, I think there’s another problem which is peculiarly Twitter’s. If I have 5000+ mutual friends (like, say) then every time I send a message it has to go to all those inboxes, and every time those people send a message it has to go to mine. Sounds like a mailing list, you say? But if you view my inbox it has to filter it by the privacy setting from each user to you. I’m sure that when users get split across servers then significant duplication will happen, but I’m not sure Twitter will move to a straight-up inbox model any time soon – there are too many people with too many contacts for it to be that simple.

  10. 140 characters should be enough for everybody

    Rails would do itself no harm by conceding that it isn’t a platform that can compete with Java or C when it comes to intensive tasks. Oh boy. This…

  11. Twitter victime de son succès et de son évolution en messagerie instantanée

    Twitter s’est vu imposer le besoin auquel répond son service. De plateforme de publication de (micro) messages, Twitter est devenu un (quasi) service de messagerie instantanée, un virage non prévu par les développeurs et amplifié par le nombre …

  12. As Jens points out, Twitter is hardly more than a webmail with small messages and a more open access policy (i.e. I can see others’ inbox). Webmail at a much larger scale was solved years ago. Whether the message passing protocol is smtp or jabber is then a matter of taste.

  13. What about a new approach ? just remembering the last 20 posts destined for any user ?

    Wouldn’t it reduce the scalability problem ? Request for more history are relatively rare I believe. (Based on a comparison with search engines – most people don’t go beyond the first page of results, some go the page 2, very few to page 3 and virtually none to page 4.)
    It would mean some information would be retained in more versions but the total amount of this redundancy would be quite manageable.
    It’s just an idea…

  14. Twitter probably does use an inbox sort of model; if you unfollow someone or change the settings as to which replies to others you see, the contents of your current timeline doesn’t change (or at least it didn’t last time I tried this).
    During the recent outage / overload Twitter cut down the number of messages kept in an inbox to one page; something like 20-50. Contrary to some of the suggestions above that no one would mind 🙂 there was an enormous outcry among the people I follow (including me), to the point of people leaving for other services…

  15. Thanks for sharing this insight, it’s great to get hands on experience from someone who’s actually been there.
    What about keeping the most recent messages in an “inbox” type format?
    So users are sharded. For each user you store the most recent X messages. In Twitter’s case, say 100, to give 5 pages of data.
    Then for going back through time, or other more complex stuff, you hit the main archive. I’m guessing the main archive could be sharded by time. So one database per month / week / day / hour as appropriate.
    These type of archive operations would be slower. Again, the most recent data could be somewhat “cached”. So every server holding data for the last X hours is replicated X times.
    I’m wondering if this would provide a “good enough” service. My guess is that requests are much more common for recent data. Older data is probably hardly ever touched. Whatever time periods “new” and “old” might be.
    Any thoughts?

  16. I support an inbox model too. Majority of users probably just want to get messages delivered once they start following someone. Allocate so many messages or space per person. Charge a premium rate for additional messages or space. automatically delete old messages that makes the user go over the quota.

Comments are closed.