There used to be a big difference between API access and regular human-oriented HTML access: the speed at which requests are made. When a request is made via a browser, there is inherent delay from human interaction, browser response, page rendering, and image fetching. Most of this delay is gone once a machine makes the call. However, with recent improvements in browser technology and the wide use of AJAX techniques on the client side, even human-readable pages now make API calls to render themselves.
Scalability plays a central role when designing the ways in which data can be requested from a service, be it via an API call or an HTML page request. Both types of requests fetch raw data, process it, and then format it into a presentation format such as HTML, XML, or JSON. In the case of a server-rendered HTML page, all the different requests are made internally, hidden from the user, and a single page is returned. If the page uses AJAX scripts, the browser makes multiple API calls to fetch individual data sets, but the server still has to fetch the raw data, process it, and format it. It is the size of the batch that makes the difference.
Making an API call is like asking a question, and if the question is simple enough, the answer is easy to come by. For example, asking to describe a single user is simple: a single database lookup using a single unique key is fast and generally easy to scale. This is the most optimized use case in relational databases. However, asking for the last 20 messages of all the friends I am following is not a simple question. It is complex because it is really a list of questions batched together. To find the last 20 messages, we must ask:
- Who do I follow?
- What are their last 20 messages?
- Of all the messages found, which are the most recent 20?
- What is the content and other metadata for each of these 20 messages?
This list of questions becomes quite inefficient when the set of friends being followed is very large. However, the data set can be reduced by first finding the top 20 friends with the most recent updates, and only requesting their 20 latest messages. Even with this optimization, we are looking at fetching up to 400 messages in order to display 20, as the sketch below illustrates. It gets much worse when the request is for the second page of messages (messages 21-40), where we can fetch up to 1,600 messages from up to 40 friends in order to display 20. By the 5th page we are fetching up to 10,000 messages to display 20 (which might explain why Twitter temporarily disabled the paging feature of their API).
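To make the cost concrete, here is a rough TypeScript sketch of the fan-out described above, using toy in-memory stores in place of real database tables. The store shapes and helper names are illustrative assumptions, not any particular service's implementation; the point is how the amount of raw data fetched grows with the page number.

```typescript
interface Message { id: number; authorId: number; createdAt: number; body: string; }

// Toy in-memory stores standing in for real database tables.
const following = new Map<number, number[]>();         // userId -> friend ids
const messagesByAuthor = new Map<number, Message[]>();  // authorId -> messages, newest first

// In a real system this would return friends ordered by most recent update;
// the toy store simply slices a precomputed list.
function getRecentFriends(userId: number, limit: number): number[] {
  return (following.get(userId) ?? []).slice(0, limit);
}

// Latest `limit` messages for one author: the cheap, single-key lookup.
function getMessages(authorId: number, limit: number): Message[] {
  return (messagesByAuthor.get(authorId) ?? []).slice(0, limit);
}

// Build page `page` (1-based) of a user's timeline, 20 messages per page.
function friendsTimeline(userId: number, page: number): Message[] {
  const pageSize = 20;
  const windowSize = pageSize * page; // page 1 -> 20, page 5 -> 100

  // Up to `windowSize` friends, each contributing up to `windowSize` candidate
  // messages: page 1 touches up to 400 raw messages, page 5 up to 10,000.
  const friends = getRecentFriends(userId, windowSize);
  const candidates = friends.flatMap((f) => getMessages(f, windowSize));

  // Merge, sort by recency, and keep only the 20 messages for this page.
  candidates.sort((a, b) => b.createdAt - a.createdAt);
  return candidates.slice((page - 1) * pageSize, page * pageSize);
}
```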
All the popular microblogging services offer a web interface for this personalized timeline, as well as an API call. They all let you ask the direct question: what is the latest with my friends? But that comes at a significant performance cost. In theory, there should not be any difference in the amount of resources needed to fulfill this request: in the end, all the sub-questions must be asked, and if you only ask once, the server-side batch solution will be faster. But when the client starts asking the same question every minute or more often, resources go to waste and scaling becomes more expensive.
What makes breaking this single request into multiple smaller ones more efficient is the fact that the client can be stateful and store information between calls. For example, the client can keep track of the list of friends, remember the last message id retrieved, and fetch all 5 pages at once and page through them locally. If the client keeps track of its last request, it can refresh its status using simple questions, such as: has anything changed for this user since message id 4839? The server can answer that quickly. It also makes usability features easier to implement, such as temporarily hiding verbose friends.
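Here is a minimal sketch of such a stateful client. The /users/:id/messages endpoint and its since_id parameter are assumptions made for illustration, not any particular provider's API.

```typescript
interface Message { id: number; authorId: number; body: string; }

// A stateful client: it remembers who it follows and the newest message id it
// has already seen, so each refresh is a cheap "anything new?" question.
class TimelineClient {
  private lastSeenId = 0;
  private cache: Message[] = [];

  constructor(private baseUrl: string, private friendIds: number[]) {}

  // Poll each friend individually for messages newer than lastSeenId.
  async refresh(): Promise<Message[]> {
    const fresh: Message[] = [];
    for (const friendId of this.friendIds) {
      // Hypothetical endpoint: GET /users/:id/messages?since_id=...
      const res = await fetch(
        `${this.baseUrl}/users/${friendId}/messages?since_id=${this.lastSeenId}`
      );
      fresh.push(...((await res.json()) as Message[]));
    }

    // Merge into the local cache and advance the high-water mark.
    this.cache = [...fresh, ...this.cache].sort((a, b) => b.id - a.id);
    this.lastSeenId = this.cache.length ? this.cache[0].id : this.lastSeenId;
    return fresh;
  }

  // Paging happens locally against the cached messages, not against the server.
  page(n: number, pageSize = 20): Message[] {
    return this.cache.slice((n - 1) * pageSize, n * pageSize);
  }
}
```

Hiding a verbose friend then becomes a local filter over the cache rather than a new round of server requests.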
A big part of Twitter's success came from opening up its infrastructure via a dead-simple API. For the most part, the API design was a result of converting existing HTML pages into machine-readable representations of the same data. The rest was driven by its active developer community. To a large degree, providing this super-simple API is one of Twitter's biggest scaling challenges, and the pattern is repeated by all the other providers in the space.
Offering a simple API is important, and in no way is this post promoting the idea of sacrificing ease of use. But developers can be motivated to build better applications using a somewhat more complex set of requests instead of the easy but very expensive ones. For example, Twitter can enforce stricter limitations on the /friends_timeline API call, but allow a much higher quota for more restrictive calls, such as asking whether a specific user has new updates since a given time or id. Even more importantly, services should change their websites to use their own API with client-side scripts, similar to how Google Reader works. Not only will the user experience improve, but it will give a live demo of how to use the API (as well as unify the server platform around a single method of access).
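One way to express that motivation is to price endpoints by what they cost the server. The following is only a sketch of the idea; the endpoint names (other than /friends_timeline) and the quota numbers are made up for illustration.

```typescript
// Requests allowed per client per hour, priced by how expensive each question
// is for the server to answer (endpoint names and numbers are illustrative).
const hourlyQuota: Record<string, number> = {
  "/friends_timeline": 30,            // expensive aggregated question
  "/users/:id/messages": 2000,        // cheap single-user lookup
  "/users/:id/updated_since": 10000,  // cheapest: "anything new since id X?"
};

const usage = new Map<string, number>(); // "clientId endpoint" -> calls this hour

function allowRequest(clientId: string, endpoint: string): boolean {
  const key = `${clientId} ${endpoint}`;
  const used = usage.get(key) ?? 0;
  if (used >= (hourlyQuota[endpoint] ?? 0)) return false; // over quota or unknown endpoint
  usage.set(key, used + 1);
  return true;
}
```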
Once the server is only serving data with limited scope, usually providing the messages of a single person with the optional perspective of the reader (to enforce access control), scaling becomes a much easier task because the data can be segmented more easily. In a world where APIs are becoming a necessity, developers should make sure their platform not only allows API access, but supports the behavior patterns it dictates. Finding the perfect balance between a short learning curve and a highly scalable platform is key to long-term success.
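Here is one hedged sketch of what that segmentation might look like: messages partitioned by author id, so a single-author query lands on exactly one shard, with the reader's perspective used only for access control. The shard count, routing scheme, and visibility model are all assumptions chosen for illustration.

```typescript
interface Message {
  id: number;
  authorId: number;
  visibleTo: "public" | "followers";
  body: string;
}

// Messages partitioned by author id, so a single-author query touches exactly
// one shard. The shard count and modulo routing are arbitrary illustrations.
const SHARD_COUNT = 16;
const shards: Map<number, Message[]>[] =
  Array.from({ length: SHARD_COUNT }, () => new Map<number, Message[]>());

function shardFor(authorId: number): Map<number, Message[]> {
  return shards[authorId % SHARD_COUNT];
}

// "Messages of `authorId` newer than `sinceId`, as seen by this reader": the
// reader only matters for access control, not for locating the data.
function messagesSince(authorId: number, sinceId: number, readerIsFollower: boolean): Message[] {
  const all = shardFor(authorId).get(authorId) ?? [];
  return all.filter(
    (m) => m.id > sinceId && (m.visibleTo === "public" || readerIsFollower)
  );
}
```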
