Redefining Discovery

(Or, What’s Broken?)

I agree, I’ve got some explaining to do. For most of last year, I worked hard to sell the idea of basing the majority of our work (OAuth, OpenID, OpenSocial, Portable Contacts, etc.) on a common discovery framework that was directly based on OpenID. We called it XRDS-Simple. And then for the past couple of months, I have assumed the role of a traveling salesman trying to sell something new: Link-based discovery and XRD.

Forgot what discovery is? From the Beginner’s Guide to Discovery:

Discovery, simply put, is the process in which machines learn about how to interact with other machines. When discussing discovery, most people immediately think of a world in which servers find new unknown services and they all just connect and work – like magic. This might be where we are trying to go but it is not what the discussion around discovery is about. Discovery does not answer the request ‘teach me how to talk to you’ but instead ‘which of the languages I know do you understand?’.

Time to explain this change of heart, so let’s begin.

A little bit of history will help set the stage. The XRDS-Simple spec had a nice short recap:

The XRDS discovery document format originated at the OASIS XRI (Extensible Resource Identifier) Technical Committee working in conjunction with the early OpenID community. The acronym XRDS – Extensible Resource Descriptor Sequence – was coined out of discussions between XRI TC members and OpenID developers at first Internet Identity Workshop held in Berkeley, CA in October 2005.

OpenID Authentication 1.1 needed an HTTP(S)-based service discovery mechanism for URLs, and the XRI TC had already defined an HTTP(S)-based resolution mechanism and a general-purpose service discovery format for XRIs (a new type of abstract structured identifier). With a few changes, a subset of the XRDS functionality defined by the XRI Resolution specification would work with both URLs and XRIs. This subset was formalized as the Yadis specification published by in March 2006, and Yadis subsequently became the service discovery format for OpenID 1.1.

A single discovery service for both URLs and XRIs proved to be so popular that in November 2007 the XRI Resolution 2.0 specification formally added the Yadis method for URL-based XRDS discovery (Section 6). The OpenID Authentication 2.0 specification released in December 2007 referenced this updated specification. Since other applications and protocols needed only a subset of XRDS functionality, work began on formally defining this subset as XRDS-Simple.

When XRDS-Simple first came to be, I declared it as nothing more than a clarification. An interoperability revision or profile. When putting it in context I asked out loud, “Can we change it?” and answered:

Generally, no. Changing the way XRDS-Simple works means moving away from XRDS. That means breaking at least one of the rules above: it must be deployed. Most of the questions raised so far regarding XRDS-Simple are valid but, from a process perspective, should have been asked over a year ago when all this was being created by OpenID, XRI TC, and Yadis.

The answer to most of these questions is: “because that is how XRDS/Yadis does it”. We picked an existing standard and (hopefully) made it easier to use and implement by limiting its scope and functionality. I am sure this is not a satisfactory answer for many people, but the same can be said and asked about any other *existing* solution we might have picked.

But as I struggled to publish a second draft, I realized that the underlying problem behind my writer’s block was that the framework was fundamentally flawed. I could not find a way to explain how to apply it to the common use cases. It was not designed for what we were trying to do (four or so years after it was conceived). So I asked myself a new question: if I were to design a brand new discovery framework today, what would it look like?

Before I answer that question, I owe you an explanation.

I have yet to justify my claim that XRDS, Yadis, and XRDS-Simple are “fundamentally flawed”. The current framework is used almost exclusively by OpenID Authentication 2.0 which was the main force in bringing together the XRDS format and the idea of an HTTP-based workflow. And since OpenID seems to be working (something I will argue against in an upcoming post), the burden is on me to show how it is broken.

I must preface my criticism with admiration for the individuals who authored Yadis and XRDS. While I am about to argue that they got it all wrong, everything I am offering now is an incremental upgrade to their visionary leap.

Since talking about XRDS, Yadis, OpenID discovery, XRDS-Simple, etc. can be tedious and confusing, I will bunch all these together and simply talk about them as XRDS. I’ll need you to pay close attention because later on I am going to introduce the new grandchild of XRDS: XRD (i.e. drop the ‘S’). So for now just remember that XRDS is the old, and XRD is the new.

How Does XRDS Work Today?

XRDS is trying to answer two questions:

  1. Given a resource (identified by a URI), where can I find information about it?
  2. What format is this ‘information about’ in? How do I make sense of it and use it?

The XRDS process goes something like this:

The client uses HTTP to get a representation of the URI (using an HTTP GET or HEAD request). Within that representation (using the HTML <META> element) or external to it (using the X-XRDS-Location header), it finds where this ‘information about’ is located. A shortcut is offered to go from the URI directly to the ‘information about’ using content negotiation via the HTTP Accept request header (in which the “XRDS representation” of the resource is requested directly). This ‘information about’ of course, is an XRDS document.
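The lookup described above can be sketched in a few lines of Python. This is only an illustration, not code from any of the specifications: the function name `find_xrds_location`, the order in which the three mechanisms are checked, and the example.com URLs are all assumptions made for the example. It takes the headers and body of a response that has already been fetched, so no network access is involved.

```python
from html.parser import HTMLParser


class _MetaFinder(HTMLParser):
    """Records the content of the first <meta http-equiv="X-XRDS-Location">."""

    def __init__(self):
        super().__init__()
        self.location = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.location is not None:
            return
        attr = {name.lower(): (value or "") for name, value in attrs}
        if attr.get("http-equiv", "").lower() == "x-xrds-location":
            self.location = attr.get("content") or None


def find_xrds_location(uri, headers, body):
    """Return the URI of the XRDS document for `uri`, or None if not advertised.

    The check order (content negotiation, then header, then <meta>) is an
    assumption of this sketch, not something taken from the specs.
    """
    lowered = {name.lower(): value for name, value in headers.items()}

    # Content negotiation shortcut: the response itself is the XRDS document.
    media_type = lowered.get("content-type", "").split(";")[0].strip().lower()
    if media_type == "application/xrds+xml":
        return uri

    # External to the representation: the X-XRDS-Location response header.
    if "x-xrds-location" in lowered:
        return lowered["x-xrds-location"].strip()

    # Within the representation: the HTML <meta> element.
    finder = _MetaFinder()
    finder.feed(body)
    return finder.location
```

A caller would fetch the resource with whatever HTTP client it prefers, pass the response headers and body to `find_xrds_location`, and then fetch the returned location to obtain the XRDS document.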

Once the XRDS location (and document itself) is obtained, the client performs Service Endpoint Selection (SEP). It parses the document looking for <Service> elements with some desired or recognized <Type> element values. This is all explained in detail in the XRDS-Simple draft (section 5.1).
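Service endpoint selection can be sketched along the same lines. The snippet below is a simplified illustration under several assumptions: the sample document and its example.com type and endpoint URIs are made up, the namespace strings are written as I recall them from XRI Resolution 2.0, only the last XRD element is considered, and a lower `priority` value is treated as more preferred with a missing value sorting last. The XRDS-Simple draft remains the authority on the actual rules.

```python
import xml.etree.ElementTree as ET

# Namespace of the XRD elements (assumed to match XRI Resolution 2.0).
XRD_NS = "xri://$xrd*($v*2.0)"

# Hypothetical document; every URI in it is a placeholder.
SAMPLE_XRDS = """
<xrds:XRDS xmlns:xrds="xri://$xrds" xmlns="xri://$xrd*($v*2.0)">
  <XRD>
    <Service priority="20">
      <Type>http://example.com/type/other</Type>
      <URI>http://other.example.com/endpoint</URI>
    </Service>
    <Service>
      <Type>http://example.com/type/signon</Type>
      <URI>http://backup.example.com/signon</URI>
    </Service>
    <Service priority="10">
      <Type>http://example.com/type/signon</Type>
      <URI>http://provider.example.com/signon</URI>
    </Service>
  </XRD>
</xrds:XRDS>
""".strip()


def _children(parent, local_name):
    """Direct children of `parent` with the given name in the XRD namespace."""
    tag = "{%s}%s" % (XRD_NS, local_name)
    return [child for child in parent if child.tag == tag]


def _priority(element):
    """Numeric priority; elements without a usable value sort last."""
    value = element.get("priority")
    return int(value) if value is not None and value.isdigit() else float("inf")


def select_endpoints(xrds_xml, wanted_type):
    """Return endpoint URIs of services declaring `wanted_type`, best first."""
    root = ET.fromstring(xrds_xml)
    xrds = _children(root, "XRD")
    if not xrds:
        return []
    matching = []
    for service in _children(xrds[-1], "Service"):
        types = [(t.text or "").strip() for t in _children(service, "Type")]
        if wanted_type in types:
            matching.append(service)
    matching.sort(key=_priority)
    uris = []
    for service in matching:
        for uri in sorted(_children(service, "URI"), key=_priority):
            uris.append((uri.text or "").strip())
    return uris
```

Calling `select_endpoints(SAMPLE_XRDS, "http://example.com/type/signon")` returns the provider endpoint followed by the backup endpoint, while an unrecognized type returns an empty list.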

For example, OpenID 2.0 uses the following service type as its identifier:

When a user signs into a website using OpenID, the user enters his OpenID identifier into the login box. That identifier is a URI:

The website performs discovery on the URI as described above, by making an HTTP GET request to the OpenID URI:

GET /joe HTTP/1.1

To which the server replies with:

HTTP/1.1 200 OK
Content-Type: text/html; charset=UTF-8

<!DOCTYPE html>
        <title>Joe's Blog</title>
        <meta http-equiv="X-XRDS-Location"
              content="..." />
        <p>Welcome to my blog!</p>

In this example, the server provides the location of the XRDS document both in the HTTP header using the X-XRDS-Location header and in the HTML representation using the <META> element. By making an HTTP GET request to the provided location:

GET /joe/xrds HTTP/1.1
Accept: application/xrds+xml

The site gets the following XRDS document:


Which includes information about two associated services, an OpenID 2.0 sign-on provider, and some other service. The site looks for the OpenID type identifier which points it to the URI of the OpenID provider. From there the site is able to proceed with the rest of the OpenID flow. Success.

So, What’s Broken?

The main issues with the first part of the discovery process, obtaining the XRDS document, are:

  • XRDS uses a custom HTTP header (X-XRDS-Location) that is specific to XRDS. It does not scale to other formats, nor does it allow for an easy way to provide multiple XRDS documents for a single resource. It also uses the experimental X- prefix which is not supposed to “leave the lab”. When it was created, there wasn’t any good alternative but today we have a new standard (pending) HTTP header called Link which does everything the custom header does and more.
  • XRDS requires adding either a header or body element, but does not take into account that, for the most part, discovery requests are a tiny fraction of the requests for the resource. Think about sites like Yahoo! and Google, where requests for the home page almost never need this extra information, and while the header or element is pretty small, it still adds significant waste.
  • The alternative mechanism offered to directly retrieve the XRDS document using content negotiation is somewhat questionable. First, it is a stretch to define an XRDS document as just another representation of the same resource. Data and metadata are two separate resources. But there are also significant deployment issues with this approach, as well as limitations (you can’t provide meta-metadata, or use the format for other purposes on its own).
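To make the first point above more concrete, here is a rough sketch of how a client might read descriptor locations from a Link header, including the case of several documents for one resource. Everything specific in it is an assumption made for illustration: the `describedby` relation type, the example.com URLs, and the simplified parsing, which assumes no commas or semicolons appear inside quoted parameter values.

```python
import re

# One link-value: a <target> followed by its parameters, up to the next comma.
_LINK_VALUE = re.compile(r"<([^>]*)>([^,]*)")


def parse_link_header(value):
    """Split a Link header value into (target, params) pairs."""
    links = []
    for target, raw_params in _LINK_VALUE.findall(value):
        params = {}
        for part in raw_params.split(";"):
            if "=" in part:
                name, _, val = part.partition("=")
                params[name.strip().lower()] = val.strip().strip('"')
        links.append((target, params))
    return links


def descriptor_locations(value, rel="describedby"):
    """Return every link target whose rel parameter includes `rel`."""
    return [
        target
        for target, params in parse_link_header(value)
        if rel in params.get("rel", "").split()
    ]
```

Given a header value that lists two `describedby` links and one `stylesheet` link, `descriptor_locations` returns only the two descriptor URLs, in the order they appear. That is something the single-valued X-XRDS-Location header has no easy way to express.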

A more detailed analysis of requirements for such a process and the possible solutions is the focus of Discovery and HTTP.

The second part, the XRDS document format, is where the more problematic issues are:

  • XRDS is complicated. Really complicated. It has elements such as <Ref> and <Redirect> which are hard to understand, not to mention implement, and both carry security-related consequences (when ignored). XRDS-Simple was an attempt to profile XRDS and limit it to a smaller subset, but the basic framework is just too complex and offers richness that isn’t really necessary. Yadis made things worse by giving some of the XRDS elements a different or incomplete semantic meaning. At this point, most (if not all) OpenID implementations will break if someone used any of the XRDS elements not directly mentioned by Yadis.
  • XRDS included the <Service> element to describe related services and resources, but did not provide direct means to describe the resource itself. Some use cases used the <Service> element with some special type (or lack of) to provide information about the resource itself but that design proved to confuse most people.
  • If you examine the OpenID example above, it is the end user who is declaring the version of OpenID the provider is supporting. If the provider added support for a new version, the site will never find out about it until the user manually updates his XRDS document. This gets much worse when it comes to extensions. The user will have to constantly keep track of the extensions supported by the provider and add them to the XRDS document. And of course, the user can make mistakes, or simply fail to update when a version or extension is no longer supported. The site will simply try and fail.

When trying to apply the current design of XRDS to OAuth Discovery, I could not come up with a consistent model that made sense. XRDS was used to bundle together a bunch of endpoints and configuration but has failed to actually describe a resource (which is what it was meant to do). It uses multiple XRD elements within a single XRDS, which is costly to parse and requires adding other mechanisms such as XML ids. It ain’t pretty.

In other words, the fact that explaining XRDS proved to be so tricky, and that even the people intimately familiar with it have been struggling to apply it to new (but very common) use cases, was by itself reason enough to reconsider.

What’s Next?

The world has evolved.

We now have much better tools to build a discovery framework, one that is consistent across multiple protocols and communities. It is taking so long to simply correct the above issues because I refuse to do that in a vacuum. The world outside our little community is overwhelmingly bigger and there is plenty of knowledge and experience to apply.

The thing about discovery is that any service can (and usually does) create one just for its own needs. Building a uniform framework can offer significant benefits, but if we don’t do it in an open and inclusive way, people will continue to make up their own solutions.

But we are getting close. Over the past few months I have been bringing together people from multiple communities, organizations, and specialty areas. I have been talking to individuals and companies about their use cases and requirements, and things are quickly converging into a uniform discovery framework.

More on that coming next.