Summary

BYU is designing an API that reflects the true business processes of the university and its fundamental resources. This post describes why and how we're doing that.

BYU Logo Big

BYU has been in the API and Web Services game for a long time. Kelly Flanagan, BYU's CIO, started promoting the idea to his team almost 10 years ago. The result? BYU has over 900 services in its Web Services registry. Some are small and some are big, but almost everything has a Web service of some kind.

Of course, this is both good news and bad news. Having services for everything is great. But a lot of them are quite tightly coupled to the underlying backend system that gives rise to them. On top of that, the same entity, say a student, will have different identifiers depending on which service you use. A developer writing an application that touches students will have to deal with multiple URL formats, identifiers, data formats, and error messages.

We're aiming to fix that by designing and implementing a University API. The idea is simple: identify the fundamental resources that make up the business of the university and design a single, consistent API around them. A facade layer will sit between the University API and the underlying systems to do the translation and deal with format, identifier, and other issues.

The name "API" reflects an important shift in how we view providing services. When you're providing a service, it's easy to fall into the trap of collecting an ad hoc mish-mash of service endpoints and thinking you're done. The "I" in API is for "interface." When you're providing an interface to the university, not just a collection of services, your mindset shifts. Specifically, in designing the University API we're aiming for something with the following properties:

  • Business-oriented—the API should be understandable to people who understand how a university works without having to understand anything about the underlying implementation. Many more people know how a university works than could ever know about the underlying implementation. An API based on resources familiar to anyone who understands a University makes the API useful even to non-programers.
  • Consistent—a developer should see a consistent pattern in URL formats, identifiers, data formats, and error messages. Consistency allows developers to anticipate how the API will work, even when they're working with a new resource.
  • Completeness—over time the University API ought to be an interface to every thing at the University that works via software (which is to say everything).
  • Obvious—using the API should be obvious to anyone who understands the general principles without needing to rely excessively on documentation.
  • Discoverable—a program should be able to discover allowed state transitions, query parameters, and so on to the extent possible.
  • Long-lived—An API is like a programming language in that it is a notation, not a technology. The goal is to create something that is not only intuitive, but stands the test of time. Designing for long-term use is more difficult than designing for short-term efficiency

The fundamental business of the university doesn't change rapidly. BYU has had students, classes, and instructors for 140 years. Likely, instructors will still be teaching classes to students in 20 years. The API to a university ought to reflect that stability. This doesn't mean it won't change, but ideally the University API will evolve over a period of decades in the same way a language does. Perl 5 is quite different and much more useful than Perl 2, for example, but it's still Perl. This gives the University API an importance that an ad hoc collection of services would be hard pressed to meet.

Building a University API has multiple advantages:

  1. First, and most obviously, making the API consistent and understandable will make it easier for developers to use it in building applications. This includes developers in the Office of Information Technology, BYU's central IT department, as well as developers in other units around campus. Further, there's no reason that students and others shouldn't be able to use the APIs, where authorized (see below), to create new services and GUIs on top of the standard university systems. The University API is the heart of a great university developer program.
  2. Beyond making it easy for developers, a consistent University API eases the pain of changing out underlying systems by introducing a layer of indirection. Once a University API is in place, underlying parts of the system can be changed out and the facade layer adjusted so that the API presented to developers doesn't change, or, more likely, only changes in response to new features.
  3. A third advantage of the University API is that it provides a single place to apply authorization policies. This is a huge advantage because it allows us to apply formal, specified policy parametrically to the API rather than doing it ad hoc. This results in more consistent and accurate data protection.
  4. Finally, a University API serves as a definition for the business of the University that befits the reality that more and more of the university's business is controlled and mediated via software systems. By designing a notation and semantics that matches what people believe the University to be, we document how the University's business is conducted.

How do you get started on such a monumental undertaking? We've created a University API team that is busy discussing, designing, and mocking-up APIs for a small set of interrelated, core resources. We hope to have mock-docs for review in the next few weeks. For now, we're focused on the following set:

/students
/instructors
/courses
/classes
/locations

Others that will eventually need to be considered include /colleges, /departments, /programs, and so on. There could easily be dozens of top-level resources in a university, but that's much more manageable than 900. And when they're logical and consistent that's especially true. For now we've identified five resources that form the core of the API and touch on activities that the university cares about most.

As you'd expect a GET on any of these resources returns a collection of all the members of that resources.

GET /students

Obviously some of these collections could be very large, so they will usually be filtered and paginated. Take /students for example. By rights, this resource should include not only all the current full-time students, but part-time students, independent study students, and so on—easily tens of thousands of records. Being able to filter this list so that it's the collection of students you want (e.g. all full-time students in the College of Engineering) will be critical.

Performing a GET on a resource with an identifier in the path returns the record for that identifier.

GET /students/:id

Again, the result could be very large. In theory, a student record contains everything the university knows about the student. In his white paper on University APIs, Kin Lane listed 11 types of data that might be in a student resource without even getting to things like transcripts, grades, applications, and so on. There are dozens of sub-resources inside a resource as complicated as /students. In practice, most programmers don't want (and aren't authorized to get) all the data about a particular student. We're attacking this problem by creating useful field sets for the most popular data for any given resource.

We're also dealing with issues such as the following:

  • What are the meta values for the API (e.g. what values are appropriate for a given field) and how should we represent them in the API?
  • How do we handle sub-resources? For example, a class has set of values for prerequisites that is a complex record in it's own right. But prerequisites doesn't deserve to be a top-level resource because it's only meaningful in the context of a course.
  • Many of the identifiers for a resource (like a class) are aggregate identifiers. In the case of a course the identifier is made from the term, department, course number, and section.
  • What is the right boundary between workflow and user interface. For example, when a student drops the class and that has cascading consequences, should the client or server be responsible for ensuring the student understands those consequences?
  • How deep do we go when returning a resource? For example, when we get a class enrollment, do we return links to the student records or the data about the students? If the latter, what does that communicate to developers about what can and can't be updated in the record?

There are new issues that come up all the time. We're still thinking, designing, and planning, so if you have suggestions, we'd love to hear them.

The effort has been fun and we're anxious to make at least part of this design exercise real. Watch for projects that tackle parts of this enterprise over the coming months.


Further reading: