Health care is largely an information business. Physical treatment – medicine, surgery, therapy, the tools used to deliver it – is the tip of the spear. But it is surrounded by, informed by, and enabled by the flow of information.
These days we manage all that information with software:
Software is complex: it is expensive to buy, difficult to build, and challenging to maintain.
Software is undependable: software fails, servers fail, networks fail.
Software never seems to be done: we need it to do more, and we need that ‘more’ delivered more and more quickly.
The history of business software is largely the story of harnessing technological innovation to deliver better performing, more reliable, easier to maintain software, and to deliver it more quickly and cheaply.
APIs and microservices are simply the latest evolutionary features of modern software architecture addressing those same goals. They enable us to better manage complexity and increase reliability. We gain faster time to market for new and improved solutions, lower delivery and sustainment costs, and improved responsiveness and availability.
Because health care is largely an information business, you have more need than most to understand these new approaches. And the laundry lists of microservice benefits and cartoon diagrams of APIs you are usually provided won’t get it done.
APIs and microservices can only really be understood in a historical context. Here it comes.
A Brief History of APIs and Microservices
A software program – an application – used to be written by one person at a time, like craftsmen working before the industrial revolution.
Then we figured out how to ‘link’ together a software application from parts written by different programmers, so we could have many developers working in parallel delivering applications faster.
And a market sprang up for ‘libraries’ of such reusable software parts. The libraries became – and still are – part of the fabric of software development. The very languages we programmed in, such as COBOL and C and Pascal, became dependent on their own supporting libraries.
Such reuse enabled specialization in the software labor market. You did not have to know how to write a statistics program to include statistical capabilities in your application – you just bought a statistics library.
But it was very challenging to make a program work when things did not go smoothly. ‘Debugging’ a program – finding and fixing errors in it – hit a wall in purchased library code. You did not have the actual program, the source code; you only had the executable form of the code, the form the computer understands, translated from source code into machine language.[1] So you could not read it and understand how it worked. You were dependent on the software vendor to provide support to help you debug your application. This was expensive and time-consuming, so debugging became in part the black art of inferring how library code worked from its behavior.
And applications were still resource constrained. An application could only use the processing power and memory – where the application’s compiled source code and data reside while it is running – on the single computer on which it was executing.[2]
Networking
Then we started networking computers together to exchange files.
And we realized that those same networks could be used to enable an application running on one computer to use a program running on another computer.
To do this they needed an interface, just as a human needs a user interface to use software. So we created ‘Application Programming Interfaces,’ or ‘APIs,’ to enable a software application running on one computer to work with – to ‘call’ – an application running on another computer.
A computer whose running software has no user interfaces, but only APIs, we call a ‘server’.
And a market sprang up for server-based applications with APIs. A database is just a special kind of software application. Databases such as Oracle were the first widely successful server-based applications.
The model we used for APIs was essentially the one we had learned with those reusable programming libraries.
We created special-purpose communication standards – protocols – for different kinds of applications such as those databases. A local application – a ‘client’ application – could connect to a database application running on a server via protocols called ‘ODBC’ or ‘JDBC’ or ‘OLE-DB’.
We called this ‘client-server’. Using these databases and similar applications required installing and configuring low-level software called ‘drivers’, a model borrowed from how software uses printers and teletypes and modems.
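As a concrete sketch of the client-server pattern, here is a small Python program on a ‘client’ machine using a database on another computer through an ODBC driver. The pyodbc library is one common bridge; the driver name, server, credentials, and table are all placeholders, not a real configuration.

```python
# A local 'client' calling a database 'server' across the network through an
# installed ODBC driver. Every value in the connection string is a placeholder.
import pyodbc  # a common Python bridge to ODBC drivers

connection = pyodbc.connect(
    "DRIVER={Some Database Driver};"   # the locally installed driver
    "SERVER=db.example.com;"           # the database server on the network
    "UID=app_user;PWD=secret"          # credentials for the database
)
cursor = connection.cursor()
cursor.execute("SELECT claim_id, status FROM claims WHERE status = 'OPEN'")
for row in cursor.fetchall():
    print(row.claim_id, row.status)
connection.close()
```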
We created more generic standards – elaborate, complex software protocols called ‘CORBA’ and ‘DCOM’ and ‘RMI’ – to create the illusion that a program running on another computer was just another code module linked into our local program.
But this was all very difficult. Not only were there myriad protocols, they were arcane. Legend has it that at one time there was only one guy, Don Box, who fully understood DCOM. And the custom network security setup that had to be done to support them was challenging. There were lots of related problems.
Business Logic
The model of reuse we had learned from linkable libraries of programming components was failing to provide the broad reusability we were looking for. As we started to distribute our applications across multiple servers, with the logic – the business calculations and communication decisions – on a separate server, we started to move toward building reusable chunks of business logic instead of reusable chunks of programming logic.
We also realized, given the new distributed model for software, with parts of it running on different computers, we could improve the performance and reliability of applications without improving the quality of the software, servers, or network.
Instead of having an application use business logic on a single server, we installed the business logic on multiple servers. We introduced a new kind of software, called a ‘load balancer’, whose job was to route each call from an application to one of those business logic servers.
Not only did that increase the memory and computing power available to our application, it improved the reliability of the overall solution.
Here is how that works. Say you have a distributed application consisting of three parts, the user interface, the business logic, and the database, running on three different computers.
‘Availability’ is the measure of how much of the time an application is actually usable out of the time it is supposed to be. So if an application experiences outages an average of 3 hours a week but is supposed to be available for use 40 hours a week, its availability is 92.5%.[3] Availability measures reliability of the software, the server or servers it runs on, and the integration among them.
In a distributed application, if the user interface is available 99% of the time, and the database 99% of the time, but the business logic only 80% of the time, the overall availability is only 78.4%.[4]
We could spend the time and money to design, develop and test improvements to the business logic to raise the overall availability.
Or we could install three business logic servers with identical copies of the business logic application, and ‘load balance’ the application across them. If one of them fails, the other two are still available for use. The percentage of the time all three would be unavailable becomes very small. Our overall availability is raised to 97.2% without any changes to the software itself.[5]
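Here is the arithmetic from footnotes [4] and [5] as a small worked sketch, using the same numbers as the example above:

```python
# Availability arithmetic for the three-tier example above.
# Components in series multiply; redundant copies of a tier fail together
# only when every copy is down at the same time.

ui, logic, db = 0.99, 0.80, 0.99      # availability of each tier

# One business logic server: every tier must be up at once.
single = ui * logic * db
print(f"single logic server: {single:.1%}")      # ~78.4%

# Three load-balanced copies of the business logic tier.
clustered_logic = 1 - (1 - logic) ** 3           # down only if all three are down
redundant = ui * clustered_logic * db
print(f"three logic servers: {redundant:.1%}")   # ~97.2%
```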
It was – and is – frequently cheaper and faster to introduce redundancy than to improve an application. These redundant sets of servers – called ‘farms’ or ‘clusters’ – became commonplace.
But clusters brought their own challenge. If an application was working with a business logic server and that server failed, the ‘conversation’ they were having was lost. We call a server’s memory of its current conversation its ‘state’.
So we built elaborate mechanisms to share state information among all the servers in a cluster. These were arcane, complex, difficult to configure and maintain.
This drove us to start designing so-called ‘stateless’ servers – business logic servers where each individual use by an application is distinct and unrelated to any other use. That way we did not have any state we needed to share across a cluster of servers.
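As a minimal sketch of what ‘stateless’ means in practice, here is a made-up claim-pricing function: every request carries everything the business logic needs, so any server behind the load balancer can answer any call, and nothing is lost if one server dies. All names and fields are illustrative.

```python
# Stateless business logic: each call is complete in itself, and the server
# keeps no memory between calls. Any server in the cluster can handle any request.

def price_claim(request: dict) -> dict:
    """Handle one self-contained request; nothing survives between calls."""
    allowed = request["billed_amount"] * request["contract_rate"]
    return {"claim_id": request["claim_id"], "allowed_amount": round(allowed, 2)}

# Two independent calls, possibly served by two different servers.
print(price_claim({"claim_id": "C-1", "billed_amount": 200.0, "contract_rate": 0.8}))
print(price_claim({"claim_id": "C-2", "billed_amount": 90.0, "contract_rate": 0.8}))
```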
Web Services
Then the Internet, driven by the success of the World Wide Web, happened. At first we used it only to exchange files and create and read pages of information.
Then we realized that, just like the original local networks, we could use it to enable applications running on a local computer to call and use applications – applications developed by and hosted by other companies – running on other computers.
All the network plumbing to support the Internet and the World Wide Web was already in place.
So we created a new standard protocol, a new standardized way of linking distant applications together, allowing one application to call and use another using standard Internet plumbing.
We called the new approach Web Services. XML, a more rigorous relative of HTML, the standard web page format, emerged as the standard for data used by Web Services.
And, confusingly, we started using ‘API’ more and more not to refer generically to the interfaces by which one application can call and use another, but more specifically to a Web Service formally managed for use by applications owned by others, whether trading partners or the public at large.
And a market for APIs sprang up – their adoption has been growing an average of over 30% a year for the last decade.[6]
Open Source
The Internet also enabled software developers to work together across company, state and country boundaries. A new software movement arose from their collaborations called ‘open source.’ Loosely governed communities of software developers worked together to create new software – reusable code libraries, applications, even operating systems. The software – including its source code – was made freely available over the Internet. Instead of the ones or tens of developers at a single company delivering software, hundreds or thousands of developers would contribute according to their interest, skills and availability. Having that many developers contributing to and using the software made all bugs shallow.[7]
Open source programs have become mainstream. Linux, an open source operating system, is used in over half of all US companies, and in 99% of the world’s supercomputers. If you are studying computer science in any college in the world today, you are using Linux. The open source Apache web server is the world’s most widely used.
Web services have co-evolved with open source. You can’t do services without using a significant number of open source components.
Service Oriented Architecture
Services were so successful that enterprises started using them – and the application design of a web-browser-based user interface calling services – for their internal applications as well.
And Service Oriented Architecture or SOA was born. Business functionality would be built as reusable services.
Enterprise architects designed elaborate SOA cathedrals. We created master data models for the entire enterprise called ‘Enterprise Canonical Models’ so our services would all be interoperable. And we wired together services with data superhighways called ‘Enterprise Service Buses’ or ESBs so we could readily configure new arrangements of them.
And at the same time the original Web Service protocol, SOAP, exploded in complexity as it was extended to address all the needs and challenges of SOA. It became onerous to use.
And the widespread use of XML, the web service data standard, was also problematic, because working with it in a large-scale way demanded a different skillset than most developers or analysts had.[8]
A new way of doing web services emerged as a backlash. Called REST[9], it was more a style than a specification. It was purposefully simple. Instead of creating an elaborate scheme for remote procedure calls, it used the same mechanism that was used to read and write Web Pages. And instead of XML, with all its woes, RESTafarians adopted a simplified data format called JSON.
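As a small sketch of the REST style in Python (the endpoints and field names are fictional; the widely used requests library stands in for any HTTP client):

```python
# REST in miniature: plain HTTP verbs against URLs, JSON in and out.
import requests

# Read a resource with GET, the same way a browser reads a web page.
response = requests.get("https://api.example.com/members/12345")
member = response.json()                  # the body is JSON, not XML
print(member["lastName"])

# Create a resource with POST.
new_claim = {"memberId": "12345", "billedAmount": 200.0}
created = requests.post("https://api.example.com/claims", json=new_claim)
print(created.status_code, created.json()["claimId"])
```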
The REST style is widely adopted at this point. When we talk about APIs these days, we generally mean a set of managed REST services with a common purpose managed as a unit, such as the Google Maps API.
And Service Oriented Architecture? Its heart was in the right place. But in the real world of legacy systems and tightening budgets, the big cathedral approach done with SOA frequently failed – as such approaches often do.
Microservices
From those hard-won lessons we are now figuring out how to make SOA work in the real world.
In the real world of the health care back office, there are usually a number of legacy third party applications we need to integrate.
And many of those applications are still integrated by batch files, not the discrete near-real-time transactions enabled by services.
We have learned a lot about how to integrate these ‘monolithic’ applications into an overall service-based approach. Just as we might introduce a component to translate one file format into another, we can create components to convert a batch file delivered daily into a stream of events, and a stream of events delivered near-real-time into a batch file delivered once daily.
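As a sketch, an adapter that turns a daily batch file into a stream of events might look something like this; the file layout, field names, and publish() stand-in are all illustrative:

```python
# Convert a nightly batch file into discrete events, one per row.
import csv
import json

def publish(topic: str, event: dict) -> None:
    # Stand-in for whatever messaging system is in use (a queue, Kafka, etc.).
    print(topic, json.dumps(event))

def batch_file_to_events(path: str) -> None:
    """Read the nightly eligibility file and emit one event per row."""
    with open(path, newline="") as handle:
        for row in csv.DictReader(handle):
            publish("eligibility-changed", {
                "memberId": row["member_id"],
                "effectiveDate": row["effective_date"],
                "status": row["status"],
            })

# batch_file_to_events("eligibility_20240101.csv")
```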
We learned that using elaborate Enterprise Service Buses to make our services interoperable was overkill – we did not realize enough value from their overhead. ‘ESB’ started to stand for ‘Enterprise Spaghetti Box.’ What works better is keeping the ‘pipes’ simple, and keeping the logic in the services themselves.
It was even more important than we originally thought to keep services stateless – to have them keep no memory of their current conversation with an application, and instead to have every conversation consist of single, brief sentences.
But state must live somewhere. We learned that creating master central ‘orchestrators’ – ‘Business Process Engines’ – which centrally managed the state of a transaction, calling services as needed to work on it, resulted in a high level of complexity and maintenance difficulty. What works better is a simple stupid approach called ‘choreography’, where each service only knows two things: what it needs to do, and where the transaction needs to go next. ‘Dumb pipes and smart endpoints’ became the new mantra.
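A toy sketch of choreography, with made-up claim-processing services: each service does its one job and then publishes an event saying where the transaction goes next, and no central engine tracks the whole flow.

```python
# Choreography: each service knows only its own job and which event to emit next.

def publish(event_type: str, payload: dict) -> None:
    print("->", event_type, payload)       # stand-in for a real message broker

def intake_service(claim: dict) -> None:
    claim["validated"] = True              # do my one job...
    publish("claim-validated", claim)      # ...then hand the transaction on

def pricing_service(claim: dict) -> None:
    claim["allowed_amount"] = claim["billed_amount"] * 0.8
    publish("claim-priced", claim)

def payment_service(claim: dict) -> None:
    publish("claim-paid", {"claim_id": claim["claim_id"],
                           "amount": claim["allowed_amount"]})

# In production each service would react to events from the broker;
# here the calls are simply chained to show the flow.
claim = {"claim_id": "C-1", "billed_amount": 200.0}
intake_service(claim)
pricing_service(claim)
payment_service(claim)
```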
We have also learned that having a set of related services in front of a single large monolithic database does not work well in practice. Such a database is rarely a perfect fit for the needs of a particular service – some services need to read and write single transactions at high speed, hundreds of times per second, others need to read very large analytics datasets twice a week. It is difficult to make a single database design and implementation meet such varied needs. There are also challenges from coupling services through a common database. Many changes to the database force you to test all the services that front it. And uses of the database by applications other than through the services can make the services behave inconsistently.
Having a handful of large business services, such as ‘Claims’, that have lots of ways you can use them, also does not work well. Large, complex services with lots of options are difficult to maintain and challenging to version. Changes to them also force widespread testing.
We are moving instead toward the fundamental approach – there are always exceptions – of giving each service its own database, and not letting anything else use that database except the service. The cost is some redundancy in the data. But those who have adopted the approach are finding it to be worth it.
And we are moving toward making services do a single specific thing well, instead of being kitchen sinks of business functionality.
We call these dedicated, specific services with their own private data ‘microservices.’
The complexity must live somewhere, of course. One challenge with microservices is the sheer number of them we will have. While we may have had tens of services under SOA, we will have hundreds under a Microservices architecture.
We can manage them because the cloud has brought us standardized, configurable automation for managing the deployment and integration of microservices. The microservice approach has co-evolved with cloud-driven automation.
Another challenge is that the aggregation of functionality has to happen somewhere. We can approach this initially by having each application do it, breaking out ‘aggregate’ services – higher level services which call microservices – where useful. We may find the need for additional scaffolding of some kind to support aggregation.
Microservices also tend to have redundant data that we must keep consistent. We do this by sourcing them all from a common stream of changes – ‘events’ – from the source systems. This consistency of data across services and the applications that use them does not happen instantly – it usually happens within seconds, but can sometimes take longer. We call this ‘eventual consistency.’
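A minimal sketch of that pattern, with made-up services: each keeps its own private store, both update from the same stream of change events, and the stores converge shortly after each event arrives.

```python
# Eventual consistency: two services, each with its own private store,
# fed from the same stream of change events.

events = [
    {"type": "member-address-changed", "memberId": "M1", "zip": "30301"},
    {"type": "member-address-changed", "memberId": "M1", "zip": "30309"},
]

class BillingService:
    def __init__(self):
        self.store = {}                      # this service's private database

    def on_event(self, event: dict) -> None:
        if event["type"] == "member-address-changed":
            self.store[event["memberId"]] = event["zip"]

class MailingService:
    def __init__(self):
        self.store = {}                      # a separate private database

    def on_event(self, event: dict) -> None:
        if event["type"] == "member-address-changed":
            self.store[event["memberId"]] = event["zip"]

billing, mailing = BillingService(), MailingService()
for event in events:                         # both services consume the same stream
    billing.on_event(event)
    mailing.on_event(event)

print(billing.store == mailing.store)        # True once both have caught up
```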
Using an eventual consistency model means writing applications that can adapt to after-the-fact changes. You may have seen this shopping on Amazon when you get a message that an item in your shopping cart has changed price – the cart is eventually made consistent with ongoing pricing changes.
We still need interoperability. But we are no longer trying to create a single master data model, an ‘Enterprise Canonical Model’, for the enterprise.
Instead, we are aligning our development teams and microservice domains to business capabilities. Each aligned unit will own its own data. We will still have the challenge of integrating data across business capabilities, but the scope of that is smaller and more manageable than in an enterprise canonical approach.
We are making the services available for reuse by managing them as API products. The use of API gateways enables us to apply consistent security policies and to monitor the use of our APIs. As we mature in our use of APIs we will all find having dedicated API product owners and technical leads a useful extension of our organizational structures.
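For flavor, here is a toy sketch of what a gateway does on every call – apply a consistent security policy, record the usage, and only then hand the request to the service behind it. Real gateways are configured products rather than hand-written code, and every name below is made up.

```python
# A toy API gateway: check credentials, log usage, route to a backend service.

VALID_API_KEYS = {"partner-key-123"}
ROUTES = {"/members": "member-service", "/claims": "claim-service"}

def gateway(path: str, api_key: str, request: dict) -> dict:
    if api_key not in VALID_API_KEYS:               # consistent security policy
        return {"status": 401, "error": "invalid API key"}
    print(f"audit: {api_key} called {path}")        # consistent usage monitoring
    backend = ROUTES.get(path)
    if backend is None:
        return {"status": 404, "error": "unknown API"}
    return {"status": 200, "handledBy": backend, "echo": request}

print(gateway("/claims", "partner-key-123", {"claimId": "C-1"}))
```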
The Bazaar
Software is still evolving – faster than ever. But we have learned from the lessons of the past. We are not getting locked in to a multi-year cathedral building effort. Instead, using APIs, Microservices and Events we are creating a virtual bazaar in the Cloud in which solutions written in different languages using different technologies will come and go as needed over time without interrupting the overall flow of healthcare information.[10]
Stay tuned.
[1] An executable program file is simply a machine language file – an ‘object file’ with a standard ‘header’ – the first part of the file – that tells the operating system, usually Windows or OS X or Linux, how to load the application into memory.
[2] We worked around the limits by developing elaborate mechanisms to swap parts of memory out to disk and back (called ‘paging’), making it seem like we had more memory than we actually had available. And we created other mechanisms that worked at a higher level to load and unload parts of an application while the apps were running – to ‘link’ the parts together on the fly. In Windows we called these parts ‘Dynamic Linked Libraries’ or DLLs, and they were hell to use and made programs break. A lot.
[3] 40 hours desired minus 3 hours unavailable divided by 40 hours desired = 0.925 = 92.5%.
[4] The availability is calculated as (0.99 * 0.80 * 0.99) = 78.4%.
[5] The availability given three servers in the ‘middle’ business logic tier is calculated as (0.99 * ( 1-((1-0.80)^3) ) * 0.99) = 97.2%.
[6] https://www.programmableweb.com/news/programmableweb-api-directory-eclipses-17000-api-economy-continues-surge/research/2017/03/13
[7] ‘Linus’s Law’, coined by Eric S Raymond.
[8] Working with XML and its related tools demands a ‘functional’ approach, not a ‘procedural’ one. Unless an engineer had experience with a functional language such as Lisp or Haskell, they would frequently struggle with XML tools such as XSLT.
[9] REST = Representational State Transfer, coined by Web guru Roy Fielding in his doctoral dissertation.
[10] An homage to legendary hacker Eric S Raymond whose 1996 essay ‘The Cathedral and the Bazaar’ helped usher in the modern approach to software.