This is the sixth in a series of posts (first, next, previous) in which I am exploring five key technology themes which will shape our work in the coming decade:
- The Emergence of the Individual Narrative;
- The Increasing Perfection of Information;
- The Primacy of Decision Contexts;
- The Realization of Rapid Solution Development;
- The Right-Sizing of Information Tools.
The Right-Sizing of Information Tools
The evolution of software applications and their integration is leading us to immerse our users in environments made up of many small information tools, each dedicated to a specific task, rather than in the large, monolithic applications that have come before.
High Coupling and Low Cohesion – the Right-Sizing Drivers
In software design we have known since the advent of object-oriented programming – Simula in the 60’s, Smalltalk in the 70’s, and widespread commercial adoption in the late 1990’s – that high ‘cohesion’ and low ‘coupling’, both terms coined by Larry Constantine (another of my heroes), are two of the highest virtues of well-designed software and two of its fundamental architectural objectives.
Cohesion is the extent to which a component does one and only one thing well – the extent to which it is dedicated to a single, pure purpose.
Coupling is the extent to which a component or module is dependent on other components[1].
The chief disadvantages of low cohesion are these: as a component grows and changes over time, the lack of a single common purpose results in a kind of urban sprawl that makes the code base increasingly difficult and expensive to modify and test; and the component’s many different input and output needs tend to make its integration into the rest of the infrastructure a rat’s nest of connections of different kinds. As we try to simplify and clean that up, we tend to increase coupling across the integration components, which as we will see next is a Bad Thing.
Coupling has two fundamental disadvantages, one during development and maintenance, and the other at ‘runtime’ when the component is executing.
During development and maintenance, changes tend to ‘cascade’ over tightly-coupled connections, causing one or frequently more other components to have to be modified as well, increasing the development, testing and release burden.
And if development-time coupling has been missed – and it can easily be missed – sometimes you don’t find out there is an error until runtime, when something that used to work no longer works, you don’t know why, and finding the error means digging into the internals of components other than the one you were working on.
Also at runtime, tight, pathological coupling means that a given component is critically dependent on the internal behavior of other components. When those internals shift, it can be very difficult even to know that an error has occurred.
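To make the distinction concrete, here is a minimal TypeScript sketch (the claim-store and report names are invented for illustration, not drawn from any system discussed here). The first version reaches directly into another component’s internals – content coupling at its worst – while the second depends only on a small, cohesive interface, so the store can change its internals without cascading changes:

```typescript
// Tightly coupled: the report reaches into the store's internal array and
// assumes its private representation never changes (content coupling).
class TightClaimStore {
  rows: { id: string; amount: number; status: string }[] = [];
}

function tightlyCoupledTotal(store: TightClaimStore): number {
  // Any change to `rows` (renaming, indexing, lazy loading) breaks this code.
  return store.rows
    .filter((r) => r.status === "PAID")
    .reduce((sum, r) => sum + r.amount, 0);
}

// Loosely coupled: the report depends only on a small, cohesive interface.
interface PaidClaimSource {
  paidAmounts(): number[];
}

class ClaimStore implements PaidClaimSource {
  private rows: { id: string; amount: number; status: string }[] = [];

  add(id: string, amount: number, status: string): void {
    this.rows.push({ id, amount, status });
  }

  paidAmounts(): number[] {
    return this.rows.filter((r) => r.status === "PAID").map((r) => r.amount);
  }
}

function looselyCoupledTotal(source: PaidClaimSource): number {
  return source.paidAmounts().reduce((sum, a) => sum + a, 0);
}

const store = new ClaimStore();
store.add("c1", 120, "PAID");
store.add("c2", 80, "DENIED");
console.log(looselyCoupledTotal(store)); // 120
```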
This is all common knowledge among software architects and engineers. But how knowledgeable designers, working toward those goals while immersed in a changing hardware and networking environment, actually built systems explains one of the fundamental shifts in software structure that has played out over the last twenty-five years.
The Rise and Fall of the Monoliths
Many of the applications and application platforms we use every day were first created in the 90’s. We have a perception in the healthcare industry that when we retired the mainframes and their old COBOL, we rid ourselves of the really old legacy stuff. That isn’t true. Facets, for example, the leading back office claims and membership processing software used at payers, was first coded at Erisco in the early to mid 90’s.
The core of Salesforce, which we think of as a ‘modern’ application insofar as it is based in the Cloud, was largely coded in the late 1990’s.
Back in the 90’s networking was becoming ubiquitous[2].
But from a software design standpoint, the use of the network was largely limited at first to client-server – having applications running on desktop PCs all sharing a common database on a network.
We were seeing the beginnings of distributed processing in the 90’s. But the prevailing standards – CORBA on Unix and distributed COM on Windows – were arcane and difficult to use. An apocryphal story from back in the day was that there was only one guy who really understood COM, Don Box. (Box later went on to be one of the original architects of the Web service standards).
Distributed component re-use was at the procedure and function call level – logically deep and embedded in an application’s structure.
So the natural result was integration via shared database. Client-server applications individually grew larger, and to get around memory constraints their processing was divided up into separate applications – suites of related client-server applications, all using a common database, started to appear.
More and more features were added to these applications. And they grew and grew. This is starting to sound like a fairy tale, but there is not a happy ending.
As the market called for new features such as dynamic customization and workflow, functionality was piled onto the base applications. They were rarely reengineered from the ground up to accommodate the new features well – that is one of the riskiest things a software company can do – so instead there were ‘shotgun’ marriages of code bases as vendors acquired and slapped on functionality, or awkward Rube Goldbergian extensions of applications and databases.
What we ended up with are huge, sprawling, ‘monolithic’ applications whose feature lists include everything and the kitchen sink.
Facets has workflow. Salesforce has workflow. PeopleSoft has workflow. Big Software Monoliths A, B, and C have workflow. They are all ‘customizable’.
Software architects were not, of course, unaware of the twin tenets of high cohesion and low coupling. Along with other architectural factors, including improved network capabilities, those tenets led us to the so-called n-tier architecture (at first ‘3-tier’), in which there was an effective ‘separation of concerns’ between the user interface, the business logic, and the database. There were programming standards that enabled n-tier, such as Java 2 Enterprise Edition (J2EE). Driven by the emergence of design patterns for J2EE and configurable ‘application containers’ such as IBM’s WebSphere, which implemented a stack of J2EE functionality on servers, n-tier applications came to dominate the back office.
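As a toy illustration of that separation of concerns – in TypeScript rather than the J2EE of the era, and with names invented for the example – each tier knows only about the tier beneath it:

```typescript
// Data tier: knows how to persist and retrieve members (here, in memory).
class MemberRepository {
  private members = new Map<string, { id: string; name: string; active: boolean }>();

  save(member: { id: string; name: string; active: boolean }): void {
    this.members.set(member.id, member);
  }

  findById(id: string) {
    return this.members.get(id);
  }
}

// Business tier: enforces the rules; knows nothing about HTTP or HTML.
class EnrollmentService {
  constructor(private repo: MemberRepository) {}

  enroll(id: string, name: string): void {
    this.repo.save({ id, name, active: true });
  }

  isActive(id: string): boolean {
    return this.repo.findById(id)?.active ?? false;
  }
}

// Presentation tier: formats results for the user; knows nothing about storage.
function renderStatus(service: EnrollmentService, id: string): string {
  return service.isActive(id) ? `Member ${id} is active` : `Member ${id} not found`;
}

const service = new EnrollmentService(new MemberRepository());
service.enroll("M-001", "Ada");
console.log(renderStatus(service, "M-001"));
```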
As web browsers grew more powerful, and especially as new standards emerged to support interaction between the browser and the web server at a finer grain than a whole web page, the web browser emerged as the ‘presentation layer’ client of choice. Everybody had one, and you didn’t have to roll any software out to the desktop. But ultimately there was no marked functional improvement.
In general the dream of large scale reuse of components had still not been realized. There were some attempts to reuse objects at the intersection of business logic and the database, the so-called object-relational mapping (ORM) layer. But the reuse mechanisms built into ORM technologies such as TopLink were brittle and nigh on impossible to make work at scale.
Reuse of business logic in the ‘middle tier’ was possible in theory, but rarely if ever well realized.
The Emergence of APIs
But now, as I discussed at more length in my previous post on the theme of the realization (at last) of rapid application development, the nature of distributed processing has changed. Networks are larger and faster.
But more importantly, the new grain of code reuse is at the level of large business features exposed as services (and the schools of services that swim together we call ‘APIs’).
The developers of monolithic applications have paid lip service to a service-based approach by exposing functionality via the web service standards.
But to publish a host of fine-grained function calls as ‘services’ is to completely miss the point.
With well-designed services (a large part of the definition of ‘well-designed’ is that the service exhibits high cohesion and low coupling) it has become increasingly possible to coordinate discrete business behaviors to common ends.
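The contrast between the two styles looks roughly like this (a sketch only; the operations and endpoints are hypothetical, not any particular vendor’s API):

```typescript
// Fine-grained function calls dressed up as "services": the caller must know
// the internal call sequence, and each step is a separate network round trip.
interface FineGrainedClaimApi {
  openClaimRecord(): Promise<string>; // returns a claim id
  setClaimField(id: string, field: string, value: string): Promise<void>;
  runPricingStep(id: string): Promise<void>;
  commitClaimRecord(id: string): Promise<void>;
}

// A coarse-grained, cohesive business service: one request expresses one
// complete business behavior, and the internals are free to change.
interface ClaimsAdjudicationApi {
  adjudicate(request: {
    memberId: string;
    providerId: string;
    serviceLines: { code: string; charge: number }[];
  }): Promise<{ claimId: string; status: "PAID" | "DENIED"; allowedAmount: number }>;
}

// The coarse-grained client reads like the business behavior it performs.
async function payClaim(api: ClaimsAdjudicationApi) {
  const result = await api.adjudicate({
    memberId: "M-001",
    providerId: "P-042",
    serviceLines: [{ code: "99213", charge: 125 }],
  });
  console.log(`${result.claimId}: ${result.status} ($${result.allowedAmount})`);
}
```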
Applications have become smaller and lighter clients that coordinate services.
Current programming frameworks such as AngularJS and React Native, current integration capabilities, and service-oriented architectures – whether Cloud-based or not, that really doesn’t matter here – all conspire to enable what I call ‘right-sized’ solutions.
Instead of a monolithic application to meet all the needs of a business domain such as, for payers, Claims Management or Care Management, we can have a host of small solutions, each dedicated to a very specific business decision or problem. These ‘right-sized’ solutions can be designed, built, tested and rolled out in a matter of weeks, not months or years. The lifecycle of solutions will be more like that of flies than elephants: solutions will spring up as needed, stay while needed, then be swatted away.
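Here is one way such a right-sized solution might look – a minimal TypeScript sketch, with entirely hypothetical endpoints – a tiny client that answers one specific business question by coordinating two services:

```typescript
// A "right-sized" client: one small app answering one question -- is this
// member eligible, and do they have open care gaps? -- by coordinating two
// services. The host and endpoints below are hypothetical placeholders.
async function memberSnapshot(memberId: string) {
  const base = "https://api.example-payer.com"; // assumed service host

  const [eligibility, careGaps] = await Promise.all([
    fetch(`${base}/eligibility/v1/members/${memberId}`).then((r) => r.json()),
    fetch(`${base}/care-gaps/v1/members/${memberId}`).then((r) => r.json()),
  ]);

  return {
    memberId,
    eligible: eligibility.status === "ACTIVE",
    openGaps: careGaps.gaps?.length ?? 0,
  };
}

memberSnapshot("M-001").then((s) =>
  console.log(`${s.memberId}: eligible=${s.eligible}, open care gaps=${s.openGaps}`)
);
```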
To those who might argue that the user experience would be too fragmented, that users could not be expected to learn that many different discrete applications, that the training costs would be huge, I simply say this: pull your smartphone out of your pocket, count how many apps you have, then multiply that by the estimated 6 billion+ smartphone users worldwide. Now tell me how much training time was and will be involved.
This API-based future is already starting to be realized. There are more than 80 billion calls to APIs a day right now. The number of APIs is expected to exceed 300K by the end of this year[3].
In the landscape of the 2020’s, tiny, special-purpose apps working against a mature service infrastructure will be the dominant shape of software solutions. Services, managed in APIs, largely based in the Cloud, consumed by tiny clients, will provide discrete, elegant solutions to specific business needs.
Choreography and Orchestration
There are two ways those services will be coordinated in the integration space: choreography and orchestration.
In choreography, discrete autonomous transactions are passed from one component to the next. Each component ‘knows’ how to do two things: its function, such as translating or augmenting data, and how to pass the transaction to the right next component until the end is achieved. As with cellular automata, complex overall behaviors can result from simple actions.
In orchestration, a central conductor manages the transaction, calling services in parallel or in turn as they are needed to achieve the desired goal.
With rare exception choreography is preferable: it is cheaper, less complex, easier to implement, and easier to scale.
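In code, the two shapes look roughly like this (a TypeScript sketch with made-up ‘translate’ and ‘pricing’ services standing in for real components):

```typescript
type Claim = { id: string; amount: number; translated?: boolean; priced?: boolean };

// Two simple "services", each dedicated to one function.
const translateService = async (c: Claim): Promise<Claim> => ({ ...c, translated: true });
const pricingService = async (c: Claim): Promise<Claim> => ({ ...c, priced: true, amount: c.amount * 0.8 });

// Choreography: each component does its one job, then hands the transaction
// to the next component it knows about. No one sees the whole flow.
async function translateStep(c: Claim): Promise<void> {
  const out = await translateService(c);
  await pricingStep(out); // knows only "who comes next"
}
async function pricingStep(c: Claim): Promise<void> {
  const out = await pricingService(c);
  console.log("choreographed result:", out);
}

// Orchestration: a central conductor owns the flow and calls each service
// in turn, holding the transaction state itself.
async function orchestrator(c: Claim): Promise<Claim> {
  const translated = await translateService(c);
  const priced = await pricingService(translated);
  return priced;
}

translateStep({ id: "c1", amount: 100 });
orchestrator({ id: "c2", amount: 100 }).then((out) => console.log("orchestrated result:", out));
```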
The software battleground in this decade will be fought under the banners of consistency, ordering, causality – the fight to have distributed copies of the state of entity and object records be coherent in time and space.
There are two main fronts: eventual consistency, where conflict resolution is the key terrain – there are Conflict-free Replicated Data Types (CRDTs) in your future – and ‘backplane’ distributed ACID transactions, where all the pain and complexity of managing hard transactions is handled behind the scenes by the data service – the Cloud Spanners, CockroachDBs, and YugabyteDBs of the world.
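To give a taste of the eventual-consistency side, here is a minimal sketch of a grow-only counter, one of the simplest CRDTs: each replica increments only its own slot, and because the merge takes per-replica maximums it is commutative, associative, and idempotent, so replicas converge no matter how or in what order they exchange state.

```typescript
// G-Counter: each replica only increments its own slot; merge takes the
// per-replica maximum, so merges can be applied in any order, any number of
// times, and every replica converges to the same total.
class GCounter {
  private counts: Record<string, number> = {};

  constructor(private replicaId: string) {}

  increment(by = 1): void {
    this.counts[this.replicaId] = (this.counts[this.replicaId] ?? 0) + by;
  }

  merge(other: GCounter): void {
    for (const [id, n] of Object.entries(other.counts)) {
      this.counts[id] = Math.max(this.counts[id] ?? 0, n);
    }
  }

  value(): number {
    return Object.values(this.counts).reduce((sum, n) => sum + n, 0);
  }
}

// Two replicas update independently, then sync in either order.
const a = new GCounter("node-a");
const b = new GCounter("node-b");
a.increment(3);
b.increment(2);
a.merge(b);
b.merge(a);
console.log(a.value(), b.value()); // 5 5
```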
More about all that in a future series. In the meantime, and perhaps instead, Martin Kleppmann’s Designing Data-Intensive Applications is a great source to tap for more information.
Next time we will drill into the business impacts of our five technology themes.
Stay tuned.
[1] It also refers to the nature of that coupling. Formally, there are different kinds of coupling. From low to high, or loose to tight, they are message, data, stamp, control, external, common, and content. When used in discussions such as this one, ‘coupling’ is understood to mean the pathological kinds, ‘content’ and ‘common’ coupling. There must be some coupling, or components could not communicate.
[2] For you techies I note that one of the drivers of the explosion of networking was the advent of 10Base-T Ethernet – it could be run over twisted-pair wiring instead of more expensive coaxial cabling, it could support full-duplex operation, and its physical star (logical bus) topology made it both robust and easy to troubleshoot.
[3] https://www-950.ibm.com/events/wwe/grp/grp037.nsf/vLookupPDFs/Pres2KH/$file/Pres2KH.pdf