One of the most oft heard critiques against SOA is that the overhead of SOAP/XML formats make it intrinsically low performing. Yes, we all know that standards are often the result of consensus and aren’t always optimized, but SOA’s flexibility is needed to avoid recreating the monolithic “all-is done-here” view of the older development culture. There is no doubt that SOA architectures can be affected by message transmission delays due to larger message sizes resulting from standardization and overheads associated with modular designs.
So, how to solve this conundrum?
A common mistake is designing with the idea of avoiding these performance problems “from the start”. The outcome? Designs that are too monolithic, and that introduce inflexible interfaces with tightly coupled inter-process calls “in the interest of performance”. Talk about throwing out the baby with the bathwater! A better approach is to design for flexibility, as the SOA gods intended, but to introduce the safety valve of caching throughout the system. Caching is the technique used to preserve recently used information as close as possible to the user so that it can be accessed more rapidly by a subsequent caller. Think of caching as a series of release valves needed to ensure the flow of services occurs as pressure-free as possible from beginning to end.
The idea is to design a system that allows as many caching points as possible. This does not mean you will actually utilize all the caching points. Ironically, there is a performance penalty to caching and you should therefore make certain to follow these tenets when it comes to its use:
·Ensure that the caching logic operates asynchronously from the main execution path in order to avoid performance penalties due to the management of the cache.
·Ensure you use the appropriate caching strategy. There are several different strategies that apply to specific data dynamics. Should you clear the cache based on least-used, oldest, or most recently added criteria? Will you implement automatic caching space recollection techniques (i.e. have a daemon periodically releasing cached elements in the background) or will you do so only when certain thresholds are crossed?
·The rules for caching should be flexible and controllable from a centralized management console. It is imperative to always have real-time visibility of the various cache dynamics and to be able to react appropriately to correct any anomalies. Use the recommended cache flag field in the message headers to give you more controlled granularity of these dynamics.
·Allow pre-loading of caching, or sufficient cache warm-up, prior to opening the applications to the full force of requests.
·Always remember that blindly caching items is not a magic bullet. The success of caching depends significantly on the items you cache. If the items change very frequently, you will have to update the cache frequently as well and this overhead could upset any caching advantages..
Even though there are vendor products that provide single-image views of distributed systems caching, I recommend using them only for well-defined server clusters and not broadly for the entire system. You will be better off designing custom-made caching strategies for each particular service call and data element in your solution. There are several caching expiration strategies, such as time-based expiration, size-based expiration (expiring the oldest x% of cache entries when a certain cache threshold is reached), and change-triggered cache updates using a publish/subscribe mode.
Selecting the right expiration and refresh strategy is essential in ensuring the freshness of your data, high hit cache ratios (low cache ratios can make overall system performance suffer because of the overhead incurred in searching for a non-existing item in the cache), and avoidance of performance penalties due to cache management. Also, if you can preserve the cache in a non-volatile medium in order to permit rapid cache restore during a system start-up, then do so.
Clearly, choosing what data to cache is essential. Data that changes rapidly or whose precision is critical should not be cached (e.g. available product inventory should only be cached if the amount of product in the inventory is larger than the amount of the largest possible order). You’ll need to assess how fresh data must be, for any situation. The optimum strategy must be determined carefully via trial-and-error. You can also apply analytical methods such as simulation (see later) to better estimate the impact of any potential change to either the characteristics of the data being cached, or the preferred caching approach.
Finally, I can’t emphasize enough the need for accurate caching monitoring via use of real-time dashboards. These dashboards are a core component of the infrastructure needed to properly manage a complex SOA system. More on Managing SOA next.
Last week, I outlined the various mapping options. The question I left hanging was this: Which approach is the best one?
My experience is that the transformation should take place as soon as possible. For starters, this means that broker-mediated transformations should be avoided, if possible. The entity doing the transformation must have an understanding of the business processes being mapped and intermediate brokers usually lack this knowledge.
Best is to establish a canonical (i.e. standard) format and then allow both the receiver and the sender translate their respective formats into the chosen canonical form (performance considerations can be dealt with later). For example, in the modern world, English can be seen as the standard used by all people—A German can communicate with Japanese in English. In SOA terms, this canonical form may well be a specific set of XML structures.
If a standard protocol is feasible, you will need to decide whether this format will be a subset (a lowest common denominator) of all formats, or whether you will allow the format to carry functions that exceed the capabilities of either one or both of communicating entities. If the former, you will be forced to “dumb-down” the functionality; if the latter, you will need to restrict the information conveyed by the canonical format in a case-by-case basis. Still, it’s best to make the standard format as comprehensive as possible. It’s always easier to restrict usage of excess functionality than it is to introduce new features during implementation.
If no standard format is feasible because you can’t control the sender or receiver, then you should adopt either a Sender-Makers-Right or a Receiver-Makes-Right approach. In general, the entity that has the better understanding of the business process should take ownership of the mapping. For example, if you a tourist in another country and use of a canonical language (aka “English) is not possible, then it behooves you to try to speak their local language (i.e. Sender-Makes-Right). After all, it’s unrealistic to expect the local folks will speak your language. On the other hand, if you are visiting the tourism board in a foreign country then you may reasonably assume someone there might speak your language.
Typically, the Sender has a better understanding of the meaning (i.e. “semantics”) of a request. Consider the example where the requester searches for an employee record using the name. The name is in a structured fashion: LastName, FirstName. The server, on the other hand, expects to get the request with a string that contains the “last_name+first_name” (this is a common scenario when the server is a legacy application). The scenario is obvious (I mentioned this was a trivial example!). The requestor (the sender) should create the necessary string. Building the string is much easier for the sender than it is for the receiver. The sender knows the true nature of the last name, while the server’s logic could fail if it tried to derive the last name from regular expression parsing. (I can’t tell you the number of times I have encountered systems that assume that DEL is my middle name!) Cleary the simple parser used by such software fails to understand that some last names have a space.
This recommendation still leaves open the question of where to do the mapping everything else being equal. My personal view is that when everything is equal, you should put the mapping logic in the server of the request (i.e. Receiver-Makes-Right), simply because it gives you a centralized, single point of control for the mappings. Relying on a Sender-makes-Right scenario places much of the burden on what could eventually become an unmanageable variety of clients. Also, I do suggest that if you decide for one or the other, that you don’t ever mix the approach. That is, if you decide to do a Sender-Makes-Right, do so throughout the system, or vice versa. The hybrid case with mixing Receive-Making-Right with Sender-Making-Right can make the system far too complex and unmanageable.
The corollary to this discussion is that there is a hybrid approach that I believe provides the most flexibility and solves the great majority of transformation needs: using a comprehensive canonical form combined with a Receiver-Makes-Right for cases where the super-set capability exceeds the receiver’s ability. The logic to this approach is that it is easier to down-scope features than it is to second-guess a more powerful capability.
Consider a typical search application scenario: A client sends a search request and the server then prepares a response which includes the found elements; plus ranking scores related to each item returned. The Sender converts the ranking weight factors from a relational database into a “standardized” ranking score system defined by the canonical form. Now, let’s assume the client (the receiver of the response) is not prepared to get or use this extra information. The receiver simply discards the extra information. The down-scoped information loses some of its value, but the client will still be able to present the search results, even if not in a ranked fashion. As long the key results are obtained no major harm occurs. A future, more competent client will be able to use the ranking information. Note that this approach only works if the information being ignored is not essential to the response. If you have a need to ensure essential information is not discarded, you’ll have to define this information as core to the canonical standards.
Yes. Transformation work is sure to have an impact on performance. Next I will cover a technique used to remediate this problem: Caching.
Remember when I used to say, “Architect for Flexibility; Engineer for Performance”? Well, this is where we begin to worry about engineering for performance. This section, together with the following SOA Foundation section represents the Level III architecture phase. Here we endeavor to solve the practical challenges associated with SOA architectures via the application of pragmatic development and engineering principles.
On the face of it, I wish SOA were as smooth as ice cream. However, I regret to inform you that it is anything but. In truth, SOA is not a panacea, and its use requires a fair dose of adult supervision. SOA is about flexibility, but flexibility also opens up the different ways one can screw up (remember when you were in college and no longer had to follow a curfew?). Best practices should be followed when designing a system around SOA, but there are also some principles that may be counter-intuitive to the “normal” way of doing architecture. So, let me wear the proverbial devil’s advocate hat and give you a list from “The Proverbial Almanac of SOA Grievances & Other Such Things Thusly Worrisome & Utterly Confounding”:
·SOA is inherently complex. Flexibility has its price. By their nature, distributed environments have more “moving” pieces; thereby increasing their overall complexity.
·SOA can be very fragile. SOA has more moving parts, leading to augmented component interdependencies. A loosely coupled system has potentially more points of failure.
·It’s intrinsically inefficient. In SOA, computer optimization is not the goal. The goal is to more closely mirror actual business processes. The pursuit of this worthy objective comes at the price of SOA having to “squander” computational resources.
The way to deal with SOA’s intrinsic fragility and inefficiency is by increasing its robustness. Unfortunately, increasing robustness entails inclusion of fault-tolerant designs that are inherently more complex. Why? Robustness implies deployment of redundant elements. All this runs counter to platonic design principles, and it runs counter to the way the Level I architecture is usually defined. There’s a natural tension because high-level architectures tend to be highly optimized, generic, and abstract, referencing only the minimum detail necessary to make the system operate. That is, high level architectures are usually highly idealized—nothing wrong with it. Striving for an imperfect high level architecture is something only Homer Simpson would do. But perfection is not a reasonable design goal when it comes to practical SOA implementations. In fact, perfection is not a reasonable design goal when it comes to anything.
Considerhow Mother Nature operates. Evolution’s undirected changes often result in non-optimal designs. Nature solves the problem by “favoring” a certain amount of redundancy to better respond to sudden changes and to better ensure the survival of the organism. “Perfect” designs are not very robust. A single layered roof, for example, will fail catastrophically if a single tile fails. A roof constructed with overlapping tiles can better withstand the failure of a single tile.
A second reason SOA is more complex is explained by the “complexity rule” I covered earlier: the more simplicity you want to expose, the more complex the underlying system has to be. Primitive technology solutions tend to be difficult to use, even if they are easier to implement. The inherent complexity of the problem they try to solve is more exposed to the user. If you don’t believe me consider the following instructions from an old Model T User Manual from Ford:
“How are Spark and Throttle Levers Used? Answer: under the steering wheel are two small levers. The right- hand (throttle) lever controls the amount of mixture (gasoline and air) which goes into the engine. When the engine is in operation, the farther this lever is moved downward toward the driver (referred to as “opening the throttle”) the faster the engine runs and the greater the power furnished. The left-hand lever controls the spark, which explodes the gas in the cylinders of the engine.”
Well, you get the idea. SOA is all about simplifying system user interactions and about mirroring business processes. These goals force greater complexity upon SOA. There is no way around this law.
There are myriad considerations to take into account when designing a services-oriented system. Based on my experience I have come up with a list covering some of the specific key techniques I have found effective in taming the inherent SOA complexities. The techniques relate to the following areas that I will be covering next:
State-Keeping/State Avoidance. Figuring out under what circumstances state should be kept has a direct relevance in determining the ultimate flexibility of the system.
Mapping & Transformation. Even if the ideal is to deploy as homogenous a system as possible, the reality is that we will eventually need to handle process and data transformations in order to couple diverse systems. This brings up the question as to where is best to perform such transformations.
Direct Access Data Exceptions. As you may recall from my earlier discussion on the Data Sentinel, ideally all data would be brokered by an insulating services layer. In practice, there are cases where data must be accessed directly. The question is how to handle these exceptions.
Handling Bulk Data. SOA is ideal for exchanging discrete data elements. The question is how to handle situations requiring the access, processing and delivery of large amounts of data.
Handling Transactional Services. Formalized transaction management imposes a number of requirements to ensure transactions have integrity and coherence. Matching a transaction-based environment to SOA is not obvious.
Caching. Yes, there’s a potential for SOA to exhibit a slower performance than grandma driving her large 8-cylinder car on a Sunday afternoon. The answer to tame this particular demon is to apply caching extensively and judiciously.
All the above techniques relate to the actual operational effectiveness of SOA. Later on I will also cover the various considerations related to how to manage the SOA operations.
This blog covers the practical techniques, trials and tribulations associated with the transformation of IT systems from legacy technologies to systems using SOA and modern open systems. I also include the occasional interlude with rants about technology in general.
About Me
Name: Israel del Rio
Location: Atlanta, GA, United States
Israel has been recognized by Computerworld as one of their Premiere 100 IT honorees. Israel is a business and technology leader who has contributed the technology vision as key strategist and designer behind the enterprise technology roadmaps of large hospitality and travel companies. Israel has also have developed and deployed various mission-critical systems and in the process, he's been instrumental in creating and building effective and skilled development organizations.