IT Transformation with SOA

Friday, January 8, 2010

Data Mapping & Transformation—Part I


A basic tenet should be to keep transformations to a minimum. However, it is not always feasible to create completely homogeneous systems. Even if one wishes to use standard communication means between systems, the reality is that there will sometimes be a need to handle data and protocol mismatches. We frequently must interface the new system with legacy components, or support a federated environment with differing protocols and presentations. More often than not, the system will need to interface with a third-party component that does not use the standard format, semantics or protocols. The question that now emerges is how best to make these components talk to each other. In which components should we deploy the transformation logic?
There are various schools of thought about how to approach the thorny subject of transformation. We end up with the following alternatives: 
·      Broker makes right
·      Sender makes right
·      Receiver makes right
·      Sender and Receiver use a common (“canonical”) format
Broker-Makes-Right is akin to the United Nations with a diplomat giving a speech in, say, Russian, and then having the various translators translating the language for the listeners.
While this works in the UN because the translators are actual human beings capable of human understanding, when relying on automation, the most you can expect the intermediary to perform is straightforward X-to-Y transformations by following pre-defined mapping rules. Unlike UN translators, automated brokers lack the understanding to intelligently optimize the mapping based on cognition. Broker-mediated transformations only make sense when mapping low-level formatting mismatches. Do you need to convert an integer to a string? No problem. Use an intermediate component. Want to append a null termination to strings coming from A and destined for B? Again, fine.
Then there are cases where it is never advisable to let the broker perform automated conversions. For instance, currency conversions from Euros to Dollars may require the knowledge of specific exchange discounts or the application of exchange fees, or anything else that can be dreamed up by government bureaucracies. These types of “business-based” conversions must be placed in the upper layers that are capable of handling business rules, and not in an intermediate broker.
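To make the distinction concrete, here is a minimal Python sketch of the kind of purely mechanical, rule-driven mapping a broker can safely perform. The field names and the `FIELD_RULES` and `broker_transform` names are hypothetical, invented only for this illustration.

```python
# Hypothetical illustration: a broker applying pre-defined, format-level
# mapping rules. None of these names come from a real product.

# Each rule maps a field name to a purely mechanical conversion.
FIELD_RULES = {
    "quantity": str,                  # integer -> string
    "name": lambda s: s + "\0",       # append a null terminator
}


def broker_transform(message: dict) -> dict:
    """Apply the mechanical rule for each field; pass other fields through."""
    return {
        field: FIELD_RULES.get(field, lambda v: v)(value)
        for field, value in message.items()
    }


if __name__ == "__main__":
    print(broker_transform({"quantity": 3, "name": "widget", "price_eur": 9.5}))
```

Note that the broker passes `price_eur` through untouched. Turning Euros into Dollars is a business decision, so in this sketch it is deliberately left to a layer that knows the business rules.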
Sender-Makes-Right advocates argue that the sender of the message should know the capabilities of the recipient and adjust the format and characteristics of the delivery to match the recipient’s capabilities. An analogy is that of a teacher communicating with a group of kindergarteners. The teacher will not use complex words and will make an effort to adapt her language so the children will understand. Sender-Makes-Right assumes the sender typically knows it all and has the power to adapt its messages to nearly any recipient.
Receiver-Makes-Right proponents believe there is no way a sender will always know the capabilities and limitations of the receiver. They also argue that the sender is not necessarily the most powerful component of the system. In their view, the recipient of a message should be able to extract what is needed and transform, as appropriate, the sender’s format. Obviously, if the scope of information delivered by the sender exceeds what can be handled by the receiver, it is the receiver’s prerogative to dispose of the excess. If the sender has less knowledge than the receiver, then it is easier for the receiver to map the sender’s format and complement the needed information through other means. An example is the manner in which Google applies complex heuristics to infer what the user is requesting.
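A minimal Python sketch of the Receiver-Makes-Right idea follows. The field names, the `DEFAULTS` fallback, and the `receive` function are all assumptions made up for the example; a real receiver would complement missing data through whatever means it has available.

```python
# Hypothetical illustration of Receiver-Makes-Right: the receiver picks out
# the fields it needs, discards the excess, and fills gaps by other means.

REQUIRED_FIELDS = ("customer_id", "country")
DEFAULTS = {"country": "US"}  # assumed fallback when the sender omits it


def receive(sender_message: dict) -> dict:
    """Map an arbitrary sender message into the receiver's own format."""
    result = {}
    for field in REQUIRED_FIELDS:
        if field in sender_message:
            result[field] = sender_message[field]
        elif field in DEFAULTS:
            result[field] = DEFAULTS[field]
        else:
            raise ValueError(f"cannot infer required field: {field}")
    return result


if __name__ == "__main__":
    # The extra "fax" field is dropped; the missing "country" is filled in.
    print(receive({"customer_id": 7, "fax": "n/a"}))
```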
A final form is what I call the Esperanto approach, more commonly referred to as the canonical style. Here, both the sender and the receiver agree to use a common language, and both take responsibility for translating their respective formats into this common standard.
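As a rough Python sketch of the canonical style, each side below translates only between its own format and one agreed common format. The sender format, the receiver format, and the canonical `full_name` field are all hypothetical.

```python
# Hypothetical illustration of the canonical approach: each side translates
# only between its own format and one agreed common format.

def sender_to_canonical(record: dict) -> dict:
    """Sender's format (assumed): 'fname' and 'lname' -> canonical form."""
    return {"full_name": f"{record['fname']} {record['lname']}"}


def canonical_to_receiver(canonical: dict) -> dict:
    """Canonical form -> receiver's format (assumed): upper-case 'NAME'."""
    return {"NAME": canonical["full_name"].upper()}


if __name__ == "__main__":
    wire = sender_to_canonical({"fname": "Ada", "lname": "Lovelace"})
    print(canonical_to_receiver(wire))
```

Neither function knows anything about the other side's format; both only know the common one.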
Which is the best approach? Clearly each method has both unique issues and advantages. Next week I’ll go over the recommended approaches.


Friday, December 4, 2009

Taming the SOA Complexities


Remember when I used to say, “Architect for Flexibility; Engineer for Performance”? Well, this is where we begin to worry about engineering for performance. This section, together with the following SOA Foundation section, represents the Level III architecture phase. Here we endeavor to solve the practical challenges associated with SOA architectures via the application of pragmatic development and engineering principles.


On the face of it, I wish SOA were as smooth as ice cream. However, I regret to inform you that it is anything but.  In truth, SOA is not a panacea, and its use requires a fair dose of adult supervision. SOA is about flexibility, but flexibility also opens up the different ways one can screw up (remember when you were in college and no longer had to follow a curfew?).  Best practices should be followed when designing a system around SOA, but there are also some principles that may be counter-intuitive to the “normal” way of doing architecture. So, let me wear the proverbial devil’s advocate hat and give you a list from “The Proverbial Almanac of SOA Grievances & Other Such Things Thusly Worrisome & Utterly Confounding”:
·         SOA is inherently complex. Flexibility has its price. By their nature, distributed environments have more “moving” pieces; thereby increasing their overall complexity.
·         SOA can be very fragile. SOA has more moving parts, leading to augmented component interdependencies.  A loosely coupled system has potentially more points of failure.
·         It’s intrinsically inefficient. In SOA, computer optimization is not the goal. The goal is to more closely mirror actual business processes. The pursuit of this worthy objective comes at the price of SOA having to “squander” computational resources. 
The way to deal with SOA’s intrinsic fragility and inefficiency is by increasing its robustness.  Unfortunately, increasing robustness entails inclusion of fault-tolerant designs that are inherently more complex.  Why? Robustness implies deployment of redundant elements. All this runs counter to platonic design principles, and it runs counter to the way the Level I architecture is usually defined. There’s a natural tension because high-level architectures tend to be highly optimized, generic, and abstract, referencing only the minimum detail necessary to make the system operate. That is, high level architectures are usually highly idealized—nothing wrong with it. Striving for an imperfect high level architecture is something only Homer Simpson would do. But perfection is not a reasonable design goal when it comes to practical SOA implementations.  In fact, perfection is not a reasonable design goal when it comes to anything.
Consider how Mother Nature operates.  Evolution’s undirected changes often result in non-optimal designs. Nature solves the problem by “favoring” a certain amount of redundancy to better respond to sudden changes and to better ensure the survival of the organism. “Perfect” designs are not very robust. A single layered roof, for example, will fail catastrophically if a single tile fails. A roof constructed with overlapping tiles can better withstand the failure of a single tile. 
A second reason SOA is more complex is explained by the “complexity rule” I covered earlier: the more simplicity you want to expose, the more complex the underlying system has to be. Primitive technology solutions tend to be difficult to use, even if they are easier to implement.  The inherent complexity of the problem they try to solve is more exposed to the user. If you don’t believe me consider the following instructions from an old Model T User Manual from Ford:
“How are Spark and Throttle Levers Used? Answer: under the steering wheel are two small levers. The right-hand (throttle) lever controls the amount of mixture (gasoline and air) which goes into the engine. When the engine is in operation, the farther this lever is moved downward toward the driver (referred to as “opening the throttle”) the faster the engine runs and the greater the power furnished. The left-hand lever controls the spark, which explodes the gas in the cylinders of the engine.”
Well, you get the idea. SOA is all about simplifying system user interactions and about mirroring business processes.  These goals force greater complexity upon SOA. There is no way around this law.
There are myriad considerations to take into account when designing a services-oriented system.  Based on my experience I have come up with a list covering some of the specific key techniques I have found effective in taming the inherent SOA complexities.  The techniques relate to the following areas that I will be covering next:
·         State-Keeping/State Avoidance. Figuring out under what circumstances state should be kept has a direct bearing on the ultimate flexibility of the system.
·         Mapping & Transformation. Even if the ideal is to deploy as homogeneous a system as possible, the reality is that we will eventually need to handle process and data transformations in order to couple diverse systems. This brings up the question of where it is best to perform such transformations.
·         Direct Access Data Exceptions. As you may recall from my earlier discussion on the Data Sentinel, ideally all data would be brokered by an insulating services layer. In practice, there are cases where data must be accessed directly. The question is how to handle these exceptions.
·         Handling Bulk Data. SOA is ideal for exchanging discrete data elements. The question is how to handle situations requiring the access, processing and delivery of large amounts of data.
·         Handling Transactional Services. Formalized transaction management imposes a number of requirements to ensure transactions have integrity and coherence. Matching a transaction-based environment to SOA is not obvious.
·         Caching. Yes, there’s a potential for SOA to exhibit a slower performance than grandma driving her large 8-cylinder car on a Sunday afternoon. The answer to tame this particular demon is to apply caching extensively and judiciously.
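To give a flavor of the caching item, here is a minimal Python sketch that puts a cache in front of a slow service call. It uses `functools.lru_cache` from the standard library; `slow_lookup_service` is a hypothetical stand-in for a remote call, and the counter exists only to show how many times the service is really hit.

```python
# Hypothetical illustration: caching the result of a slow service call.
from functools import lru_cache

CALL_COUNT = {"n": 0}  # counts how often the underlying service is invoked


def slow_lookup_service(product_id: int) -> str:
    """Stand-in for a remote service call (assumed, not a real API)."""
    CALL_COUNT["n"] += 1
    return f"product-{product_id}"


@lru_cache(maxsize=128)
def cached_lookup(product_id: int) -> str:
    """Serve repeated requests from memory instead of calling the service."""
    return slow_lookup_service(product_id)


if __name__ == "__main__":
    cached_lookup(42)
    cached_lookup(42)          # answered from the cache
    print(CALL_COUNT["n"])     # the service was only invoked once
```

The “judiciously” part matters: `lru_cache` evicts entries only when the cache is full and never expires them on its own, so a cache like this one suits data that rarely changes.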
All the above techniques relate to the actual operational effectiveness of SOA. Later on I will also cover the various considerations related to how to manage the SOA operations.
Let’s begin . . .


Friday, November 13, 2009

ESB and the SOA Fabric


A number of new needs have emerged with the advent of SOA. First of all, there was no standard way for an application to construct and deliver a service call.  Secondly, there was no standard way to ensure the service would be delivered.  Thirdly, it was not clear how this SOA environment could be managed and operated effectively. Fourthly . . .  well, you get the idea; the list goes on and on.
SOA demands the existence of an enabling infrastructure layer known as middleware. Middleware provides all necessary services, independent of the underlying technical infrastructure. To satisfy this need, vendors began to define SOA architectures around a relatively abstract concept: the Enterprise Service Bus, or ESB. Now, there has never been disagreement about the need to have a foundational layer to support common SOA functions (an enterprise bus of sorts). The problem is that each vendor took it upon itself to define the specific capabilities and mechanisms of its proprietary ESB, oftentimes by repackaging preexisting products and rebranding them to better fit its sales strategy.
As a result, depending on the vendor, the concept of Enterprise Service Bus encompasses an amalgamation of integration and transformation technologies that enable the cooperative work of any number of environments: Service Location, Service Invocation, Service Routing, Security, Mapping, Asynchronous and Event Driven Messaging, Service Orchestration, Testing Tools, Pattern Libraries, Monitoring and Management, etc. Unfortunately, when viewed as an all-or-nothing proposition, ESB’s broad and fuzzy scope tends to make vendor offerings somewhat complex and potentially expensive.
The term ESB is now so generic and undefined that you should be careful not to get entrapped into buying a cornucopia of vendor products that are not going to be needed for your specific SOA environment.  ESBs resemble more a Swiss army knife, with its many accessories, of which only a few will ever be used. Don’t be deceived; vendors will naturally try to sell you the complete superhighway, including rest stops, gas stations and the paint for the road signs, when all you really need is a quaint country road. You can be choosy and build your base SOA foundation gradually.  Because of this, I am willfully avoiding use of the term “Enterprise Service Bus”, preferring instead to use the more neutral term, “SOA Fabric.”
Of all the bells and whistles provided by ESB vendors (data transformation, dynamic service location, etc.), the one key function the SOA Fabric should deliver is ensuring that the services and service delivery mechanisms are abstracted from the SOA clients.
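A minimal Python sketch of that one key function might look like the following. The `SoaFabric` class, its methods, and the `quote.price` service name are hypothetical; the only point is that the client asks for a service by logical name and never learns who provides it or how it is delivered.

```python
# Hypothetical illustration: clients invoke services by logical name, and the
# fabric hides which provider (and which delivery mechanism) sits behind it.
from typing import Callable, Dict


class SoaFabric:
    def __init__(self) -> None:
        self._providers: Dict[str, Callable[..., object]] = {}

    def register(self, service_name: str, provider: Callable[..., object]) -> None:
        """Bind (or rebind) a logical service name to a provider."""
        self._providers[service_name] = provider

    def invoke(self, service_name: str, **params: object) -> object:
        """Route the call to whichever provider is currently registered."""
        try:
            provider = self._providers[service_name]
        except KeyError:
            raise LookupError(f"no provider for {service_name!r}") from None
        return provider(**params)


if __name__ == "__main__":
    fabric = SoaFabric()
    fabric.register("quote.price", lambda sku: {"sku": sku, "price": 10})
    print(fabric.invoke("quote.price", sku="A1"))
    # Swapping the provider requires no change on the client side.
    fabric.register("quote.price", lambda sku: {"sku": sku, "price": 12})
    print(fabric.invoke("quote.price", sku="A1"))
```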
A salient feature that vendors tell us ESBs are good for is their ability to integrate heterogeneous environments. However, if you think about it, since you are going through the process of transforming the technology in your company (the topic of these writings after all!), you should really strive to introduce a standard protocol and eliminate as many legacy protocols as you can.
Ironically, a holistic transformation program should have the goal of deploying the most homogeneous SOA environment possible, thus obviating the need for most of the ESB’s much-touted transformation and mapping functions. In a new system, SOA can be based upon canonical formats and common protocols, thus minimizing the need for data and service format conversion. This goal is most feasible when applied to the message flows occurring in your internal ecosystem.
Now, you may still need some of those conversion functions for several other reasons, migration and integration with external systems being the most obvious cases. If the migration will be gradual, and therefore requires the interplay of new services with legacy services, go ahead and enable some of the protocol conversion features provided by ESBs. The question would then be how important this feature is to you, and whether you wouldn’t be better off following a non-ESB integration mechanism in the interim.  At least, knowing you will be using this particular ESB function only for migration purposes, you can try to negotiate a more generous license with the vendor.
There are cases where, while striving for a homogeneous SOA environment, you may well conclude that your end state architecture must integrate a number of systems under a federated view. Your end state architecture in this case will be a mix of hybrid technologies servicing autonomous problem domains. Under this scenario, it would be best to reframe the definition of the problem at hand from one of creating an SOA environment to one of applying Enterprise Application Integration (EAI) mechanisms. If your end state revolves more around integration, an EAI solution is better suited to performing the boundary-level mapping and transformation work. In this case, go and shop for a great EAI solution, not for an ESB.
If the vendor gives you the option of acquiring specific subsets of their ESB offering (at a reduced price) then that’s something worth considering. At the very least, you will need to provide support for service deployment, routing, monitoring, and management, even if you won’t require many of the other functions in the ESB package. Just remember to focus on deploying the fabric that properly matches your SOA objectives and not the one that matches your vendor’s sales quota.
A quick word regarding Open Source ESBs. . . There are many, but the same caveats I’ve used for vendor-based ESBs apply. Open Source ESBs are not yet as mature, and the quality of functions they provide varies significantly according to the component. Focus on using only those components you can be sure will work in a reliable and stable manner or those which are not critical to the system. Remember you are putting in place components that will become part of the core fabric. Ask yourself: does it make sense, in order to save a few dollars, to use a relatively unsupported ESB component for a critical role (Service Invocation or Messaging come to mind), versus using a more stable vendor solution?
In the end, if you are planning to use the protocol conversion features packaged in vendor-provided or open source ESBs, I suggest you use them on a discrete, case-by-case basis, and not as an inherent component of your SOA fabric. This way, even as you face having to solve integration problems associated with the lack of standards, at least you won’t be forced into drinking the Kool-Aid associated with a particular vendor’s view of ESB!
