IT Transformation with SOA

Friday, February 5, 2010

Best Performance Practices


As mentioned earlier, using “thin” services that require multiple trips to the server to obtain a complete response is one of the most common performance mistakes made with SOA.  Most other SOA performance problems stem from basic engineering errors such as misconfigurations (low memory pools, bad routings, etc.), which can be fixed with relative ease once identified. Performance problems caused by inappropriate initial design are much harder to correct:
·         Inefficient implementation. The advent of high-level and object-oriented languages does not remove the need to tighten algorithms. Many performance problems are the result of badly written algorithms or incorrect assumptions about the way high-level languages handle memory and other resources.

·         Inappropriate resource locks and serialization. Just as it is not a good idea to design a four-lane highway that suddenly becomes a one-lane bridge, best-practice design avoids synchronous resource locking as much as possible. It’s best to implement service queues whenever possible to take advantage of the multitasking and load balancing capabilities provided by modern operating systems.  Still, avoid using asynchronous modes for Query/Reply exchanges.

·         Unbalanced workloads. This is a scenario more likely to occur when services must run from a particular server due to the need to keep state or because the services are not configured correctly. The more you can avoid relying on state, the more capable you will be in avoiding unbalanced workloads.

·         Placing the logic in inappropriate places. Don’t let grandma drive that Lamborghini. Emerging web site implementations were developed with an organic view that placed business logic in the front-end portals.  So-called Content Management Systems were developed to provide flexible frameworks for these web portals. Unfortunately, this architecture pattern leads to monolithic, non-scalable designs. Despite the assumed performance overheads implied by modular designs, it is best to put the business logic in back-end engines that can be accessed via services through front-end portals.
Designers aware of SOA’s inherent inefficiencies tend to architect the system in a traditionally monolithic manner.  However, it is a mistake to shy away from the use of services during the design phase just to “preemptively” alleviate performance concerns. You risk reducing flexibility in the design, and this defeats one of the main reasons for using SOA.
There are many other, better ways to remedy the performance concerns of SOA:
·         Applying best practices in service design. Watch for service granularity, service flows and the use of superfluous execution paths. For example, avoid “in-band” logging of messages (control messages mixed with the application data-carrying messages). That is, quickly copy the messages to be logged and handle them asynchronously to the main execution path. Make the logging process a lower priority than application work (alerts must be the highest priority!).

·         In SOA, caching is essential. Caching is to SOA what oil is to a car’s engine. Without caching, there is no real opportunity to make SOA efficient and thereby effective. However, provided that the necessary enablers are in place (i.e. ability to use caching heavily), performance is an optimization issue to be resolved during system implementation (remember the dictum: Architecture is about flexibility; engineering about performance.)

·         Finally, with SOA there is a need to proactively measure and project the capacity of the system and the projected workloads. Modeling and Simulation must be a part of the SOA performance management toolkit.
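The advice above about keeping logging out of the main execution path can be sketched roughly as follows. This is a minimal illustration, not a production design: the names `log_queue`, `log_worker`, `handle_request` and the in-memory `logged` list are all assumptions made for the example, and thread priorities (which Python does not expose) are not modeled.

```python
import queue
import threading

log_queue = queue.Queue()   # hand-off point between request path and logger
logged = []                 # hypothetical stand-in for a real log store


def log_worker():
    """Drain the queue in the background until a None sentinel arrives."""
    while True:
        message = log_queue.get()
        if message is None:
            break
        logged.append(message)


def handle_request(message):
    """Main execution path: copy the message for logging, then do the work."""
    log_queue.put(dict(message))   # quick copy; no log I/O happens here
    return {"status": "ok", "echo": message["body"]}


worker = threading.Thread(target=log_worker, daemon=True)
worker.start()

response = handle_request({"id": 1, "body": "getQuote"})

log_queue.put(None)   # tell the worker to stop
worker.join()
```

The request handler only pays for a dictionary copy and a queue insert; whatever the logger does afterwards happens on the worker thread.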

More on each of these next . . .


Friday, January 29, 2010

Managing SOA—The Control Layer


You should maintain control of your SOA environment by ensuring that all SOA messages in your system comply with a service framework that incorporates a standardized service stub containing necessary control elements for each message.  Whether using a federated ESB or your own canonical approach, you must ensure that every SOA message contains the following elements:
          Versioning. This will enable you to gracefully introduce new versions of services and interfaces. The service routing fabric (often part of the ESB) will be able to use this information to help decide whether to send the service request to one implementation of the service versus another. Clearly, service versioning should be used sparingly and judiciously as it could become a de-facto means of creating new families of services and thus make future control of service implementations more difficult.

          Prioritization. A priority field puts the SOA middleware in a position to deliver services under pre-defined service level agreements.

          Sequencing/Time-stamping. It’s always a good idea to introduce an ordinal counter for each service request. Ideally, if the response to the service is atomic and can be associated with a request, the response should also incorporate the ordinal number of the request. This type of information can be used for debugging purposes, or even to give the client the ability to associate a response with a request without having to keep state. Time-stamping all services is a good way to enable the tracking of performance metrics and the debugging of message routes.

          Logging level. In principle all service calls should be “log-able”. Once a system has been stabilized, you will probably want to log only a few key service calls. However, given the need, you may want to increase the detail of logging on demand. Setting up a log-level in each service message will enable the middleware to decide whether or not the threshold for logging requires the message to be logged.

          Caching Ability. This setting works in two ways. From a requester’s perspective, the flag may indicate to a caching entity that under no circumstance should there be a cached response to the request.  From a responder’s perspective, the flag might indicate to the caching entity whether or not the response should be cached.
I recommend that you task your architecture group with defining the specifics of an Enterprise Service Framework (ESF) to ensure all your applications generate services with the standard headers you’ve defined. The ESF should be instantiated as a common repository of dynamically linked libraries that are part of your programmers’ toolkit; one that will have the appropriate headers transparently appended during the service call.
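As a rough sketch of what such a framework library might do, the snippet below wraps every outgoing payload in a header carrying the five control elements listed above. Everything here is hypothetical: the field names, the default values, and the `wrap_request` and `should_log` helpers are assumptions for illustration, not part of any particular ESB product.

```python
import itertools
import time
from dataclasses import dataclass, asdict

_sequence = itertools.count(1)   # ordinal counter for outgoing requests


@dataclass
class ServiceHeader:
    version: str       # service/interface version used for routing
    priority: int      # relative priority for the middleware
    sequence: int      # ordinal number of the request
    timestamp: float   # seconds since the epoch when the call was made
    log_level: int     # detail level requested for this message
    cacheable: bool    # whether a cached response is acceptable


def wrap_request(payload, version="1.0", priority=5, log_level=0, cacheable=True):
    """Append the standard header so callers never build it by hand."""
    header = ServiceHeader(version, priority, next(_sequence),
                           time.time(), log_level, cacheable)
    return {"header": asdict(header), "body": payload}


def should_log(message, threshold):
    """Middleware-side check: log only messages at or above the threshold."""
    return message["header"]["log_level"] >= threshold


first = wrap_request({"op": "getCustomer", "id": 42})
second = wrap_request({"op": "getCustomer", "id": 43}, log_level=3, cacheable=False)
```

Because the header is added inside the shared library, application code only supplies the payload and, when needed, overrides a default.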
In the end, the establishment of standard headers under an ESF is a foundational practice necessary to support system-wide dashboard monitoring, preventative systems management and proactive performance planning.


Friday, January 22, 2010

Caching—SOA on Steroids


One of the most oft-heard critiques of SOA is that the overhead of SOAP/XML formats makes it intrinsically low performing.  Yes, we all know that standards are often the result of consensus and aren’t always optimized, but SOA’s flexibility is needed to avoid recreating the monolithic “all-is-done-here” view of the older development culture.  There is no doubt that SOA architectures can be affected by message transmission delays due to the larger message sizes resulting from standardization and the overheads associated with modular designs.
So, how to solve this conundrum?
 A common mistake is designing with the idea of avoiding these performance problems “from the start”. The outcome? Designs that are too monolithic, and that introduce inflexible interfaces with tightly coupled inter-process calls “in the interest of performance”.  Talk about throwing out the baby with the bathwater! A better approach is to design for flexibility, as the SOA gods intended, but to introduce the safety valve of caching throughout the system. Caching is the technique used to preserve recently used information as close as possible to the user so that it can be accessed more rapidly by a subsequent caller. Think of caching as a series of release valves needed to ensure the flow of services occurs as pressure-free as possible from beginning to end.
The idea is to design a system that allows as many caching points as possible. This does not mean you will actually utilize all the caching points. Ironically, there is a performance penalty to caching and you should therefore make certain to follow these tenets when it comes to its use:
·         Ensure that the caching logic operates asynchronously from the main execution path in order to avoid performance penalties due to the management of the cache.
·         Ensure you use the appropriate caching strategy. There are several different strategies that apply to specific data dynamics.  Should you clear the cache based on least-used, oldest, or most recently added criteria?  Will you implement automatic caching space recollection techniques (i.e. have a daemon periodically releasing cached elements in the background) or will you do so only when certain thresholds are crossed?
·         The rules for caching should be flexible and controllable from a centralized management console. It is imperative to always have real-time visibility of the various cache dynamics and to be able to react appropriately to correct any anomalies. Use the recommended cache flag field in the message headers to give you more controlled granularity of these dynamics.
·         Allow pre-loading of caching, or sufficient cache warm-up, prior to opening the applications to the full force of requests.
·         Always remember that blindly caching items is not a magic bullet. The success of caching depends significantly on the items you cache. If the items change very frequently, you will have to update the cache frequently as well, and this overhead could offset any caching advantages.
Even though there are vendor products that provide single-image views of distributed systems caching, I recommend using them only for well-defined server clusters and not broadly for the entire system. You will be better off designing custom-made caching strategies for each particular service call and data element in your solution. There are several caching expiration strategies, such as time-based expiration, size-based expiration (expiring the oldest x% of cache entries when a certain cache threshold is reached), and change-triggered cache updates using a publish/subscribe mode.
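Two of the expiration strategies just mentioned, time-based and size-based, can be combined in a few lines. The class below is only a sketch under stated assumptions: the name `ExpiringCache`, its parameters, and the injectable `clock` are invented for the example, and the change-triggered publish/subscribe strategy is not shown.

```python
import time
from collections import OrderedDict


class ExpiringCache:
    """Illustrative cache with time-based and size-based expiration."""

    def __init__(self, max_entries, ttl_seconds, evict_fraction=0.25,
                 clock=time.monotonic):
        self.max_entries = max_entries
        self.ttl = ttl_seconds
        self.evict_fraction = evict_fraction
        self.clock = clock
        self._items = OrderedDict()   # key -> (stored_at, value), oldest first

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self.clock() - stored_at > self.ttl:   # time-based expiration
            del self._items[key]
            return None
        return value

    def put(self, key, value):
        self._items.pop(key, None)   # re-inserting makes the key the newest
        self._items[key] = (self.clock(), value)
        if len(self._items) > self.max_entries:   # size threshold crossed
            to_evict = max(1, int(self.max_entries * self.evict_fraction))
            for _ in range(to_evict):
                self._items.popitem(last=False)   # drop the oldest entries


# Usage with a fake clock so the behaviour is deterministic.
now = [0.0]
cache = ExpiringCache(max_entries=4, ttl_seconds=10, evict_fraction=0.5,
                      clock=lambda: now[0])
for key in "abcde":
    cache.put(key, key.upper())

evicted = cache.get("a")    # removed when the fifth entry overflowed the cache
survivor = cache.get("c")   # still present and still fresh
now[0] = 11.0
expired = cache.get("c")    # older than the 10-second time limit
```

Here eviction runs inline for brevity; following the first tenet above, a real deployment would move that housekeeping off the main execution path.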
Selecting the right expiration and refresh strategy is essential to ensuring the freshness of your data, high cache hit ratios (low hit ratios can make overall system performance suffer because of the overhead incurred in searching for a non-existent item in the cache), and the avoidance of performance penalties due to cache management. Also, if you can preserve the cache in a non-volatile medium in order to permit rapid cache restore during system start-up, then do so.
Clearly, choosing what data to cache is essential. Data that changes rapidly or whose precision is critical should not be cached (e.g. available product inventory should only be cached if the amount of product in the inventory is larger than the amount of the largest possible order). You’ll need to assess how fresh data must be, for any situation. The optimum strategy must be determined carefully via trial-and-error. You can also apply analytical methods such as simulation (see later) to better estimate the impact of any potential change to either the characteristics of the data being cached, or the preferred caching approach.
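The inventory example above reduces to a one-line rule. The function name and parameters below are hypothetical, chosen only to make the rule concrete.

```python
def can_cache_inventory(units_in_stock, largest_possible_order):
    """Cache availability only while no single order could exhaust the stock."""
    return units_in_stock > largest_possible_order


plenty = can_cache_inventory(units_in_stock=500, largest_possible_order=100)
scarce = can_cache_inventory(units_in_stock=80, largest_possible_order=100)
```

With 500 units and a maximum order of 100, a slightly stale "available" answer is harmless; with 80 units it is not, so the request should go to the live inventory.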
Finally, I can’t emphasize enough the need for accurate caching monitoring via use of real-time dashboards.  These dashboards are a core component of the infrastructure needed to properly manage a complex SOA system. More on Managing SOA next.


Friday, January 15, 2010

Data Mapping & Transformation—Part II



Last week, I outlined the various mapping options. The question I left hanging was this: Which approach is the best one?

My experience is that the transformation should take place as soon as possible. For starters, this means that broker-mediated transformations should be avoided, if possible.  The entity doing the transformation must have an understanding of the business processes being mapped and intermediate brokers usually lack this knowledge.
Best is to establish a canonical (i.e. standard) format and then allow both the receiver and the sender to translate their respective formats into the chosen canonical form (performance considerations can be dealt with later).  For example, in the modern world, English can be seen as the standard used by all people: a German can communicate with a Japanese speaker in English.  In SOA terms, this canonical form may well be a specific set of XML structures.
If a standard format is feasible, you will need to decide whether this format will be a subset (a lowest common denominator) of all formats, or whether you will allow the format to carry functions that exceed the capabilities of one or both of the communicating entities.   If the former, you will be forced to “dumb-down” the functionality; if the latter, you will need to restrict the information conveyed by the canonical format on a case-by-case basis. Still, it’s best to make the standard format as comprehensive as possible. It’s always easier to restrict usage of excess functionality than it is to introduce new features during implementation.
If no standard format is feasible because you can’t control the sender or receiver, then you should adopt either a Sender-Makes-Right or a Receiver-Makes-Right approach. In general, the entity that has the better understanding of the business process should take ownership of the mapping.  For example, if you are a tourist in another country and use of a canonical language (aka “English”) is not possible, then it behooves you to try to speak the local language (i.e. Sender-Makes-Right). After all, it’s unrealistic to expect that the local folks will speak your language. On the other hand, if you are visiting the tourism board in a foreign country, then you may reasonably assume someone there might speak your language.
Typically, the Sender has a better understanding of the meaning (i.e. “semantics”) of a request. Consider the example where the requester searches for an employee record using the name. The name is held in a structured fashion: LastName, FirstName. The server, on the other hand, expects to get the request with a string that contains the “last_name+first_name” (this is a common scenario when the server is a legacy application). The solution is obvious (admittedly, this is a trivial example!): the requester (the sender) should create the necessary string. Building the string is much easier for the sender than it is for the receiver. The sender knows the true nature of the last name, while the server’s logic could fail if it tried to derive the last name from regular expression parsing.  (I can’t tell you the number of times I have encountered systems that assume that DEL is my middle name!) Clearly, the simple parser used by such software fails to understand that some last names have a space.
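The employee-name example can be made concrete with a short sketch. The names used (`EmployeeName`, `to_legacy_key`, `naive_receiver_parse`, and the sample person) are invented for illustration, and the “+” is assumed here to be a literal separator in the legacy string.

```python
from dataclasses import dataclass


@dataclass
class EmployeeName:
    last_name: str
    first_name: str


def to_legacy_key(name):
    """Sender-Makes-Right: the sender builds the string the server expects."""
    return f"{name.last_name}+{name.first_name}"


def naive_receiver_parse(full_name):
    """What a receiver is reduced to when handed a flat string: guessing."""
    parts = full_name.split()
    return EmployeeName(last_name=parts[-1], first_name=parts[0])


employee = EmployeeName(last_name="del Valle", first_name="Ana")

sender_built = to_legacy_key(employee)
receiver_guess = naive_receiver_parse("Ana del Valle")
```

The sender produces the key without ambiguity, while the whitespace-splitting receiver drops “del” from the last name, which is the failure described above.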
This recommendation still leaves open the question of where to do the mapping, everything else being equal. My personal view is that when everything is equal, you should put the mapping logic in the server of the request (i.e. Receiver-Makes-Right), simply because it gives you a centralized, single point of control for the mappings. Relying on a Sender-Makes-Right scenario places much of the burden on what could eventually become an unmanageable variety of clients. Also, I suggest that once you decide on one or the other, you don’t ever mix the approaches. That is, if you decide to do Sender-Makes-Right, do so throughout the system, or vice versa. Mixing Receiver-Makes-Right with Sender-Makes-Right can make the system far too complex and unmanageable.
The corollary to this discussion is that there is a hybrid approach that I believe provides the most flexibility and solves the great majority of transformation needs: using a comprehensive canonical form combined with a Receiver-Makes-Right for cases where the super-set capability exceeds the receiver’s ability. The logic to this approach is that it is easier to down-scope features than it is to second-guess a more powerful capability.
Consider a typical search application scenario: A client sends a search request and the server then prepares a response which includes the found elements, plus ranking scores related to each item returned. The Sender converts the ranking weight factors from a relational database into a “standardized” ranking score system defined by the canonical form. Now, let’s assume the client (the receiver of the response) is not prepared to get or use this extra information. The receiver simply discards the extra information. The down-scoped information loses some of its value, but the client will still be able to present the search results, even if not in a ranked fashion. As long as the key results are obtained, no major harm occurs. A future, more competent client will be able to use the ranking information. Note that this approach only works if the information being ignored is not essential to the response. If you need to ensure essential information is not discarded, you’ll have to define this information as core to the canonical standards.
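A minimal sketch of the down-scoping step in this search scenario might look like the following. The field names (`title`, `url`, `rank_score`) and the `down_scope` helper are assumptions made for the example, not a defined canonical format.

```python
KNOWN_FIELDS = {"title", "url"}   # what this (older) client understands


def down_scope(result, known_fields=KNOWN_FIELDS):
    """Receiver-Makes-Right: keep understood fields, discard the rest."""
    return {key: value for key, value in result.items() if key in known_fields}


canonical_response = [
    {"title": "SOA basics", "url": "/docs/1", "rank_score": 0.92},
    {"title": "Caching", "url": "/docs/2", "rank_score": 0.75},
]

client_view = [down_scope(item) for item in canonical_response]
```

A newer client would simply add `rank_score` to its set of known fields; nothing on the sending side has to change.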
Yes. Transformation work is sure to have an impact on performance. Next I will cover a technique used to remediate this problem: Caching.


Friday, January 8, 2010

Data Mapping & Transformation—Part I


A basic tenet should be to keep transformations to a minimum. However, it is not always feasible to create completely homogeneous systems. Even if one wishes to use standard communication means between systems, the reality is that there will sometimes be a need to handle data and protocol mismatches. We frequently must interface the new system with legacy components, or support a federated environment with differing protocols and presentations. More often than not, the system will need to interface with a third-party component that does not use the standard format, semantics or protocols.  The question that now emerges is how best to make these components talk to each other. In what components should we deploy the transformation logic?
There are various schools of thought about how to approach the thorny subject of transformation. We end up with the following alternatives: 
·      Broker makes right
·      Sender makes right
·      Receiver makes right
·      Sender and Receiver use a common (“canonical”) format
Broker-Makes-Right is akin to the United Nations with a diplomat giving a speech in, say, Russian, and then having the various translators translating the language for the listeners.
While this works in the UN because the translators are actual human beings capable of human understanding, when relying on automation, the most you can expect the intermediary to perform is straightforward X-to-Y transformations by following pre-defined mapping rules. Unlike UN translators, automated brokers lack the understanding to intelligently optimize the mapping based on cognition.  Broker-mediated transformations only make sense when mapping low-level formatting mismatches. Do you need to convert an integer to a string? No problem. Use an intermediate component. Want to append a null termination to strings coming from A and destined for B? Again, fine.
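The two low-level conversions just mentioned are the kind of rule a broker can apply mechanically. In the sketch below, the record layout and the `broker_transform` function are hypothetical; the point is only that no business knowledge is involved.

```python
def broker_transform(record):
    """Apply fixed, pre-defined formatting rules; no business logic here."""
    out = dict(record)
    out["quantity"] = str(record["quantity"])   # integer -> string
    out["name"] = record["name"] + "\x00"       # append a null terminator
    return out


from_a = {"quantity": 12, "name": "widget"}
to_b = broker_transform(from_a)
```

The input record is left untouched, and every output field is derived from the input by a rule that needs no context beyond the field itself.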
Then there are cases where it is never advisable to let the broker perform automated conversions. For instance, currency conversions from Euros to Dollars may require the knowledge of specific exchange discounts or the application of exchange fees, or anything else that can be dreamed up by government bureaucracies. These types of “business-based” conversions must be placed in the upper layers that are capable of handling business rules, and not in an intermediate broker.
Sender-Makes-Right advocates argue that the sender of the message should know the capabilities of the recipient and adjust the format and characteristics of the delivery to match the recipient’s capabilities. An analogy is that of a teacher communicating to a group of kindergarteners. The teacher will not use complex words and will make an effort to match her language in a way the children will understand. Sender-Makes-Right assumes the sender typically knows-it-all and has the power to adapt its messages to nearly any recipient.
Receiver-Makes-Right proponents believe there is no way a sender will always know the capabilities and limitations of the receiver. Secondly, they argue that the sender is not necessarily the most powerful component of the system. Receiver-Makes-Right proponents argue that the recipient of a message should be able to extract what is needed and transform, as appropriate, the sender’s format.  Obviously, if the scope of information delivered by the sender exceeds what can be handled by the receiver, it is the receiver’s prerogative to dispose of the excess.  If the sender has less knowledge than the receiver, then it is easier for the receiver to map the sender’s format and complement the needed information through other means. An example is the manner in which Google applies complex heuristics to infer the user requests.
A final form is what I call the Esperanto approach, more commonly referred to as the canonical style. Here, both the sender and the receiver agree to use a common language, and both take responsibility for translating their respective formats into this common standard.
Which is the best approach? Clearly each method has both unique issues and advantages. Next week I’ll go over the recommended approaches.


Friday, January 1, 2010

Data Matching and Integration Engines


Encapsulating data behind data services via the Data Sentinel works well when the data is being accessed intermittently and discretely. However, there are cases where the data access pattern requires matching large numbers of data records from one database to large data volumes in another database. An example could be a campaign management application with a need to combine the contents of a customer database with a promotion database defining discount rates based on the customer’s place of residence.  Clearly, having this service call a data service for every customer record when performing promotional matches would be unsound and impractical from a performance perspective. The alternative, allowing applications to perform direct database joins against the various databases, is not ideal either. This latter approach would violate many of the objectives SOA tries to achieve by forcing applications to be directly aware of and dependent on specific data schemas and database technologies.
Yet another example is when implementing data extraction via an algorithm such as MapReduce that necessitates the orchestration of a large number of backend data clusters. This type of complex orchestration against potentially large sets of data cannot be left to the service requester and is best provided by sophisticated front end servers.
Both examples show the need to make these bulk data matching processes part of the service fabric, available as coarse data services. The solution then is to incorporate an abstraction layer service for this type of bulk data join process. Applications can then trigger the process by calling this coarse-grained service. In practical terms, this means that when implementing the SOA system you should consider the design and deployment of the data matching and integration engines needed to efficiently and securely implement this kind of coarsely defined service.  In fact, you are likely to find off-the-shelf products that at heart are instances of Data Matching Engines: Campaign Management Engines, Business Intelligence systems, and Reporting Engines that serve users by generating multi-view reports.
Now, using off-the-shelf solutions has tremendous benefits, but the use of external engines is likely to introduce varied data formats and protocols to the mix. Notwithstanding the ideal of having a canonical data format throughout, there will always be a need to perform data transformations.  That’s the next topic.
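From the caller’s side, a coarse matching service for the campaign example could look something like the sketch below. The in-memory lists stand in for the customer and promotion databases, and `match_promotions` and its field names are assumptions made for illustration; a real engine would perform the join close to the data.

```python
def match_promotions(customers, promotions_by_region):
    """Coarse service: one call matches every customer to a discount rate."""
    matched = []
    for customer in customers:
        rate = promotions_by_region.get(customer["region"])
        if rate is not None:   # customers without a promotion are skipped
            matched.append({**customer, "discount_rate": rate})
    return matched


customers = [
    {"id": 1, "region": "TX"},
    {"id": 2, "region": "CA"},
    {"id": 3, "region": "NY"},
]
promotions_by_region = {"TX": 0.10, "CA": 0.15}

matches = match_promotions(customers, promotions_by_region)
```

The application makes a single request and never sees either schema, which is the property the per-record and direct-join alternatives each give up.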


Friday, December 25, 2009

The Data Visibility Exceptions

The Data Sentinel is not unlike the grumpy bureaucrat processing your driver’s license application forms. After ensuring that you comply with what’s sure to be a ridiculously complicated list of required documents, it isolates you from directly accessing the files in the back.
While you, the applicant, the supplicant, cannot go around the counter and check the content of your files directly (not legally, anyway), the DMV supervisor in the back office is able to directly access any of the office files. After all, the supervisor is authorized to bypass the system processes intended to limit direct access to the data.  Direct supervisory access to data is one of the exceptions to the data visibility constraints mentioned earlier.
Next is the case of ETL (Extract, Transform, Load) jobs on large sets of data, as well as reporting against that data. These cases require batch-level access to data in order to process or convert millions of data records and can wreck performance if carelessly implemented. Reporting jobs should ideally run against offline replicated databases, not the online production databases. Better yet is to plan for a proper Data Warehousing strategy that allows you to run business intelligence processes independently of the main Operational Data Store (ODS). Nevertheless, on occasion, you will need to run summary reports or data-intensive real-time processes against the production database. When the report tool is allowed to access the database directly, bypassing the service layer provided by the Data Sentinel, you will need to ensure this access is well-behaved and that it runs as a low-priority process and under restricted user privileges. The same control is required for the ETL processes.  Operationally, you should always schedule batch-intensive processes for off-peak times such as nightly runs.
A third potential cause for exception to data visibility is implied by the use of off-the-shelf transaction monitors, requiring direct access to the databases in order to implement the ACID logic discussed earlier.
A fourth exception is demanded by the need to execute large data matching processes. If there is an interactive need to run a process against a large data base set with matching keys in a separate data base (“for all customers with sales greater than an $X amount, apply a promotion flag equal to the percentage corresponding to the customer’s geographic location in the promotion database”), then it makes no sense trying to implement each step via discrete services. Such an approach would be extremely contrived and inefficient. Instead, use of a Table-Joiner super-service will be required. More on that next.
