IT Transformation with SOA

Friday, December 25, 2009

The Data Visibility Exceptions

The Data Sentinel is not unlike the grumpy bureaucrat processing your driver’s license application forms. After ensuring that you comply with what’s sure to be a ridiculously complicated list of required documents, it isolates you from directly accessing the files in the back.
While you, the applicant, the supplicant, cannot go around the counter and check the contents of your files directly (not legally, anyway), the DMV supervisor in the back office is able to access any of the office files directly. After all, the supervisor is authorized to bypass the system processes intended to limit direct access to the data. Direct supervisory access to data is one of the exceptions to the data visibility constraints mentioned earlier.
Next is the case of ETL (Extract, Transform, Load) processing of large data sets, along with their reporting. These cases require batch-level access to data in order to process or convert millions of records, and they can wreck performance if carelessly implemented. Reporting jobs should ideally run against offline replicated databases, not the online production databases. Better yet, plan for a proper Data Warehousing strategy that allows you to run business intelligence processes independently of the main Operational Data Store (ODS). Nevertheless, on occasion you will need to run summary reports or data-intensive real-time processes against the production database. When the reporting tool is allowed to access the database directly, bypassing the service layer provided by the Data Sentinel, you will need to ensure this access is well-behaved: that it runs as a low-priority process and under restricted user privileges. The same control is required for ETL processes. Operationally, you should always schedule batch-intensive processes for off-peak times, such as nightly runs.
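As a concrete illustration, here is a minimal Java/JDBC sketch of a well-behaved reporting job, assuming a PostgreSQL replica and a restricted, read-only reporting account. The connection URL, credentials, and table names are illustrative only.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class NightlyReportJob {
    public static void main(String[] args) throws Exception {
        // Connect to the offline replica, not the production ODS, using a
        // restricted reporting user (all connection details are placeholders).
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://replica-host:5432/sales",
                "report_reader", "secret")) {
            conn.setReadOnly(true); // declare up front that this job never writes

            try (Statement stmt = conn.createStatement()) {
                stmt.setQueryTimeout(300); // abort a runaway query after 5 minutes
                stmt.setFetchSize(1000);   // fetch in modest chunks, not all at once
                ResultSet rs = stmt.executeQuery(
                        "SELECT region, SUM(amount) FROM orders GROUP BY region");
                while (rs.next()) {
                    System.out.printf("%s: %s%n",
                            rs.getString(1), rs.getBigDecimal(2));
                }
            }
        }
    }
}
```

The low-priority aspect (resource queues, OS-level niceness) would be configured on the database or scheduler side; the point here is the read-only, time-boxed, replica-bound discipline.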
A third exception to data visibility is implied by the use of off-the-shelf transaction monitors, which require direct access to the databases in order to implement the ACID logic discussed earlier.
A fourth exception is demanded by the need to execute large data-matching processes. If there is a need to run an interactive process against a large database with matching keys in a separate database ("for all customers with sales greater than an $X amount, apply a promotion flag equal to the percentage corresponding to the customer's geographic location in the promotion database"), then it makes no sense to try to implement each step via discrete services. Such an approach would be extremely contrived and inefficient. Instead, a Table-Joiner super-service will be required, as sketched below. More on that next.
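To make the contrast concrete, here is a hedged Java sketch of what such a Table-Joiner super-service might execute internally: one set-based UPDATE with a join, letting the database engine do the matching in a single pass instead of millions of per-record service calls. The table and column names, and the PostgreSQL-style UPDATE ... FROM syntax, are assumptions for illustration.

```java
import java.math.BigDecimal;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Hypothetical Table-Joiner super-service for the promotion-flag example.
public class PromotionFlagJoiner {

    public int applyPromotionFlags(Connection conn, BigDecimal salesThreshold)
            throws SQLException {
        // One joined, set-based statement replaces a discrete service call
        // per customer record (syntax shown is PostgreSQL-flavored).
        String sql =
            "UPDATE customers c " +
            "   SET promo_flag = p.discount_pct " +
            "  FROM promotions p " +
            " WHERE p.region = c.region " +
            "   AND c.total_sales > ?";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setBigDecimal(1, salesThreshold);
            return ps.executeUpdate(); // one round trip, however many rows match
        }
    }
}
```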


Friday, November 6, 2009

The Orchestrators


Back in the XIX century (that's the 19th century for all you Gen Xers!), there was a composer who didn't know how to play the piano. In fact, he didn't know how to play the violin, the flute, the trombone, or any other instrument for that matter. Yet the man managed to compose symphonies that to this day are considered musical masterpieces. The composer's name was Louis Hector Berlioz, and he achieved this feat by directing the orchestra through each step of his arrangement and composition. His most recognized work is called "Symphonie Fantastique" and, according to Wikipedia, the symphony is scored for an orchestra consisting of 2 flutes (2nd doubling piccolo), 2 oboes (2nd doubling English horn), 2 clarinets (1st doubling E-flat clarinet), 4 bassoons, 4 horns, 2 trumpets, 2 cornets, 3 trombones, 2 ophicleides (what the heck is an ophicleide? A forerunner of the euphonium, I found out. What the heck is a euphonium? Well, check it out in Wiki!), 2 pairs of timpani, snare drum, cymbals, bass drum, bells in C and G, 2 harps, and strings.
By now, you probably get the idea. Mr. Berlioz fully exemplifies the ultimate back-end composite services element: The Orchestrator. Berlioz composed some pretty cool stuff by knowing a) what he wanted to express, b) what specific set of instruments should be used at a particular point in time, and c) how to communicate the notes of his composition to the orchestra.
Every SOA-based system needs its Berliozes.
There are several dimensions involved in defining the role of an orchestrator in SOA. First, as discussed earlier, most orchestrator roles will be provided within the context of an application, not as part of a service. That is, the orchestration is what defines an application and makes one application different from another. The orchestration is the brain of the application: it is the entity that decides how, and in what order, the SOA services are called.
In some instances, you might even be able to reuse orchestration patterns and apply them across multiple applications. Better still, you can build orchestration patterns by utilizing the emerging Business Process Modeling (BPM) technologies. BPM simplifies the work of creating orchestration logic by providing a visual and modular way of assembling orchestration flows. A small commentary of mine: BPM is not SOA, but BPM requires SOA to work properly.
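To make this concrete, here is a minimal Java sketch of application-level orchestration. The service interfaces and the three-step flow are my own illustrative assumptions, not any particular product's API; a BPM tool would let you assemble an equivalent flow visually.

```java
// The orchestration is the brain of the application: the same discrete
// services, called in a different order or combination, would constitute
// a different application.
public class OrderApplication {
    interface InventoryService { boolean reserve(String sku, int qty); }
    interface PricingService   { double quote(String sku, int qty); }
    interface BillingService   { void charge(String customerId, double amount); }

    private final InventoryService inventory;
    private final PricingService pricing;
    private final BillingService billing;

    OrderApplication(InventoryService i, PricingService p, BillingService b) {
        this.inventory = i;
        this.pricing = p;
        this.billing = b;
    }

    // The calling flow (the decision logic) lives here, in the application,
    // not inside any one service.
    public boolean placeOrder(String customerId, String sku, int qty) {
        if (!inventory.reserve(sku, qty)) {
            return false;
        }
        double amount = pricing.quote(sku, qty);
        billing.charge(customerId, amount);
        return true;
    }
}
```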
An apropos question is how much orchestration should be automated in the SOA system, as opposed to letting users manually orchestrate their own interactions. To answer this question, it is best to remember the complexity rule I stated earlier: the simpler the user interaction, the more complex the system, and vice versa.
Then again, there are limits to the complexity of an orchestration. A full-fledged Artificial Intelligence system could become the ultimate orchestration engine but, unfortunately, such a machine remains in the realm of science fiction. Cost-benefit compromises must be made.
Say we have a travel-oriented system and need to find the coolest vacation spots for the month of September. Should we let the user manually orchestrate the various steps needed to reach a conclusion? Each step would indirectly generate the appropriate service calls for searching destinations, filtering unwanted responses, obtaining additional descriptions, getting prices, initiating the booking, and so forth. Or we could develop a sophisticated orchestration function that takes care of those details and does the hard work on behalf of the prospective traveler. But should we?
The answer lies in the size of "the market" for a particular need. Clearly, there is a need for a travel orchestration capability that can take care of all the details mentioned; after all, isn't this why Travel Agencies emerged in the first place? If the orchestration is needed by only a few users, then it is best not to spend money and effort automating something so unique. On the other hand, if the request becomes common, then it is preferable to create an automated orchestration function that organizes and integrates the use of SOA services, along the lines of the sketch below.
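In the same hedged spirit, here is a sketch of what the automated travel orchestration might look like. Every service interface and method name below is hypothetical; the orchestrator simply strings together the search, filter, price, and book steps that the user would otherwise drive by hand.

```java
import java.util.List;

public class VacationOrchestrator {
    interface DestinationSearch { List<String> search(String month); }
    interface RatingService     { double coolness(String destination); }
    interface PricingService    { double price(String destination); }
    interface BookingService    { String book(String destination); }

    private final DestinationSearch search;
    private final RatingService ratings;
    private final PricingService prices;
    private final BookingService bookings;

    VacationOrchestrator(DestinationSearch s, RatingService r,
                         PricingService p, BookingService b) {
        this.search = s;
        this.ratings = r;
        this.prices = p;
        this.bookings = b;
    }

    // One call does what the traveler would otherwise orchestrate manually:
    // search destinations, filter out the unaffordable ones, rank the rest,
    // and book the winner.
    public String bookCoolestSpot(String month, double budget) {
        String best = null;
        double bestScore = -1;
        for (String destination : search.search(month)) {
            if (prices.price(destination) > budget) {
                continue; // filter unwanted responses
            }
            double score = ratings.coolness(destination);
            if (score > bestScore) {
                bestScore = score;
                best = destination;
            }
        }
        return best == null ? null : bookings.book(best);
    }
}
```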
The orchestrator's design should always accommodate the transparency tenets in order to allow horizontal scalability. In other words, if you provide the orchestration via servers located in the system membrane, you will need to design the solution in such a way that you can always add more front-end servers to accommodate increased workloads, without disrupting the orchestration processes in existing servers. Because orchestration usually requires the server to maintain some form of state, at least for the duration of a transaction, you will need to incorporate some form of session stickiness in the orchestration logic. Later on, I will write more about why I recommend that this be the one and only area where a "session state" between the user and the orchestration should exist, even as I still advise keeping backend services discrete and sessionless.
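As a closing sketch, here is one hedged way to model that split: conversational state lives only in the orchestration tier (which sticky-session routing would pin to a given front-end server), while every backend call carries all the data the service needs, keeping the services themselves sessionless. All names here are illustrative.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class CheckoutOrchestrator {
    interface PaymentService { void charge(String cartId, String paymentToken); }

    private static class CheckoutState {
        String cartId;
        String paymentToken;
    }

    // The one sanctioned "session state": per-user conversation data,
    // held only in this orchestration tier.
    private final Map<String, CheckoutState> sessions = new ConcurrentHashMap<>();

    public void startCheckout(String sessionId, String cartId) {
        CheckoutState state = new CheckoutState();
        state.cartId = cartId;
        sessions.put(sessionId, state);
    }

    public void providePayment(String sessionId, String paymentToken) {
        sessions.get(sessionId).paymentToken = paymentToken;
    }

    public void confirm(String sessionId, PaymentService payments) {
        CheckoutState state = sessions.remove(sessionId);
        // The backend service receives everything it needs in the call
        // itself; it keeps no memory of this user between requests.
        payments.charge(state.cartId, state.paymentToken);
    }
}
```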
