IT Transformation with SOA

Friday, November 20, 2009

The Data Sentinel


Data is what we put into the system and information is what we expect to get out of it. (Actually, there’s an epistemological argument that what we really crave is knowledge. For now, however, I’ll use the term ‘information’ to refer to the system output.) Data is the dough; information is the cake. When we seek information, we want it to taste good: to be accurate, relevant, current, and understandable. Data is another matter. Data must be acquired and stored in whatever form is best from a utilitarian perspective. Data can be anything. This explains why two digits were used to store the year in pre-millennium systems, leading to the big Y2K brouhaha (more on this later). Also, data is not always flat and homogeneous. It can have a hierarchical structure and come from multiple sources. In fact, data is whatever we choose to call the source of our information.
Google reputedly has hundreds of thousands of servers with petabytes of data (1 petabyte = 1,024 terabytes), which you and I can access in a matter of milliseconds by typing free-text searches. For many, a response from Google represents information, but to others this output is data to be used in the cooking of new information. As a matter of fact, one of the most exciting areas of research today is the emergence of Collective Intelligence via the mining of free-text information on the web. Or consider the very promising WolframAlpha knowledge engine effort (wolframalpha.com), which very ambitiously taps a variety of databases to provide consolidated knowledge to users. There are still other mechanisms to provide information that rely on the human element as a source of data. Sites such as Mahalo.com or Chacha.com actually use carbon-based intelligent life forms to respond to questions.
Data can be stored in people’s neurons, spreadsheets, 3 x 5 index cards, papyrus scrolls, punched cards, magnetic media, optical disks, or futuristic quantum storage. The point is that the user doesn’t care how the data is stored or how it is structured. In the end, schemas, SQL, rows, columns, indexes, and tables are the ways we IT people store and manage data for our own convenience. But as long as the user can access data in a reliable, prompt, and comprehensive fashion, she couldn’t care less whether the data comes from a super-sophisticated object-oriented data base or from a tattered printed copy of the World Almanac.
How should data be accessed then? I don’t recommend handling data in an explicit manner the way RDBMS vendors tell you to handle it. Data is at the core of the enterprise, but it does not have to be a “visible” core. You don’t visualize data with SQL. Instead, I suggest that you handle all access to data in an abstract way. You visualize data with services, and this brings up the need for a Data Sentinel Layer. This layer should be, you guessed it, an SOA-enabled component providing data access and maintenance services.
To put it simply, the Data Sentinel is the gatekeeper and abstraction layer for data. Nothing goes into the data stores without the Sentinel first passing it through; nothing gets out without the Sentinel allowing it. Furthermore, the Sentinel decouples how the data is ultimately stored from the way the data is perceived to be stored. Depending upon your needs, you may choose consolidated data stores or, alternatively, you may choose to follow a federated approach to heterogeneous data. It doesn’t matter. The Data Sentinel is responsible for presenting a common SOA façade to the outside world.
Clearly, a key tenet should be to not allow willy-nilly access to data by bypassing the Sentinel. You should not allow applications or services (whether atomic or composite) to fire their own SQL statements against a data base. If you want to maintain the integrity of your SOA design, make sure to access data via the data abstraction services provided by the Sentinel services only.
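To make the gatekeeper idea a little more concrete, here is a minimal Python sketch of what a Sentinel service might look like. Everything in it is an assumption made for illustration: the names `Customer`, `CustomerDataSentinel`, `save_customer`, and `get_customer` are hypothetical, and the in-memory SQLite store merely stands in for whatever storage you actually use. The only point is that callers see business objects and service calls, never SQL.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Customer:
    """Business-level view of the data (hypothetical shape)."""
    customer_id: int
    name: str


class CustomerDataSentinel:
    """Hypothetical Sentinel: the only component that knows how data is stored."""

    def __init__(self) -> None:
        # The storage choice is hidden from callers; SQLite is used only for this sketch.
        self._db = sqlite3.connect(":memory:")
        self._db.execute(
            "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
        )

    def save_customer(self, customer: Customer) -> None:
        # Nothing goes into storage without passing through the Sentinel first.
        if not customer.name.strip():
            raise ValueError("customer name must not be empty")
        self._db.execute(
            "INSERT OR REPLACE INTO customer (id, name) VALUES (?, ?)",
            (customer.customer_id, customer.name),
        )
        self._db.commit()

    def get_customer(self, customer_id: int) -> Optional[Customer]:
        # Callers receive a business object, never rows, columns, or SQL.
        row = self._db.execute(
            "SELECT id, name FROM customer WHERE id = ?", (customer_id,)
        ).fetchone()
        return Customer(row[0], row[1]) if row else None


sentinel = CustomerDataSentinel()
sentinel.save_customer(Customer(1, "Ada"))
found = sentinel.get_customer(1)
missing = sentinel.get_customer(2)
```

Swapping SQLite for a federated set of sources would change only the inside of the class; the two service calls, and every client that uses them, would stay the same.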
Then again, this being a world filled with frailty, there are three exceptions where you will have to allow SOA entities to bypass the abstraction layer provided by the Sentinel. Every castle has secret passageways. I will cover the situations where exceptions may apply later: Security/Monitoring, Batch/Reporting, and the Data Joiner Pattern.
Obviously, data abstraction requires attention to performance, data persistence, and data integrity aspects. Thankfully, there are off-the-shelf tools to help facilitate this abstraction and the implementation of a Sentinel layer, such as Object-Relational mapping tools (e.g. Hibernate), automated data replication, and data caching products. Whether you choose to use an off-the-shelf tool or to write your own will depend upon your needs, but the use of those tools is not always sufficient to implement a proper Sentinel. Object-Relational mapping and Stored Procedures, for example, are means to more easily map data access into SOA-like services, but you still need to ensure that the interfaces comply with the SOA interface criteria covered earlier. In the end, the use of a Data Sentinel Layer is a case of applying abstraction techniques to deal with the challenges of an SOA-based system, but one that also demands engineering work in order to deploy the Sentinel services in front of the data bases and other data sources. There are additional techniques and considerations that also apply, and these will be discussed later on.


Friday, October 30, 2009

The SOA Membrane as the Boundary Layer


Sooner or later it happens to most of us. We grow up and can no longer continue to live in the cocooned environment created by our parents; the comfort and coziness of our youth is gone (except if, as a result of the Great Recession, you are obliged to return to your parents’ home and are forced to experience the George Costanza-like awkwardness of adulthood, but I digress). Either way, we have to enter the real world, a world where people speak the language of credits and debits and where behaviors are no longer governed by Miss Manners’ etiquette or Mom’s nagging but rather by a set of complex social rules that help us interface with the world. The way we engage with the world, the set of rules we follow, the processes and mechanisms we use to interact with others, the whole cultural context of how to say “please” or “keep the change”, are equivalent to a boundary layer between us and the rest of humanity.
Having created a suitable SOA system (either homogeneous or federated via Enterprise Application Integration tools), we need to enclose it in its own protective cocoon, lest the reckless world outside tamper with its internal fabric. The trick is to prevent what is not wanted from getting in while allowing what is wanted to access the system. Here, biology provides us with an ideal model in the workings of the living cell. Just as the membrane of a healthy cell acts as a barrier against harmful agents and judiciously allows the exchange of the enzymes needed to make the cell work in concert with the rest of the organism, we must maintain an SOA membrane that allows the necessary information exchange to take place while keeping the bad guys out of the system.
In IT terms, the membrane is known as the DMZ (Demilitarized Zone). Frankly, I never cared for this term. A DMZ is a buffer zone designed to keep warring factions apart, a zone void of hostilities. The term is deceiving because, in reality, the DMZ is the area where all access battles are fought. Also, the layer’s role is not to keep warring factions apart but to allow controlled exchange between participating partners. With the emergence of virtualization approaches such as Cloud Computing, we should take the perspective that the membrane is the region where safe trade occurs. In this area the presentation logic is placed alongside a number of public applications. This is the layer that deals with the Business-to-Consumer (B2C) and the Business-to-Business (B2B) interactions. In this layer you also must perform data transformations for data exchange with external entities.
In engineering terms the membrane consists of an arrangement of technologies carrying the interaction with the external world in each layer of the computing stack, from the security guard manning the entrance to the Data Center to the application displaying the sign-on screens. In the networking layer you have the protocol converters, VPN gateways, IP routers, and load balancers. Further up in the stack, the membrane includes firewalls with the appropriate system-level access passwords and permissions, including specific field-level encryption. Even higher up, the membrane contains the needed data mapping and conversion services. Moving on to the application space, the membrane includes spam filters and user-level authentication mechanisms.
Rather than give a subliminal message, let me state it as loudly and plainly as a used car commercial before a Memorial Day sale: it’s preferable to create the membrane with off-the-shelf technologies rather than to try to develop your own. The capabilities and features needed for this layer are usually standard to the industry, and thus it makes sense to use vendor solutions. In fact, one trend is to have many of the functions needed by the membrane handled by special-purpose hardware appliances.
Alternatively, if you plan to outsource operations, then let the hosting provider worry about the make-up of the membrane. Still, you have to define the required levels of service and make certain the monitoring tools and processes exist to ensure these levels. Either way, the membrane is a component that’s rapidly becoming commoditized. A good thing too, for this is not where you ought to seek competitive IT differentiation (that is, unless you are one of those hosting providers!).
To sum up, the membrane is not the area to invest in internal development. The challenge is to create and view the membrane as an integrated component that can be managed and monitored in a holistic manner even if it consists of an assemblage of products and technologies. If you are creating a membrane you should focus on sound engineering work and vendor component integration, not software development.
Ultimately, a well-designed membrane should be trustworthy enough to allow some relaxation of the security levels inside the system. Also, a well-designed membrane should be flexible enough to allow support for a variety of access devices.  Once you take care of your system’s membrane you can then focus on what happens inside, where the real work takes place, with the Orchestrators.
This is next. . .


Friday, October 2, 2009

On the Granularity of Services


You’re seated in a fancy restaurant ready to enjoy a nice gourmet meal. The waiter shows up with the menu, but instead of a list of entrees and appetizers, you are confronted with a catalogue of recipes. You order a Tuna Tartare as appetizer. The waiter stares at you with a bewildered expression on his face. “Pardon?” he asks. “I’d like a Tuna Tartare,” you insist. He doesn’t understand, and it finally hits you: he’s expecting you to guide him through each step of the recipe. “Heck,” you think, “this must be some kind of novelty gimmick, like Kramer’s make-your-own-pizza idea in a classic Seinfeld episode,” and so you begin the painstaking process of preparing the appetizer:
“Please get 3 ¾ pounds of very fresh tuna. Dice the tuna into 1/4-inch cubes and place it in a large bowl.” The waiter scribbles furiously. “Got this part, sir, I’ll be right back!” he says as he dashes to the kitchen to begin preparing your order.
Reading from the menu, you continue when he returns by requesting that he combine 1 ¼ cups of olive oil, the grated zest of 5 limes, and 1 cup of freshly squeezed lime juice in a separate bowl. He runs back to the kitchen before you get a chance to tell him to also add wasabi, soy sauce, hot red pepper sauce, salt, and pepper to the bowl. . .
You get the idea. There are different ways to ask for services. Let’s think of a more realistic computer design choice. Say you need to calculate the day of the week (What day does 10/2/2009 fall on?). If you were to define “Calculate-Day-of-the-Week” as a service, then you would be expected to allow this service to run on any computer, anywhere in the world (remember the transparency credo I covered earlier!), and to be reachable via a decoupled interface call. If you were to answer, “Okay! No problem”, I would then have to ask you whether this is actually a sensible option. What would be the potential performance impact of having to reach out to a distant computer every time a day of the week calculation is needed?
Remembering the definition of services that I provided earlier, you insist that “Calculate-Day-of-the-Week” is definitely a service that provides a direct business value.
For SOA purposes a service represents a unit of work that a non-technical person can understand as providing a valuable stand-alone capability.
You can argue that “Calculate-Day-of-the-Week” is in fact a unit of work that the salesperson, a non-technical person, can understand and that she will need to access with her Blackberry. In that case, I would then yield to the argument because you have shown that the calculation has business logic that is relevant to your company.
If, on the other hand, “Calculate-Day-of-the-Week” is needed only by programmers, and there is no requirement for it to be directly accessed by anyone in the business group, then this is something that should be handled as a programming function and not as a service. 
If the reason “Calculate-Day-of-the-Week” is needed is because the calculation is part of a broader computation, say to find out whether a discount applies to a purchase (“10% off on Wednesdays!”), then the real service ought to be “Determine-Discount” and not a day of week calculation. You see, defining what constitutes a service can be somewhat subjective.
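The distinction can be sketched in a few lines of Python. This is only an illustration under assumed names: `day_of_week` is the plain programming function, while `determine_discount` stands for the operation a hypothetical “Determine-Discount” service would expose. The 10% Wednesday rule comes from the example above; the function signatures are mine.

```python
from datetime import date

_DAY_NAMES = [
    "Monday", "Tuesday", "Wednesday", "Thursday",
    "Friday", "Saturday", "Sunday",
]


def day_of_week(day: date) -> str:
    """Plain function: fine-grained, called locally, no service interface."""
    return _DAY_NAMES[day.weekday()]  # date.weekday() returns 0 for Monday


def determine_discount(purchase_date: date, amount: float) -> float:
    """Operation behind the hypothetical Determine-Discount service.

    The day-of-week calculation is an internal detail; the business
    capability exposed to clients is the discount itself.
    """
    if day_of_week(purchase_date) == "Wednesday":
        return round(amount * 0.10, 2)  # "10% off on Wednesdays!"
    return 0.0


friday_name = day_of_week(date(2009, 10, 2))
wednesday_discount = determine_discount(date(2009, 10, 7), 200.0)
friday_discount = determine_discount(date(2009, 10, 2), 200.0)
```

A client of the service never learns that a weekday calculation was involved, which leaves you free to change the discount rules later without touching the interface.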
Your team should apply similar reasoning when determining services: calculating the hash value of a field is a function, not a service. Obtaining passenger information from an airline reservation system is a service, but appending the prefix “Mr.” or “Ms.” to a name should not be considered a service.
Now, to be fair, there will always be those fuzzy cases that require your architecture team to make a call on a case-by-case basis. If obtaining a customer name is needed for a given business flow, then it can be considered a service. However, if obtaining the customer name is one step in a business process that assembles all customer information (address, phone number, etc.), you should really have a “Get-Customer-Information” service so as not to oblige the client to request each information field separately.
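As a rough sketch of the coarse-grained choice, a “Get-Customer-Information” operation might look like the following in Python. The record shape, the sample data, and the function name `get_customer_information` are all hypothetical; the only thing the sketch is meant to show is one request returning the whole record.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CustomerInformation:
    """Everything a client typically needs, returned in one reply."""
    name: str
    address: str
    phone: str


# Hypothetical backing data; a real service would obtain this elsewhere.
_CUSTOMERS = {
    42: CustomerInformation("Pat Doe", "1 Main St", "555-0100"),
}


def get_customer_information(customer_id: int) -> CustomerInformation:
    """Coarse-grained operation: one round trip for the whole record.

    The fine-grained alternative (get_name, get_address, get_phone) would
    cost the client three separate requests for the same business need.
    """
    return _CUSTOMERS[customer_id]


info = get_customer_information(42)
```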
In general, when it comes to services, it is better to start with fewer, coarser services and then move on to less coarse services on a need by need basis. In other words, it’s better to err on the side of being coarse than to immediately expose services that are too granular. It’s ultimately all about using common sense. Remember the restaurant example. When you order food in a restaurant it’s better to simply look at the menu and order a dish by its name.
Finally, even if a function is determined not to be a service, and therefore does not need to be managed with the more comprehensive life-cycle process used for services, there is no excuse for not following best-practices when implementing it. Just as with services, make certain the function is reusable, that it does not have unnecessary inter-dependencies, and that it is well tested. You never know when you may need to elevate a function to become a service.
But most importantly, the secret sauce in this SOA recipe is the interface: both services and functions must have well-defined interfaces.
More on this next week!


Friday, September 25, 2009

A Sample View of Services in a System


Before I present a sample view of services as applied to a hypothetical airline reservation flow, I would like to cover yet other dimensions to the categorization of services: the manner of the service delivery and its disposition.
In SOA, as in life, for any given service request you will be deciding between these three service delivery patterns:
Asynchronous: The service request is posted by the client with the expectation that the action will take place at the server’s leisure and that the client will not expect a related response from the service call. As the client is not waiting for an immediate response, he/she can continue to do whatever it is that clients do. The service is posted asynchronously and possibly queued up in a wait area until the system (the server) is able to process it. Any response resulting from processing the asynchronous request will also be sent back asynchronously to the client.  The service interface designer is responsible for defining and filling-in a correlation identifier, if there is a need to match a request with its response. An example of asynchronous exchanges is a request for support from a vendor via email, with the expectation (but not certainty!) of a reply sometime later. Implicit in the way asynchronous services are handled is the idea that a queuing system of some sort must exist in the system infrastructure to properly handle the various aspects related to this pattern: How do we ensure delivery of the request? How do we prioritize the handling of the service? In the email example, the mail server takes care of all these details.
Query/Reply: This is the predominant service pattern for transactional systems. Here, a service request is made with the expectation that a response to the request will be given immediately. The fact that the client actually waits for a response gives this pattern very definite sensitivities as far as performance is concerned. If email exchanges are a representative example of asynchronous exchanges, you can think of chat or even telephone exchanges as an example of Query/Reply. Instead of sending an email, you establish a two-way real-time dialogue between yourself and the service provider.
Event Driven: This pattern is also known as Pub/Sub because it relies on the Publish/Subscribe idea. In this pattern the service request is for an asynchronous response that will take place upon the satisfaction of the event criteria. This pattern is typically used in a manner similar to the asynchronous pattern in that the calling client need not wait for a response, but on occasion, a design may call for the client issuing the subscription request to sit idly by until a response occurs.
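The three delivery patterns above can be sketched side by side in Python. This is a single-process toy under stated assumptions: the function names, the in-memory queues, and the flight and fare strings are invented for illustration, and a real system would rely on proper messaging infrastructure for delivery guarantees and prioritization.

```python
import queue
import uuid
from collections import defaultdict


def query_reply(flight: str) -> str:
    """Query/Reply: the caller waits until the answer comes back."""
    return f"{flight}: on time"


# Asynchronous: requests and replies travel through queues, and a
# correlation identifier lets the client match a reply to its request.
request_queue = queue.Queue()
reply_queue = queue.Queue()


def post_async(flight: str) -> str:
    correlation_id = str(uuid.uuid4())
    request_queue.put({"correlation_id": correlation_id, "flight": flight})
    return correlation_id  # the client carries on immediately


def server_drain() -> None:
    """The server works through the queued requests at its leisure."""
    while not request_queue.empty():
        message = request_queue.get()
        reply_queue.put({
            "correlation_id": message["correlation_id"],
            "status": query_reply(message["flight"]),
        })


# Event Driven (Pub/Sub): clients register interest in a topic and are
# called back only when a matching event is published.
subscribers = defaultdict(list)


def subscribe(topic: str, callback) -> None:
    subscribers[topic].append(callback)


def publish(topic: str, event: str) -> None:
    for callback in subscribers[topic]:
        callback(event)


sent_id = post_async("XX100")
server_drain()
reply = reply_queue.get_nowait()

received = []
subscribe("fare-drop", received.append)
publish("fare-drop", "NYC-LON fare lowered")
```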

With that covered, let me now display a sample generic service flow, orchestrated from a putative airline reservation application, with a client requesting, via a natural language interface, all available flights to a chosen destination. The example shows a number of service types interacting to construct the appropriate response.

Hopefully most of the diagram is self-explanatory. The services shown to the left of the vertical line are meant to indicate those that can be accessed by the outside world and are thus considered to be Access Services. The Natural Language Parser, for example, could well be an external service provided by another company on a SaaS basis. The Update Customer Profile service could be available to external B2B partners to update the profile as per commercial agreements. Clearly, the Find Best Fare service could also be made available externally if desired, but the example here depicts a Best Fare service that applies internal rules that we do not wish to expose to the world. The various services could be classified as follows:

CLASS / PATTERN (QUERY/REPLY, ASYNC, PUB/SUB)

Access: NATURAL LANGUAGE PARSER (Atomic), UPDATE CUSTOMER PROFILE (Atomic)
Business: FIND BEST FARE (Composite), AVAILABILITY & PRICE (Atomic), GET PROMOTIONS (Atomic), AIRLINE SPECIAL FARE (Atomic)
System: ENCRYPT CUSTOMER DATA (Atomic)

Understanding the attributes of each service will enable you to apply predefined standards for their use. For example, Access services will be expected to provide public interfaces and will be hardened for public use. Access services are also cases where you may have to keep state across multiple service calls. Such state information may have to be kept in non-volatile storage (disk, or replicated cache). Composite services will be allowed to keep some state, but only for the duration of the service call. Atomic services will have highly streamlined execution paths, including avoidance of state.
Ideally, everything would follow an asynchronous pattern in the sense that not having to wait for a response is the most flexible way to optimize system resources and meet service level agreements. However, the reality is that you may need to adjust the overall solution against this ideal. Designing a system to do asynchronous messaging whenever possible may be seen as a desired goal, but the fact of the matter is that you will need to use the Query/Reply pattern more often than not, particularly in transactional environments, to ensure prompt responses. Alas, I have witnessed actual designs that attempt to force asynchronous patterns in transactional systems (making everything flow through queues), resulting in very odd behaviors and unsatisfactory performance.
Finally, there is another category of “services” I have not yet defined. These are the services needed to facilitate or simplify the workings of SOA. I refer to these services as meta-services, and they are usually called upon to perform a specific SOA activity such as ensuring transaction integrity, forking the delivery of a service request, routing a service request to an alternate location, and other tasks.  In fact, there are various patterns identified for these types of services, but the most interesting aspect is that a portfolio of meta-services is being bundled and it comprises a large portion of what is rapidly evolving to be a separate SOA middleware enabling layer: The Enterprise Service Bus. 
If you recall the SOA taxonomy that I presented earlier, I will include these Enterprise Service Bus elements as part of the Service Fabric, to be discussed at a later date.


Friday, September 18, 2009

Service Categories

Pursuing the analogy of SOA services mirroring the way civilization is structured around individuals and institutions that provide services to others can help us to understand the value of classifying the classes and types of services. After all, if you think of a repair shop, for example, one can see that the services provided by the receptionist are not the same as the services provided by the repair technician in the backroom, or even the services of the cashier who will collect your payment later on.  You would not normally consider putting the receptionist in a repair role, or the repair technician in the role of cashier. It’s best when every individual performs the role for which they are optimally qualified. 
Then there is the manner in which the individuals render their services. Some have a role as orchestrators of other people’s work. The shop manager is someone who offers a coordination service that makes the business run coherently. Others perform a specific, specialized role. They do what they do without requiring others’ help (like those who order you out of the kitchen!). Then there are those whose main role relies on accessing a data repository of sorts. They front-end actual information resources. How much salary you can afford and how much value and attention you confer on each ultimately depends on a combination of all these attributes.
Similarly, with SOA we can define three service classes: Access Services, Enterprise Services, and System Services. 

Access Services are often implemented as wrappers for legacy applications and encapsulate the internal business service logic while taking a role as proxies to the clients. This role usually involves keeping and managing the state of a business flow on behalf of the external client. This is often needed in order to hide proprietary logic from external users. As a consequence, interfaces in this class of “services” will be extremely coarse and therefore somewhat verbose. In travel, for example, the Open Travel Alliance XML protocols tend to be extremely elaborate because of the very high level of service interface abstraction required for diverse companies to interoperate.
If you have control of the actual calling applications, then you can implement the orchestration logic directly within them and, provided the applications run internally, inside your DMZ scope, they can bypass the Access Services by calling the backend enterprise business services directly.
Access Services and Enterprise Business Services classes are best defined in terms of their business value—as something easy to explain in terms of the business services they provide. But just as there are business services, there are also system services. 
System Services are the services that support the system and don’t usually have a direct business mapping, even though they indirectly support the business. Still, remember that, as per our definition, you ought to be able to explain what these services offer to a layman, even if the services are not providing direct business functionality. Just because a service has a system focus, as opposed to a direct business focus, does not mean that the service should be so fine-grained that it simply serves some obscure function which could be better handled via a library or a subroutine.
All service classes should comply with the guidelines and standards established for service life-cycle, but I suggest that the specific elements of the lifecycle will be different for each service class. Access Services will probably be public and should be normalized to industry standards as much as possible, while Enterprise Services should be governed by your internal architecture group. The likely user of the System Services will be your company’s operation team.

Now, let’s talk about Service Types. . . Just as services can be divided into Access, Enterprise Business, and System Services classes, they also need to be classified based upon their intrinsic roles and in the way they are internally structured so that you can assign their maintenance to the proper development organization.
Services that implement functionality requiring access to other services are known as Composite Services. Services of this type may at times keep a session state, but only for the duration of the execution (having services keep state across multiple service calls is generally not recommended, unless the service is an Access Service as discussed above).  On the other hand, services that provide function without needing to call upon other services are known as Atomic Services.  These services provide coarse-grained functionality in a single-shot.  A specific type of Atomic Services is Data Access Services.  The latter supports one of the key principles in SOA: avoidance of data base visibility from functional services, whether Composite or Atomic, and the interfacing of all interactive data requests via service interfaces. The diagram below gives an example of the kind of service classes and types you would see in a reservation system. 
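Here is a minimal Python sketch of the Composite/Atomic distinction, using reservation-style names. The functions, fare data, and promotion rate are invented for the example, and the calls are plain in-process functions standing in for what would be service invocations in a real deployment.

```python
from typing import Dict, List, Optional


def availability_and_price(destination: str) -> List[Dict]:
    """Atomic service: answers in a single shot and calls no other service."""
    fares = {  # hypothetical data
        "LON": [
            {"flight": "XX100", "price": 500.0},
            {"flight": "XX200", "price": 450.0},
        ],
    }
    return fares.get(destination, [])


def get_promotions(destination: str) -> float:
    """Atomic service: returns the promotional discount rate, if any."""
    return {"LON": 0.10}.get(destination, 0.0)


def find_best_fare(destination: str) -> Optional[Dict]:
    """Composite service: builds its answer by calling other services."""
    offers = availability_and_price(destination)
    if not offers:
        return None
    rate = get_promotions(destination)
    cheapest = min(offers, key=lambda offer: offer["price"])
    # 'offers' and 'rate' are state kept only for the duration of this call.
    return {
        "flight": cheapest["flight"],
        "price": round(cheapest["price"] * (1 - rate), 2),
    }


best = find_best_fare("LON")
nothing = find_best_fare("ZZZ")
```

Note that neither kind of service touches a data base directly here; in a fuller design the fare and promotion lookups would themselves go through Data Access Services.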

Next week I will cover the service delivery patterns and I will provide an example of how all these service categories fit into an actual system design. Till then!


Friday, September 11, 2009

The Services Layer


Getting back to SOA. . .  Because of SOA’s roots as a reuse technique that evolved from the reuse of functions and libraries, it’s not surprising that there is a degree of debate on what actually constitutes a service. There is a view out there that services are akin to “functions on steroids”. The problem with this perspective is that functions are typically too fine-grained to truly stand alone or to be distributed efficiently. Dynamic libraries, functions, and object classes are not really services for the simple reason that these reuse elements are not designed with the attribute of location transparency. Distributing them in a willy-nilly fashion would have severe performance implications and thus is not recommended practice.



In the SOA taxonomy there is a hierarchy of elements with specific purposes. There are the elements needed to allow code reuse, and then there are services and applications. The primary reuse elements (libraries, functions, and objects) are ideal to allow the reuse of code within an application or a specific service implementation, but with services we are trying to reuse business processing elements; not necessarily code.
In the legacy view, an application is a monolithic piece of code that does everything for you: grabs the data, massages this data, performs calculations, determines the execution flows, presents the information, accepts and validates the user input, and so on. With SOA, the application is simply the entity tasked with handling the business workflow, keeping all manner of state, and acting as the orchestrator responsible for calling the appropriate services. The application drives the sequencing of service calls and interactions with the user for purposes of delivering a clear business function. An application is thus the brain of the solution, but in SOA, the brunt of the work of implementing each specific business process or data manipulation aspect is expected to be performed by the underlying service fabric, not by the application.
The role of a service in this new paradigm is to provide a discrete function representing encapsulated logic with the following characteristics that must meet many of the transparency tenets discussed earlier:
· Designed and implemented independently from the specifics of the client or requester.
· Movable. The service can be executed anywhere “in the cloud”.
· Any service may be invoked by any qualified client, or by any other service, without having to change the implementation of the service.
· Encapsulable. While some services are entirely self-contained, others rely on other services for a portion of their logic.
· Replicable. A service may be available from more than one server. It can’t be assumed that it is the only allowed instance in the system.
· Interface-driven. Having an interface decoupled from the implementation with the client seeing only the interface and never having to worry about the details of the implementation. This formally defined service interface represents an unbreakable contract with the client.
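The interface-driven and replicable characteristics in the list above lend themselves to a short Python sketch. The names `FlightStatusService`, `ComputedFlightStatus`, and `TableFlightStatus` are hypothetical; `typing.Protocol` (Python 3.8 or later) is simply one convenient way to write down a contract that is separate from any implementation.

```python
from typing import Protocol


class FlightStatusService(Protocol):
    """The contract the client sees; nothing here says how it is fulfilled."""

    def get_status(self, flight_number: str) -> str:
        ...


class ComputedFlightStatus:
    """One implementation (hypothetical): builds the answer on the fly."""

    def get_status(self, flight_number: str) -> str:
        return f"{flight_number}: on time"


class TableFlightStatus:
    """A second instance, possibly on another server, backed by a lookup table."""

    def __init__(self) -> None:
        self._table = {"XX100": "XX100: on time"}

    def get_status(self, flight_number: str) -> str:
        return self._table.get(flight_number, f"{flight_number}: on time")


def show_status(service: FlightStatusService, flight_number: str) -> str:
    # The client depends only on the interface, never on the implementation.
    return service.get_status(flight_number)


first = show_status(ComputedFlightStatus(), "XX100")
second = show_status(TableFlightStatus(), "XX100")
```

Either implementation can be moved, replaced, or replicated without the client changing a line, which is the whole point of treating the interface as the contract.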
In this context, it’s easy to see that many other popular reusable objects or components, such as Dynamic Libraries, Portlets, and RMI-Callable Objects, aren’t really services, because they lack one or more of these characteristics.
The key point in describing a service is this:
For SOA purposes a service represents a unit of work that a non-technical person can understand as providing a valuable stand-alone capability.
Admittedly, this definition allows some wiggle room, but it hopefully helps to more crisply define what can and cannot be considered a service. Let me emphasize the “non-technical” bit: if you can’t explain what the service does in non-technical terms, then it is probably not a service. “The service calculates the Modulo 10 checksum of the credit card number,” does not mean much. “The service checks that the credit card number is valid,” sounds like a true service. This discussion is important because maintaining a service life-cycle and properly managing and administering service repositories and deployment can be an extremely expensive and complex proposition.
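The credit card example translates directly into code. In the Python sketch below, the checksum is the internal, technical function and the validity check is what the service would expose. Two assumptions to flag: I use the Luhn (modulo 10) algorithm, which is the checksum commonly applied to card numbers, and the 12 to 19 digit length window is a simplification of mine; the function names are hypothetical.

```python
def luhn_checksum_ok(digits: str) -> bool:
    """Technical function: Luhn (modulo 10) check over a string of digits."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:  # double every second digit, counting from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def is_credit_card_number_valid(card_number: str) -> bool:
    """Service-level operation: 'checks that the credit card number is valid'."""
    cleaned = card_number.replace(" ", "").replace("-", "")
    if not (cleaned.isascii() and cleaned.isdigit()):
        return False
    if not 12 <= len(cleaned) <= 19:  # assumed length window for this sketch
        return False
    return luhn_checksum_ok(cleaned)


valid = is_credit_card_number_valid("4111 1111 1111 1111")
invalid = is_credit_card_number_valid("4111 1111 1111 1112")
```

A business user can ask whether a card number is valid without ever hearing the word “checksum”, and that is the test a service description should pass.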
Creation and management of services follow a lifecycle very similar to that of applications, but the types of governance, skills, controls, and rules governing their workflow are different and must be handled by specialized technical staff. For this reason, identifying what does and doesn’t constitute a service is not sufficient. There is also a need to understand the general role of the service.
There’s an anonymous saying, “There are two types of people in this world: those who divide people into two types, and those who don’t.” I confess. I am firmly in the former group. You can extrapolate that quote to: “There are two types of people in this world: those who classify services into several types and those who don’t.”
Knowing how to classify services according to classes, types, and delivery patterns, as I do in the following sections, helps define the framework for their use, their limitations, and their ultimate scope. More importantly, if you plan to establish a well-constructed service portfolio to ensure that all services in the portfolio are created, deployed, maintained, and sunset according to very strict life-cycle rules, then you will do well to identify the degree of control you will establish for each of these components.
Later on I’ll discuss the aspects related to the governance of services, including the organization and lifecycle required to develop and maintain services, but for now let’s delve deeper into the classes of services available in our SOA toolkit.
