Wednesday, 6 January 2016

Analysing Asynchronous Transactions and the Delivery Service in SOA Suite 11g

Among the many features provided by Oracle SOA Suite, I believe asynchronous message handling is one of the most powerful and important capabilities on offer.  It enables us to have long running transactions via the dehydration store, rich error handling via the error hospital and callbacks via WS-Addressing among others.

Unfortunately, while this capability is very powerful, I find it is rarely fully understood, especially when considering performance and transactionality.  Furthermore, understanding what happens 'under the hood' can help us to design better, more performant and resilient BPEL processes.

In this blog I will detail how the delivery policy plays a pivotal role in asychronous message exchanges and how it can impact the design of SOA services and integrations.  I will use a simple example of using a number of composites to integrate two databases, but the detail of each composite is not important.

Asynchronous One Way via the Delivery Service


















The diagram above illustates how threads and transactions are managed in the default configuration.  By default, SOA Suite inserts all new messages (whether from JCA adapters or web services) into the delivery service before they are processed by BPEL to ensure there is no message loss, even on a server failure.  As messages become available in the delivery service, a set of threads known as invoker threads pick these messages from the service and deliver them to the BPEL engine for further processing.  In addition, each BPEL process instance and subsequent insert back into the delivery service is performed in separate transactions.

This approach has the following advantages:

1. No message loss, even if the server were to fail mid-process - the message is stored in the delivery service ready to be processed when the server becomes available again.  Assuming that the BPEL process interacts with external data sources in an XA transaction, no commit will have taken place prior to the failure and therefore the message can be re-processed.  Note however, that special care must be taken when working with non XA resources such as external web services.  These services will commit within their own transaction, and re-processing a message can lead to duplicate data or primary key violations.

2. Scalability.  Placing a message in the delivery service allows messages to be processed when a thread becomes available as opposed to executing immediately in the thread that created the composite instance.  This has the advantage of allowing for fine grained thread control.  For example, one could restrict or increase the number of invoker threads available on the server to control the throughput of asynchronous transactions - messages can remain queued in the delivery service until a pooled invoker thread becomes available.

However, there are also some disadvantages:

1. Inserting into the delivery service provides no guarantee on when the message will be picked up and processed.  With sufficient server load and many running instances, it may be a number of seconds before an invoker thread is available.  This has obvious implications for low latency applications - a message may not arrive at its destination in the timeframe set out by an SLA.

2. Inserting into the delivery service requires additional database persistence; this is because before a message can be processed by the BPEL engine, it must first be stored in the dlv_message table.  Of course, this approach also has an advantage (no message loss) as described above.

3. 'Context switching' between pooled invoker threads happens more frequently because the thread responsible for inserting a message into the delivery service may not be responsible for processing it.

4. As a message traverses through a number of composites as part of an integration, these composites may contain an arbitrary number of activities such as interacting with file systems, web services and databases etc.  As each composite completes, a commit will take place.  This may be adequate for some software systems, but for systems which require an atomic transaction across the entire integration,  this can lead to data inconsistencies if a server were to fail or an error causes a process to go to manual intervention.

Asynchronous One Way with In-memory Delivery Service


















To reduce the level of persistence and increase overall message throughput, it is possible to place a message into an in-memory queue as opposed to a database table before it is delivered to a BPEL process.  However, while this configuration provides increased performance, messages can be lost if the SOA server were to fail mid-transaction.  In addition, work performed by each composite is committed independently (i.e. non-atomic), again resulting in data inconsitencies should the server fail.

Given these disadvantages, this is not a recommended approach.

Asynchronous One Way with no Delivery Service


















An alternative configuration is to remove the delivery service and force the integration to execute as it would in a synchronous scenario - where a single thread is used throughout, with each composite inheriting the thread and using it to process the message.  From a transaction point of view, a new transaction is created at each composite enty point, any existing transaction is suspended until the child transaction has completed.

The primary advantages of this approach are that the delivery service persistence is reduced and the number of thread context switches is minimised, both leading to a higher transaction throughput in a number of use cases.  If your application interacts with low latency resources such as databases, JMS or Advanced Queues, then this can be a very useful configuration.

The disadvantage however, is that managing these separate transactions becomes more complex and requires careful design.  For example, if a child composite encounters a fault, you have a choice to handle the fault within the child or let the fault bubble up to the parent composite (which would not be possible using the delivery service).  If the fault is handled in the child composite, a commit will take place as normal and execution continues.  If it is not handled, the child composite will roll back and a FabricInvocationException will be returned to the parent composite.  The parent then has the choice to handle the fault locally or bubble it up to another parent composite if it exists.

In addition, a further issue can arise if your application interacts with other, high latency services.  When you set the delivery policy to synchronous, you are effectively coupling any polling thread to the processing of a message, which can lead to lower throughput if the thread is blocking waiting for a reply from another service.

In summary, removing the delivery service can have a profound impact on throughput and may be the configuration of choice for high volume environments.  However, a designer must understand the finer details of a process to ensure that transactions and fault conditions are handled appropriately.  In addition, one must also understand the impact of using a polling thread to process a message, as opposed to delegating it to an invoker thread (i.e. using the delivery service).

Asynchronous One Way (No Delivery Service, Inherit Transaction)


















A slight variation on the approach above is to continue without a delivery service, but ensure that the entire integration is performed in a single transaction.  This approach maintains all of the performance advantages, but can also make the design simpler because transactions do not have to be managed locally in each composite.

For example, should a fault occur in a child composite, the handling of this fault will directly impact the transaction state of all composites within the flow.  If the fault is handled, all XA resources in each composite will commit - if it is not handled, then all resources will rollback.

I should point out however, that in our example of a polling database adapter as the message source, if a fault occurs, the message will be propogated back up to the adapter if it is not handled in any of the composites.  The side effect of this is that by default, the database adapter will attempt to retry the transaction many times, leading to the creation of additional instances until the transaction succeeds.

One may conclude that this must be the best approach, and in many use cases it is, mainly because of the single, atomic transaction.  However, there are a number of other scenarios which require the transaction to be split between composites.  A simple example would be that of logging - if we have a requirement to persist log messages to a database table, we would want those messages to commit regardless of the state of the business transaction.  We could achieve this by using a separate logging composite, with the transaction parameter set to 'requiresNew'.

In summary, there is no definitive configuration that will suit all integrations styles.  One must first understand all of the application requirements and SLA's before the most appropriate configuration is chosen.

No comments:

Post a Comment