Sunday, May 19, 2013

How to deal with services that don't support concurrency? Offer requests one at a time.

When developing service orchestrations with Oracle SOA Suite, a frequently encountered problem is dealing with unreliable services. These can be services which cannot handle multiple simultaneous requests (they don't support concurrency) or which don't have 100% availability (usually due to nightly batches or scheduled maintenance). One way to work with these services is to have a good error handling or retry mechanism in place. For example, I've previously described a fault handling mechanism based on Advanced Queues (AQ); http://javaoraclesoa.blogspot.nl/2012/06/exception-handling-fault-management-and.html. Using this mechanism, you can maintain the order of processing for messages and retry faulted messages. It would however be better to avoid faults altogether. If a service does not support concurrency (for example because of platform limitations or statefulness), messages have to be offered one at a time.

If the service has a quick response time, you can make the process which picks up messages from an AQ synchronous and thus have only one running process at a time. This has been described at; http://mazanatti.info/index.php?/archives/81-SOA-Suite-11g-truly-singleton-with-AQ-Adapter.html. It's a recommended read.

In this blog post I'll describe a mechanism which can be used if a synchronous solution does not suffice, for example for long running processes. The purpose of this blog post is to illustrate a mechanism and its components. It should not be used as-is in a production environment but should be tweaked to business requirements.

Implementation

I'll describe a database based mechanism which consists of several components;
- A database table holding process 'state'. In this example, CREATED, ENQUEUED, RUNNING, DONE
- A DBMS_SCHEDULER job which polls for changes. In my experience this is more stable than using the DbAdapter to do the same.
- A priority AQ to offer messages to BPEL in a specific order and allow loose coupling/flexibility/error handling mechanisms. In my experience this is very reliable.
- A BPEL process consuming AQ messages and calling the service which doesn't support concurrency. There should be only one running instance of this process at a time.


I've created a process state table which holds the process states and provides state history. I've also created a view on this table which only displays the current state. There is a column in the table PROC_NAME. This corresponds to the subscriber used in the BPEL process.
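To make the idea of a history table with a 'current state' view concrete, here is a minimal sketch in Python using an in-memory SQLite database instead of Oracle. The table name ms_process_history, the view name ms_process_current and all column names except proc_name are assumptions for illustration; the supplied database script may define them differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- History table: one row per state change (hypothetical layout).
    CREATE TABLE ms_process_history (
        id         INTEGER PRIMARY KEY AUTOINCREMENT,
        process_id INTEGER NOT NULL,
        proc_name  TEXT    NOT NULL,
        state      TEXT    NOT NULL
    );
    -- View which only shows the most recent state of every process.
    CREATE VIEW ms_process_current AS
        SELECT s.process_id, s.proc_name, s.state
        FROM ms_process_history s
        WHERE s.id = (SELECT MAX(h.id) FROM ms_process_history h
                      WHERE h.process_id = s.process_id);
""")

def record_state(process_id, proc_name, state):
    """Append a state change; existing rows are never updated."""
    conn.execute(
        "INSERT INTO ms_process_history (process_id, proc_name, state) VALUES (?, ?, ?)",
        (process_id, proc_name, state))

def current_states():
    """Return {process_id: current state} as seen through the view."""
    rows = conn.execute(
        "SELECT process_id, state FROM ms_process_current ORDER BY process_id")
    return dict(rows.fetchall())

# Process 1 runs through all states, process 2 has only just been created.
for state in ("CREATED", "ENQUEUED", "RUNNING", "DONE"):
    record_state(1, "HELLOWORLD", state)
record_state(2, "HELLOWORLD", "CREATED")
print(current_states())  # {1: 'DONE', 2: 'CREATED'}
```

The history stays complete because rows are only ever inserted, while the polling job only has to look at the view.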

Every minute, a database job polls for records with state CREATED. If one is found and no other process is in state ENQUEUED or RUNNING, a new message is enqueued. I've split the states ENQUEUED and RUNNING to be able to identify which messages have been picked up by the BPEL process and which haven't. There should only be one process in state RUNNING at a time.
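The decision taken in a single polling run can be sketched in a few lines of Python. This is an illustration of the rule only, not the PL/SQL from the supplied script; the function name poll_once and the dictionary holding the current states are hypothetical, and the assignment to ENQUEUED stands in for the actual AQ enqueue.

```python
def poll_once(states):
    """One polling run; `states` maps process id -> current state.

    Returns the id which was enqueued, or None if nothing may be enqueued.
    """
    # Gate: offer nothing new while a message is waiting or being processed.
    if any(s in ("ENQUEUED", "RUNNING") for s in states.values()):
        return None
    waiting = sorted(pid for pid, s in states.items() if s == "CREATED")
    if not waiting:
        return None
    next_id = waiting[0]          # oldest first, assuming ids increase on insert
    states[next_id] = "ENQUEUED"  # stands in for the AQ enqueue
    return next_id

states = {1: "CREATED", 2: "CREATED", 3: "CREATED"}
first = poll_once(states)    # enqueues process 1
second = poll_once(states)   # None, process 1 is still ENQUEUED
states[1] = "DONE"           # BPEL has finished processing
third = poll_once(states)    # enqueues process 2
print(first, second, third)  # 1 None 2
```

Because at most one message is ever on the queue or in progress, the BPEL process can never have two instances running for the same subscriber.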

I've created a simple HelloWorld BPEL process. This process polls for messages on the AQ. It picks up a message and informs the database that it has done so (it sets the state to RUNNING). Next I've stubbed calling a service with a wait of one minute. After the wait is over, the state is set to DONE. The process looks as follows;



At the end of this post you can download the code. To run the example however, the database needs to have a user TESTUSER with the correct grants to allow enqueueing/dequeueing (see supplied script). Also, in Weblogic server there needs to be a JDBC datasource configured and a connection factory (eis/AQ/testuser) defined in the AqAdapter. You can find an example for configuring the DbAdapter at http://kiransaravi.blogspot.nl/2012/08/configuring-dbadapter-its-datasource.html. Configuration for the AqAdapter is very similar.

Running the example

First you need to create the table, trigger, AQ, package, DBMS_SCHEDULER job. This can be done by executing the supplied script.

To start testing the mechanism you can execute the following;

begin
insert into ms_process(proc_name,proc_comment) values('HELLOWORLD','Added new record for test 1');
insert into ms_process(proc_name,proc_comment) values('HELLOWORLD','Added new record for test 2');
insert into ms_process(proc_name,proc_comment) values('HELLOWORLD','Added new record for test 3');
commit;
end;

This will insert 3 records in the process table. These messages will be picked up in order. For implementations in larger applications I recommend using the PROC_SEQ field in the process table to obtain the information required for processing.

After a couple of minutes, you can see the following in the process state table;


As you can see, the messages were created at approximately the same time. The messages are picked up in order of insertion (based on ProcessId). Also as can be seen from the table, when a process is running (the period between state RUNNING and DONE), no other processes are running; there is no overlap in time.
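The 'no overlap in time' observation can also be checked automatically. Below is a small sketch which takes the RUNNING and DONE timestamps per process and reports whether any two runs overlap. The function name has_overlap and the timestamps are made up for illustration; they are not the values from the table above.

```python
from datetime import datetime

def has_overlap(runs):
    """Return True if any two (running_at, done_at) intervals overlap."""
    ordered = sorted(runs)
    # Sorted by start time, only neighbours have to be compared: every run
    # must start at or after the moment the previous run finished.
    return any(nxt[0] < prev[1] for prev, nxt in zip(ordered, ordered[1:]))

def t(text):
    """Parse a HH:MM:SS time for the example below."""
    return datetime.strptime(text, "%H:%M:%S")

sequential = [(t("10:01:05"), t("10:02:06")), (t("10:03:05"), t("10:04:06"))]
overlapping = [(t("10:01:05"), t("10:02:06")), (t("10:02:00"), t("10:03:00"))]
print(has_overlap(sequential))   # False
print(has_overlap(overlapping))  # True
```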

After processing, the process view indicates the latest process state for every process. All processes are done.


In the Enterprise Manager, three processes have been executed and completed.


AQ in a clustered environment

In a clustered environment you have to mind that in an 11.2 database, AQ messages can be picked up twice from the same queue under load. Since this would break the mechanism, I suggest taking the below described workaround.

Bug: 13729601
Added: 20-February-2012
Platform: All
The dequeuer returns the same message in multiple threads in high concurrency environments when Oracle database 11.2 is used. This means that some messages are dequeued more than once. For example, in Oracle SOA Suite, if Service 1 suddenly raises a large number of business events that are subscribed to by Service 2, duplicate instances of Service 2 triggered by the same event may be seen in an intermittent fashion. The same behavior is not observed with a 10.2.0.5 database or in an 11.2 database with event10852 level 16384 set to disable the 11.2 dequeue optimizations.

Workaround: Perform the following steps:

    Log in to the 11.2 database:
    CONNECT /AS SYSDBA

    Specify the following SQL command in SQL*Plus to disable the 11.2 dequeue optimizations:
    SQL> alter system set event='10852 trace name context forever, level 16384' scope=spfile;

Considerations

The mechanism described can be used to avoid parallel execution of processes, even when the processes are long running and synchronous execution is not an option.

Polling

The mechanism contains polling components; the DBMS_SCHEDULER job and the AqAdapter. This has two major drawbacks;
- it will cause load even when the system is idle
- it allows a period between finishing of a process and starting of the next process

You could consider starting the BPEL process actively from the database (thus avoiding polling) by using for example UTL_DBWS (see for example http://orasoa.blogspot.nl/2006/11/calling-bpel-process-with-utldbws.html). This however requires that the URL of the BPEL process is known in the database and that the ACL (Access Control List) is configured correctly. Error handling should also be reconsidered. The overhead of polling is minor. If a delay of 1 minute + the default AqAdapter polling interval is acceptable, a solution based on the described mechanism can be considered. Also, the DBMS_SCHEDULER job can be scheduled to run more often and the AqAdapter polling behavior can be tweaked to reduce the time lost between polls.

Chaining

Ending the process with a polling action which initiates processing of the next message is not advisable since it raises several new questions;
- what to do if there are no messages waiting? Having a separate polling mechanism next to this one might break the 'only one process is running at the same time' rule
- what to do in case of errors, when the chain is broken?

Retiring/activating

I've tried a mechanism which would retire a process at the start and then reactivate it after completion. This would prevent more than one process from running at the same time. This turned out not to be a solid mechanism. Retiring and activating a process takes time, during which new messages could be picked up. Also, using the Oracle SOA API during process execution adversely affects performance.

Efficiently determining the current state

I've not tested this solution with a large number of processes. In that case I think I should reconsider how to keep a process history and how to get to the current state efficiently in a polling run. Most likely I'd use two tables: one for the current state, which I would update, and a separate table for the history, which I would fill with PL/SQL triggers on the current state table.
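A minimal sketch of this two-table variant is shown below, again in Python with an in-memory SQLite database rather than Oracle PL/SQL. The table, column and trigger names are hypothetical and SQLite trigger syntax differs from PL/SQL; the point is only that the current state table is updated in place while the triggers maintain the history.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- One row per process, updated in place (cheap to poll).
    CREATE TABLE process_current (
        process_id INTEGER PRIMARY KEY,
        proc_name  TEXT NOT NULL,
        state      TEXT NOT NULL
    );
    -- Append-only history, filled exclusively by the triggers below.
    CREATE TABLE process_history (
        process_id INTEGER NOT NULL,
        state      TEXT NOT NULL,
        changed_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
    );
    CREATE TRIGGER process_current_ai AFTER INSERT ON process_current
    BEGIN
        INSERT INTO process_history (process_id, state)
        VALUES (NEW.process_id, NEW.state);
    END;
    CREATE TRIGGER process_current_au AFTER UPDATE OF state ON process_current
    BEGIN
        INSERT INTO process_history (process_id, state)
        VALUES (NEW.process_id, NEW.state);
    END;
""")

conn.execute("INSERT INTO process_current VALUES (1, 'HELLOWORLD', 'CREATED')")
for state in ("ENQUEUED", "RUNNING", "DONE"):
    conn.execute("UPDATE process_current SET state = ? WHERE process_id = 1", (state,))

current = conn.execute("SELECT state FROM process_current").fetchall()
history = [row[0] for row in
           conn.execute("SELECT state FROM process_history ORDER BY rowid")]
print(current)  # [('DONE',)]
print(history)  # ['CREATED', 'ENQUEUED', 'RUNNING', 'DONE']
```

A polling run then only has to read the small current state table, regardless of how long the history has become.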

Download

You can download the BPEL process here; https://dl.dropboxusercontent.com/u/6693935/blog/HelloWorldAQProcState.zip

The database code can be downloaded here (you might want to tidy it up if for example you like CDM);
https://dl.dropboxusercontent.com/u/6693935/blog/processstate.txt

Wednesday, May 1, 2013

Cleaning up unused namespaces in Oracle SOA 11g BPEL processes by using a Python script

Composites are often created and then changed/expanded to implement functionality or bugfixes. When adding new partnerlinks and variables, removing them, importing new XSDs, removing them, etc., it often happens that namespace definitions remain inside for example BPEL processes which are no longer relevant because they are not used anymore. This can adversely affect performance/memory usage and increases the chance of errors when XSDs are changed, removed or added (such as inconsistent duplicate namespace definitions).

I took this issue as a nice opportunity to teach myself a bit of Python. Python is a popular scripting language (http://www.tiobe.com/index.php/content/paperinfo/tpci/index.html) and is used by several software vendors such as Oracle for Weblogic server; WLST (http://docs.oracle.com/cd/E14571_01/web.1111/e13813/quick_ref.htm) and ESRI (http://esripress.esri.com/display/index.cfm?fuseaction=display&websiteID=224&moduleID=0) for GIS related programming. There is an official Python tutorial available on http://docs.python.org/3.3/tutorial/ and I've also used http://www.vogella.com/articles/Python/article.html to learn some basics.

First impression of the Python language
- I like the usage of indentation compared to the use of brackets or end statements
- code completion is far from perfect with the PyDev plug-in for Eclipse when using Python 3.3 (I needed to Google a lot for API documentation)
- even without a background in Python, you can quickly get something working after reading some tutorials (although I do have some experience with other scripting languages like Perl, PHP and JavaScript; I usually use Perl for my regular scripting needs).

Implementation

Purpose

I wanted to create a script which would clean up unused namespaces in XML files. I started with a BPEL file as an example (the same example as used in; http://javaoraclesoa.blogspot.nl/2013/04/soa-suite-ps6-11117-service-loose.html. It can be downloaded here; https://dl.dropboxusercontent.com/u/6693935/blog/HelloWorldTokens.zip). The BPEL file had the following contents;

<?xml version = "1.0" encoding = "UTF-8" ?>
<!--
////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
  Oracle JDeveloper BPEL Designer
 
  Created: Thu Apr 11 12:23:10 CEST 2013
  Author:  Maarten
  Type: BPEL 2.0 Process
  Purpose: Synchronous BPEL Process
////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
-->
<process name="CallHelloWorld"
               targetNamespace="http://xmlns.oracle.com/HelloWorld/CallHelloWorld/CallHelloWorld"
               xmlns="http://docs.oasis-open.org/wsbpel/2.0/process/executable"
               xmlns:client="http://xmlns.oracle.com/HelloWorld/CallHelloWorld/CallHelloWorld"
               xmlns:ora="http://schemas.oracle.com/xpath/extension"
               xmlns:bpelx="http://schemas.oracle.com/bpel/extension"
         xmlns:bpel="http://docs.oasis-open.org/wsbpel/2.0/process/executable"
         xmlns:ns1="http://xmlns.oracle.com/HelloWorld/HelloWorld/HelloWorld">

    <import namespace="http://xmlns.oracle.com/HelloWorld/CallHelloWorld/CallHelloWorld" location="CallHelloWorld.wsdl" importType="http://schemas.xmlsoap.org/wsdl/"/>
    <!--
      ////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
        PARTNERLINKS                                                     
        List of services participating in this BPEL process              
      ////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
    -->
  <partnerLinks>
    <!--
      The 'client' role represents the requester of this service. It is
      used for callback. The location and correlation information associated
      with the client role are automatically set using WS-Addressing.
    -->
    <partnerLink name="callhelloworld_client" partnerLinkType="client:CallHelloWorld" myRole="CallHelloWorldProvider"/>
    <partnerLink name="HelloWorld" partnerLinkType="ns1:HelloWorld"
                 partnerRole="HelloWorldProvider"/>
  </partnerLinks>

  <!--
    ////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
      VARIABLES                                                       
      List of messages and XML documents used within this BPEL process
    ////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
  -->
  <variables>
    <!-- Reference to the message passed as input during initiation -->
    <variable name="inputVariable" messageType="client:CallHelloWorldRequestMessage"/>

    <!-- Reference to the message that will be returned to the requester-->
    <variable name="outputVariable" messageType="client:CallHelloWorldResponseMessage"/>
    <variable name="InvokeHelloWorld_process_InputVariable"
              messageType="ns1:HelloWorldRequestMessage"/>
    <variable name="InvokeHelloWorld_process_OutputVariable"
              messageType="ns1:HelloWorldResponseMessage"/>
  </variables>

  <!--
    ////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
     ORCHESTRATION LOGIC                                              
     Set of activities coordinating the flow of messages across the   
     services integrated within this business process                 
    ////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
  -->
  <sequence name="main">

    <!-- Receive input from requestor. (Note: This maps to operation defined in CallHelloWorld.wsdl) -->
    <receive name="receiveInput" partnerLink="callhelloworld_client" portType="client:CallHelloWorld" operation="process" variable="inputVariable" createInstance="yes"/>
    <assign name="Assign1">
      <copy>
        <from>$inputVariable.payload/client:input</from>
        <to>$InvokeHelloWorld_process_InputVariable.payload/ns1:input</to>
      </copy>
    </assign>
    <invoke name="InvokeHelloWorld"
            partnerLink="HelloWorld" portType="ns1:HelloWorld"
            operation="process"
            inputVariable="InvokeHelloWorld_process_InputVariable"
            outputVariable="InvokeHelloWorld_process_OutputVariable"
            bpelx:invokeAsDetail="no"/>
    <assign name="Assign2">
      <copy>
        <from>$InvokeHelloWorld_process_OutputVariable.payload/ns1:result</from>
        <to>$outputVariable.payload/client:result</to>
      </copy>
    </assign>
    <!-- Generate reply to synchronous request -->
    <reply name="replyOutput" partnerLink="callhelloworld_client" portType="client:CallHelloWorld" operation="process" variable="outputVariable"/>
  </sequence>
</process>


The ora namespace was not used in this process but it is specified in the process tag.

lxml

I started with lxml based on the following; http://lxml.de/api/lxml.etree-module.html#cleanup_namespaces, since it has a function to easily clean up unused namespaces and there are many posts on lxml on StackOverflow.com. To install lxml on Windows, I first had to download an installer from http://www.lfd.uci.edu/~gohlke/pythonlibs/#lxml

cleanup_namespaces

My first try was the following;

import lxml.etree as et
filename_in='D:\\dev\\HelloWorld\\CallHelloWorld\\CallHelloWorld.bpel'
filename_out='D:\\dev\\HelloWorld\\CallHelloWorld\\CallHelloWorld.bpel.out'
tree = et.parse(filename_in)
et.cleanup_namespaces(tree)
tree.write(filename_out)

In the output I noticed that although my process definition had been cleaned up from

<process name="CallHelloWorld"
               targetNamespace="http://xmlns.oracle.com/HelloWorld/CallHelloWorld/CallHelloWorld"
               xmlns="http://docs.oasis-open.org/wsbpel/2.0/process/executable"
               xmlns:client="http://xmlns.oracle.com/HelloWorld/CallHelloWorld/CallHelloWorld"
               xmlns:ora="http://schemas.oracle.com/xpath/extension"
               xmlns:bpelx="http://schemas.oracle.com/bpel/extension"
         xmlns:bpel="http://docs.oasis-open.org/wsbpel/2.0/process/executable"
         xmlns:ns1="http://xmlns.oracle.com/HelloWorld/HelloWorld/HelloWorld">

To

<process xmlns="http://docs.oasis-open.org/wsbpel/2.0/process/executable" xmlns:bpelx="http://schemas.oracle.com/bpel/extension" name="CallHelloWorld" targetNamespace="http://xmlns.oracle.com/HelloWorld/CallHelloWorld/CallHelloWorld">

Several namespaces which were in use had also been removed, such as the ns1 namespace, which was used in a partnerlink definition as part of an attribute value;

<partnerLink name="HelloWorld" partnerLinkType="ns1:HelloWorld" partnerRole="HelloWorldProvider"/>

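My conclusion was that using prebuilt functions like the above would most likely not help solve this problem, since they only look at the namespaces of element and attribute names and not at prefixes used inside attribute values or text. I did not find a way to limit the functionality to specific namespaces.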
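More recent lxml releases (3.5 and later, as far as I can tell) do accept a keep_ns_prefixes argument on cleanup_namespaces, which protects the listed prefixes from removal. The sketch below assumes such a version is installed and uses a trimmed-down, inline version of the process element. You still have to know up front which prefixes are used in attribute values, so it does not replace determining the used namespaces yourself.

```python
import lxml.etree as et

# Trimmed example: ns1 is only used inside an attribute value, ora not at all.
BPEL = b"""<process xmlns="http://docs.oasis-open.org/wsbpel/2.0/process/executable"
    xmlns:ora="http://schemas.oracle.com/xpath/extension"
    xmlns:ns1="http://xmlns.oracle.com/HelloWorld/HelloWorld/HelloWorld">
  <partnerLinks>
    <partnerLink name="HelloWorld" partnerLinkType="ns1:HelloWorld"/>
  </partnerLinks>
</process>"""

root = et.fromstring(BPEL)
# Without keep_ns_prefixes, ns1 would be removed together with ora.
et.cleanup_namespaces(root, keep_ns_prefixes=["ns1"])
remaining = sorted(prefix for prefix in root.nsmap if prefix is not None)
print(remaining)
```

With this call the ns1 declaration should survive while the unused ora declaration is dropped.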

Determining used namespaces

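I tried a different approach; determine the namespaces declared on the root element, look for their usage in different locations in the BPEL file (element names, element text, attribute names and attribute values) and then rewrite the root element with only the namespaces which are in use.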

import lxml.etree as et
import copy
filename_in='D:\\dev\\HelloWorld\\CallHelloWorld\\CallHelloWorld.bpel'
filename_out='D:\\dev\\HelloWorld\\CallHelloWorld\\CallHelloWorld.bpel.out'
tree = et.parse(filename_in)
root=tree.getroot()
nsmap=root.nsmap
nsmapnew= copy.deepcopy(nsmap)

print ("Namespaces found: "+str(len(nsmap)))
for nsitem in nsmap:
    found=0
    if nsitem is not None:
        nsuri=nsmap.get(nsitem)
        print ("Processing prefix: "+nsitem+" Namespace: "+nsuri)
        #processing all elements; iter() replaces the deprecated getiterator()
        for elt in tree.iter():
            #check element; namespace-uri(.) is the namespace of the current
            #element, namespace-uri(/*) would always be that of the root element
            if elt.xpath('namespace-uri(.)')==nsuri:
                found=1
                #print("Found namespace as element namespace")
                break
            if str(elt.text).find(nsitem+":") != -1:
                found=1
                #print("Found prefix as part of element text")
                break
            #check attributes
            for attribute in elt.attrib:
                if attribute.startswith("{"+nsuri+"}"):
                    found=1
                    #print ("Found namespace as attribute name namespace")
                    break
                if str(elt.attrib[attribute]).find(nsitem+":") != -1:
                    found=1
                    #print ("Found prefix as part of attribute value")
                    break
            #the breaks above only leave the attribute loop; stop the element loop too
            if found==1:
                break
    else:
        #default namespace not removing
        found=1
    if found==0:
        print("Not found")
        del nsmapnew[nsitem]
print ("Namespaces remaining: "+str(len(nsmapnew)))

new_root = et.Element(root.tag, attrib=root.attrib,nsmap=nsmapnew)
new_root[:] = root[:]

#to add the top level comment (getprevious() returns None if there is none)
firstcomment=root.getprevious()
if firstcomment is not None:
    new_root.addprevious(firstcomment)

tree = et.ElementTree(new_root)

tree.write(filename_out, xml_declaration=True, encoding='utf-8',pretty_print=True) 


This rewrote my XML like I wanted it to. Even if namespaces were only used in XPath expressions in the BPEL file, they were recognized as being in use. One drawback however was that a namespace prefix was added to all the subelements even though these elements were in the default namespace. I considered using a transformation to fix this. This would however remove the comments from the file, so I decided not to. The script output was as follows;

Namespaces found: 6
Processing prefix: ns1 Namespace: http://xmlns.oracle.com/HelloWorld/HelloWorld/HelloWorld
Processing prefix: ora Namespace: http://schemas.oracle.com/xpath/extension
Not found
Processing prefix: client Namespace: http://xmlns.oracle.com/HelloWorld/CallHelloWorld/CallHelloWorld
Processing prefix: bpel Namespace: http://docs.oasis-open.org/wsbpel/2.0/process/executable
Processing prefix: bpelx Namespace: http://schemas.oracle.com/bpel/extension
Namespaces remaining: 5


The created output file was as follows;

<?xml version='1.0' encoding='UTF-8'?>
<!--
////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
  Oracle JDeveloper BPEL Designer
 
  Created: Thu Apr 11 12:23:10 CEST 2013
  Author:  Maarten
  Type: BPEL 2.0 Process
  Purpose: Synchronous BPEL Process
////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
-->
<bpel:process xmlns:ns1="http://xmlns.oracle.com/HelloWorld/HelloWorld/HelloWorld" xmlns:client="http://xmlns.oracle.com/HelloWorld/CallHelloWorld/CallHelloWorld" xmlns:bpel="http://docs.oasis-open.org/wsbpel/2.0/process/executable" xmlns:bpelx="http://schemas.oracle.com/bpel/extension" xmlns="http://docs.oasis-open.org/wsbpel/2.0/process/executable" name="CallHelloWorld" targetNamespace="http://xmlns.oracle.com/HelloWorld/CallHelloWorld/CallHelloWorld"><bpel:import namespace="http://xmlns.oracle.com/HelloWorld/CallHelloWorld/CallHelloWorld" location="CallHelloWorld.wsdl" importType="http://schemas.xmlsoap.org/wsdl/"/>
    <!--
      ////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
        PARTNERLINKS                                                     
        List of services participating in this BPEL process              
      ////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
    -->
  <bpel:partnerLinks>
    <!--
      The 'client' role represents the requester of this service. It is
      used for callback. The location and correlation information associated
      with the client role are automatically set using WS-Addressing.
    -->
    <bpel:partnerLink name="callhelloworld_client" partnerLinkType="client:CallHelloWorld" myRole="CallHelloWorldProvider"/>
    <bpel:partnerLink name="HelloWorld" partnerLinkType="ns1:HelloWorld" partnerRole="HelloWorldProvider"/>
  </bpel:partnerLinks>

  <!--
    ////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
      VARIABLES                                                       
      List of messages and XML documents used within this BPEL process
    ////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
  -->
  <bpel:variables>
    <!-- Reference to the message passed as input during initiation -->
    <bpel:variable name="inputVariable" messageType="client:CallHelloWorldRequestMessage"/>

    <!-- Reference to the message that will be returned to the requester-->
    <bpel:variable name="outputVariable" messageType="client:CallHelloWorldResponseMessage"/>
    <bpel:variable name="InvokeHelloWorld_process_InputVariable" messageType="ns1:HelloWorldRequestMessage"/>
    <bpel:variable name="InvokeHelloWorld_process_OutputVariable" messageType="ns1:HelloWorldResponseMessage"/>
  </bpel:variables>

  <!--
    ////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
     ORCHESTRATION LOGIC                                              
     Set of activities coordinating the flow of messages across the   
     services integrated within this business process                 
    ////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
  -->
  <bpel:sequence name="main">

    <!-- Receive input from requestor. (Note: This maps to operation defined in CallHelloWorld.wsdl) -->
    <bpel:receive name="receiveInput" partnerLink="callhelloworld_client" portType="client:CallHelloWorld" operation="process" variable="inputVariable" createInstance="yes"/>
    <bpel:assign name="Assign1">
      <bpel:copy>
        <bpel:from>$inputVariable.payload/client:input</bpel:from>
        <bpel:to>$InvokeHelloWorld_process_InputVariable.payload/ns1:input</bpel:to>
      </bpel:copy>
    </bpel:assign>
    <bpel:invoke name="InvokeHelloWorld" partnerLink="HelloWorld" portType="ns1:HelloWorld" operation="process" inputVariable="InvokeHelloWorld_process_InputVariable" outputVariable="InvokeHelloWorld_process_OutputVariable" bpelx:invokeAsDetail="no"/>
    <bpel:assign name="Assign2">
      <bpel:copy>
        <bpel:from>$InvokeHelloWorld_process_OutputVariable.payload/ns1:result</bpel:from>
        <bpel:to>$outputVariable.payload/client:result</bpel:to>
      </bpel:copy>
    </bpel:assign>
    <!-- Generate reply to synchronous request -->
    <bpel:reply name="replyOutput" partnerLink="callhelloworld_client" portType="client:CallHelloWorld" operation="process" variable="outputVariable"/>
  </bpel:sequence>
</bpel:process>


The ora namespace declaration had been removed from the process element. The file had also become smaller. The ora namespace is however a default Oracle namespace, so to make sure I didn't break anything, I tried to compile and deploy the altered process. This was successful. I also tested the script with more complex processes. The resulting BPEL files were still fully functional.
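For comparison, the detection step (not the rewriting of the root element) can also be sketched with only the Python standard library. This is a different technique from the lxml script above: iterparse from xml.etree.ElementTree reports every namespace declaration as a start-ns event, so declarations on nested elements are seen as well. The function name unused_prefixes and the urn:example URIs are made up for illustration.

```python
import io
import re
import xml.etree.ElementTree as ET

def unused_prefixes(xml_bytes):
    """Return the declared prefixes which are not used anywhere in the document."""
    declared = {}      # prefix -> namespace URI ('' is the default namespace)
    used_uris = set()  # namespaces of element and attribute names
    strings = []       # attribute values and element text (may contain QNames)
    events = ET.iterparse(io.BytesIO(xml_bytes), events=("start-ns", "end"))
    for event, data in events:
        if event == "start-ns":
            prefix, uri = data
            declared[prefix] = uri
            continue
        # Qualified names are reported as '{uri}localname'.
        for name in [data.tag, *data.attrib]:
            if name.startswith("{"):
                used_uris.add(name[1:name.index("}")])
        strings.extend(data.attrib.values())
        strings.append(data.text or "")
    text = "\n".join(strings)
    unused = []
    for prefix, uri in declared.items():
        if not prefix:
            continue  # never report the default namespace
        in_values = re.search(r"(?<![\w.-])" + re.escape(prefix) + ":", text)
        if uri not in used_uris and in_values is None:
            unused.append(prefix)
    return sorted(unused)

SAMPLE = b"""<process xmlns="urn:example:bpel" xmlns:ora="urn:example:ora"
    xmlns:bpelx="urn:example:bpelx" xmlns:ns1="urn:example:hello">
  <partnerLink partnerLinkType="ns1:HelloWorld" bpelx:invokeAsDetail="no"/>
</process>"""
print(unused_prefixes(SAMPLE))  # ['ora']
```

Like the lxml script, this errs on the side of keeping a namespace: any value containing the prefix followed by a colon counts as usage.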

Possible follow-ups could be;

- expand the script to allow recursive processing of multiple files and file types
- determine used and unused namespaces for every element, not just the root element
- link the unused namespaces within a project to XSDs, which could then also be removed from the project if they are not used by any file in it
- exclude specific namespaces from cleaning, for example if problems start to occur with XPath expressions after removing default Oracle namespaces
- remove the namespace prefix for the default namespace
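As a starting point for the first follow-up, collecting the files to process could look like the sketch below. The function name find_xml_sources and the default list of extensions are assumptions; the demo creates a temporary directory so it does not depend on an existing project.

```python
import tempfile
from pathlib import Path

def find_xml_sources(project_dir, suffixes=(".bpel", ".wsdl", ".xsd")):
    """Return all files below project_dir with one of the given extensions."""
    return sorted(path for path in Path(project_dir).rglob("*")
                  if path.is_file() and path.suffix.lower() in suffixes)

# Demo on a throw-away directory tree.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "xsd").mkdir()
    for name in ("CallHelloWorld.bpel", "xsd/HelloWorld.xsd", "xsd/readme.txt"):
        (Path(tmp) / name).write_text("")
    found = [path.name for path in find_xml_sources(tmp)]
print(found)  # ['CallHelloWorld.bpel', 'HelloWorld.xsd']
```

Each path returned could then be fed to the cleaning script instead of the hard-coded filename_in.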