My NextGen BPMN Validator

In a recent post (Improving BPMN Validation) I described the kinds of errors that my existing BPMN validator (including the itp commerce implementation) does not check, and that my latest development efforts address. This post outlines the approach.

My tool is based on XSLT 2.0. It's not a language used by many developers, but it's not a bad choice if you are trying to generate a report (HTML or XML) from an XML source document. So if your BPMN tool can export BPMN 2.0 XML - or, in fact, any XML that is mappable to BPMN 2.0 XML - XSLT is a good language for model validation. Every BPMN tool I know about that supports validation does it via code that references its own internal object model. The good thing about that is validation is one-click inside the BPMN editor. You can't beat that. The problem with it, for me, is that the validation code is tool-dependent.

I'm not a BPMN tool vendor, and to be honest, I'm not sorry about that. That business is commoditizing rapidly, and there are some decent Visio-based, browser-based, and even native iPad/Android BPMN tools coming out that are either free or almost so. And Visio 2013 is just a few months away. So people are going to be using a whole raft of new BPMN tools going forward; I think the market for enterprise BPMN tools and, more importantly, enterprise BPA suites is really up for grabs right now.

Bottom line, the next-generation validator must be BPMN tool-independent.

But then, how to get that one-click experience? This will vary from one BPMN tool to the next, but my approach is to make the validator a web service that takes model XML as input. You just need a macro inside the tool that can save the model as XML and call the web service. OK, it will take a few seconds instead of milliseconds to get the report, but it's one click inside the BPMN editor.
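From the tool side, the "macro" amounts to very little code. Here is a minimal sketch in Python (used purely for illustration; a real macro would be written in whatever scripting language the BPMN tool supports). The endpoint URL, the plain HTTP POST of the XML, and the function names are all assumptions of mine, not the actual interface of the service.

```python
import urllib.request

# Hypothetical endpoint, for illustration only.
VALIDATOR_URL = "https://validator.example.com/validate"


def build_validation_request(model_xml: str, endpoint: str = VALIDATOR_URL) -> urllib.request.Request:
    """Wrap the exported model XML in an HTTP POST request (assumed protocol)."""
    return urllib.request.Request(
        endpoint,
        data=model_xml.encode("utf-8"),
        headers={"Content-Type": "application/xml; charset=utf-8"},
        method="POST",
    )


def validate(model_xml: str, endpoint: str = VALIDATOR_URL, timeout: float = 30.0) -> str:
    """Send the model to the validator and return the report body as text."""
    request = build_validation_request(model_xml, endpoint)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

The macro would save the current model to XML, pass that string to something like `validate`, and display whatever report comes back.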

I'm using Saxon commercial edition (schema-aware) as the web-hosted XSLT engine, and a WebRatio-generated template for the web application and associated web service. The WebRatio folks have been super-helpful. I could not have done this without their help.

The only thing left is the XSLT. It requires a three-step transformation (four if the input is not BPMN 2.0 XML, but something like the .vdx from Visio Premium):

  1. Conversion to BPMN 2.0 XML, if necessary.
  2. For each process level, map BPMN 2.0 XML to a canonical object model in XML. All flow elements are mapped to either a start, activity, split, merge, or end node in the canonical model. All canonical elements except split and end have a single nextNode attribute. A split has 2 or more child gate elements, each with a single nextNode. A merge has 2 or more child mergeIn elements, each of which is the nextNode of exactly one canonical node. There is a single canonical start node. A number of BPMN patterns, including boundary events, multiple start events, multiple outgoing sequence flows, and multiple incoming sequence flows, require generation of virtual splits and merges in the canonical model.
  3. Trace through the canonical model, beginning from the start node, and propagate a token to the nextNode. This is analogous to the token used in Petri net analysis, but not the same thing. A split generates a child token, and a merge either propagates all the tokens on its inputs or consumes some of them. In XSLT this requires recursion, with each iteration processing a single node of the canonical model. A merge is not processed until there are tokens on all its inputs. If no nodes are ready to process but some merges are still unprocessed, those merges are loopbacks. This is the trickiest part. I handle them in a special way, and continue recursion until all nodes have been processed. When a loopback node has tokens on all its inputs, you can tell whether the loopback topology is legal or an error. At the end of this stage, each node in the annotated canonical model contains references to one or more tokens in the token tree.
  4. Analyze the annotated canonical model to identify:
    1. For each merge, what is the correct gateway type (AND, OR, XOR), and does it match the gateway type (or lack of merge gateway) in the original BPMN?
    2. Are there any end events that should be Terminate? And if you change this end event to Terminate, how does that change the correct merge type above? This last bit is one of the trickiest parts, and I have not finished it yet.
    3. Are there any pairs of end events that are not distinct end states, meaning a process instance could reach both of them? This is a warning, not an error. Method and Style says each end event in a process level should represent a distinct end state. Sometimes you cannot avoid having indistinct end states, as in the case of non-interrupting boundary events. But in general, it's best to avoid them if you can.
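
To make the canonical model and the trace step more concrete, here is a rough sketch in Python rather than XSLT, using a loop instead of recursion. It is not the actual implementation. The `Node` class and `trace` function are hypothetical names, the gate and mergeIn child elements are flattened into a plain list of successors, and the token tree is left out entirely: the sketch only tracks which merges have a token on every input. The loopback handling is my own simplification (force the first stalled merge that has at least one token, then carry on), standing in for the special handling mentioned in step 3.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Node:
    id: str
    kind: str  # "start", "activity", "split", "merge", or "end"
    # One successor for start/activity/merge, 2+ for split (its gates), none for end.
    next_nodes: list = field(default_factory=list)


def trace(nodes):
    """Return (processing order, ids of merges treated as loopbacks)."""
    by_id = {n.id: n for n in nodes}
    # For each merge, the set of nodes that feed it (its mergeIn sources).
    merge_inputs = {n.id: set() for n in nodes if n.kind == "merge"}
    for n in nodes:
        for target in n.next_nodes:
            if target in merge_inputs:
                merge_inputs[target].add(n.id)

    arrived = {m: set() for m in merge_inputs}  # inputs that hold a token so far
    processed, order, loopbacks = set(), [], []
    ready = deque(n.id for n in nodes if n.kind == "start")

    while True:
        while ready:
            nid = ready.popleft()
            if nid in processed:
                continue
            processed.add(nid)
            order.append(nid)
            for target in by_id[nid].next_nodes:
                if target in merge_inputs:
                    arrived[target].add(nid)
                    if arrived[target] != merge_inputs[target]:
                        continue  # merge still waiting on another input
                ready.append(target)
        # Nothing is ready. A merge holding some but not all of its tokens is
        # waiting on a path that loops back to it (simplifying assumption).
        stalled = [m for m in merge_inputs if m not in processed and arrived[m]]
        if not stalled:
            return order, loopbacks
        loopbacks.append(stalled[0])
        ready.append(stalled[0])


# start -> m1 -> a -> s, where s goes either to end e or to b, which loops back to m1
loop_model = [
    Node("start", "start", ["m1"]),
    Node("m1", "merge", ["a"]),
    Node("a", "activity", ["s"]),
    Node("s", "split", ["e", "b"]),
    Node("b", "activity", ["m1"]),
    Node("e", "end"),
]
order, loopbacks = trace(loop_model)
# order is ["start", "m1", "a", "s", "e", "b"]; loopbacks is ["m1"]
```

In this example the merge `m1` can never collect both tokens before it runs, because one of its inputs lies downstream of it, so the trace stalls after `start` and flags `m1` as a loopback. Deciding whether that loopback is legal, and what gateway type each merge should have, needs the token tree that this sketch omits.
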
The auto-correction feature I mentioned in my previous post can't be done by an external web service such as this. It will require partnering with the BPMN tool vendor on a native implementation. I will let you know when the web service is available to try out.