Event Handling in BPMN 2.0

This is a continuation of my review of the IBM et al. BPMN 2.0 submission to OMG.

The biggest change from BPMN 1.x in the orchestration notation is in the area of event handling. It's mostly positive, but still obviously a work in progress with some very rough edges. One major enhancement is my favorite wish list item, the non-aborting attached event. The proposal provides this for timer, message, conditional (rule), signal, and a new one, escalation, that is a variant of the error event, i.e. thrown from within the activity that carries the attached event. When the trigger is caught, the activity is not aborted but continues, and a parallel thread of the process is initiated on the exception flow.

Non-aborting attached events will be distinguished from the regular aborting kind through the notation, but this is apparently unsettled. In some parts of the spec the non-aborting ones are shown with the circles gray-filled; in other parts they are shown with a tiny multi-instance symbol beneath the trigger icon... crowded, to say the least! According to editor Steve White, the team is not happy with either one and will probably settle on something different. But you get the idea. The distinction between aborting and non-aborting is governed by the cancelActivity attribute of the event.
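To make the aborting vs non-aborting distinction concrete, here is a minimal Python sketch of what is still "alive" after an attached event fires. Only the cancelActivity attribute comes from the spec; the BoundaryEvent class, the fire function, and the token strings are my own illustrative names, not anything the spec defines.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BoundaryEvent:
    trigger: str           # e.g. "timer", "message", "escalation"
    cancel_activity: bool  # stands in for the spec's cancelActivity attribute


def fire(event: BoundaryEvent, activity: str) -> List[str]:
    """Return what is active right after the attached event is caught."""
    # The exception flow out of the event is always started.
    active = [f"exception flow from {event.trigger} event"]
    if not event.cancel_activity:
        # Non-aborting: the activity keeps running alongside the exception flow.
        active.insert(0, f"{activity} (still running)")
    return active
```

With cancel_activity set to True only the exception flow remains; with False the activity and the exception flow run side by side.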

Another major change, apparently related to making event and compensation handling in BPMN fit better with BPEL, is called an event handler subprocess. I am struggling to understand it, mostly because the examples in the spec intertwine issues of event handling - used all the time in BPMN - and transaction compensation, which is used almost never. The transaction piece has changed quite a bit from BPMN 1.x, and it would be much better if the event handling part of the spec stayed away from compensation.

The spec never clearly defines the term event handler, but it appears to mean a subprocess that handles an attached event, whether aborting or non-aborting. Much of the discussion about event handlers revolves around how the handler accesses activity data. As in the case of embedded subprocess vs call activity (see previous post), such execution details should be secondary in what is supposed to be a modeling notation, but I suspect BPEL considerations ruled in this case.

Event handlers are distinguished in two dimensions: "error handlers" (aborting events) vs "escalation handlers" (non-aborting), and handlers "on the boundary" (attached events) vs "inline" (a new notation). The terms error and escalation handler are terrible, because those handlers apply to other event types as well, and the escalation event comes in both aborting and non-aborting flavors. We need better terminology there. On-the-boundary vs inline has a lot to do with context data, i.e. programmer considerations. So yes, it's confusing.

An inline event handler is represented as an additional subprocess drawn within the expanded subprocess. It begins with a start event and ends in an end event, and is enclosed in a dashed-line box that looks like the BPMN Group artifact with a label on the border. Non-aborting vs aborting inline handlers are distinguished by their start events, subject to the same not-quite-decided issues as the attached intermediate event discussed earlier. But the spec is missing needed detail here. For example, it doesn't say that a task cannot have an inline event handler, although the notation would not seem to support one. Nor is it clear how the handler would be represented when the subprocess is expanded on another page of the BPD or in a separate BPD, what a collapsed rendition of the handler would look like, or even whether one is allowed. So this part needs a LOT of work.

An error event handler (i.e. any aborting event) can be represented by an inline handler, a boundary handler, or both in combination. If just an inline handler, there is no attached event drawn, only the handler subprocess drawn inside an expanded view of the activity. If the event occurs, normal processing of the activity is aborted and the activities in the inline handler are executed. When the end of the inline handler is reached, the process continues along the normal flow out of the activity. This is a big change from BPMN 1.x, but it is the way BPEL fault and event handlers work. A single error event can have both inline and boundary handlers. In that case, the inline handler is executed first, then the process continues on the exception flow out of the event. The spec is vague on whether the boundary event "handler" is actually a specific subprocess or just signifies "continues on the exception flow."
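The three combinations for an aborting event can be summarized as a small Python sketch. This is my reading of the behavior described above, not an algorithm given in the spec, and the function name and step strings are made up for illustration.

```python
from typing import List


def aborting_event_trace(has_inline: bool, has_boundary: bool) -> List[str]:
    """List the steps taken when an aborting event is caught on an activity."""
    if not (has_inline or has_boundary):
        raise ValueError("the event is not handled on this activity")
    steps = ["abort normal processing of the activity"]
    if has_inline:
        # The inline handler, if present, always runs first.
        steps.append("run inline handler")
    if has_boundary:
        # A boundary handler sends the process down the exception flow.
        steps.append("continue on exception flow")
    else:
        # Inline only: resume on the normal flow out of the activity.
        steps.append("continue on normal flow")
    return steps
```

The inline-only case is the one that differs from BPMN 1.x: the activity is aborted, yet the process still leaves it on the normal flow.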

If you've followed me so far, you can probably guess how escalation (non-aborting) handlers work. Inline handlers create a parallel thread within the context of the activity. Both the handler and the normal activity must complete before the process continues on the normal flow. The same questions about representing handlers in tasks, collapsed subprocesses, subprocesses expanded on other pages or BPDs, etc. apply here as well.
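The "both must complete" rule for an inline non-aborting handler amounts to a join inside the activity. A rough Python sketch, using threads to stand in for the parallel paths; the function name is hypothetical, and as a simplification the handler starts together with the activity body rather than when the event is actually caught.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def run_with_inline_escalation(body: Callable[[], str],
                               handler: Callable[[], str]) -> List[str]:
    """Run the activity body and its inline non-aborting handler in parallel."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        body_done = pool.submit(body)
        handler_done = pool.submit(handler)  # parallel thread inside the activity
        # Waiting on both results means the activity ends only after both finish.
        finished = [body_done.result(), handler_done.result()]
    return finished + ["continue on normal flow"]
```

Whichever path finishes first simply waits; the normal flow is not taken until the slower one is done.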

On-the-boundary escalation handlers initiate a parallel thread at the parent process level without aborting the activity with the attached event. If both inline and boundary handlers are defined, the inline runs first, then the boundary. With boundary escalation handlers, however, the spec runs off the rails a bit, since there is no defined "handler subprocess" that must complete before the normal activity can complete, just the sequence flow out of the event. The spec mumbles something about "taking care" about merging this sequence flow downstream; it recommends not doing so, and ending it in its own end event. That's just stupid. You should be able to merge it with an OR-join. Again, more work is needed on this part of the spec.
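What I mean by merging with an OR-join: the join waits for every incoming path that was actually activated, and no others. Here is a deliberately simplified Python sketch that assumes the engine already knows which paths were activated (in practice that is the hard part). The function and path names are mine, not the spec's.

```python
from typing import Set


def or_join_ready(activated: Set[str], arrived: Set[str]) -> bool:
    """True when every activated incoming path has reached the join."""
    # An empty 'activated' set means nothing is heading for the join yet.
    return bool(activated) and activated <= arrived
```

If the escalation never fires, only the normal flow is activated and the join passes as soon as it arrives. If it does fire, the join also waits for the escalation path.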

All in all, I like it. Going back to my New Wish List post, BPMN 2.0 supports all four wishes. Number one, the non-aborting attached event, is provided explicitly. Number two, the "activity started" event, is allowed by combining the escalation event with a trigger thrown implicitly by the "activity started" state change. The spec provides an explicit state model for process activities, and going from ready to started is part of that model. So wish number two is not explicitly part of the spec, but is indirectly supported. Wish number three, User Action, is another type of escalation event attached to a User task. Wish number four, Enabled to Finish, is not a new event type after all, but an inline escalation handler. The activity cannot finish until the escalation handler is complete.

So, rough edges and all, I like it a lot.