BPEL4People Revisited

Recently I began to hear references to BPEL4People again, so I went back and re-read the main white paper. It's nowhere close to a spec, but it provides a description of a workflow "model" followed by an outline of how that model, in five different use cases, will be implemented in the ultimate BPEL4People specification. The first part basically recites the core concepts of traditional workflow software -- activities and tasks, roles and role resolution, deadlines and escalation, worklists and task claiming -- familiar to any bank, insurance company, utility, or government agency that has implemented workflow since the late 1980s.

The second part asserts, but not all that convincingly, that grafting this functionality onto BPEL 2.0 really demands addition of a new activity type to the language along with its associated runtime functionality. Some vendors consider the recitation of workflow functionality in the first part of the white paper to constitute the BPEL-standard "workflow model" and implementation of that model, based on BPEL 2.0, to constitute "adoption" of BPEL4People. But even though BPEL4People is still just a white paper, it will eventually be submitted to OASIS as an actual specification that goes beyond what is possible in "straight" BPEL 2.0.

The reason why BPEL4People is an extension to the standard and not just a special BPEL 2.0 use case is that it calls for a new BPEL activity type -- a People activity -- to be used instead of the normal BPEL Invoke to call human tasks. Adding a new activity type is a big deal, since straight BPEL 2.0 engines won't be able to implement it. Besides, virtually all current BPEL-based BPMSs handle human workflow without a special activity type. So I started pestering the authors at IBM and SAP and eventually was able to get a better understanding of what BPEL4People is really about and why it is needed.

Before describing BPEL4People, here is what I call the "typical" way I believe most BPEL vendors do human workflow today, using standard Invoke. BPEL Invoke is a web service call addressed to a service endpoint, or URL. If the service is long-running, like a human task would be, the calling process waits for a callback message in a Receive or Pick activity, reporting that the human task has completed (successfully or otherwise) and returning result data. But unlike normal services, which appear to the process as stateless "black boxes" described only by their WSDL interfaces, human tasks are complex state machines (e.g. ready, claimed, completed, failed) governed by performer assignment and escalation rules that are conceptually part of the "process logic" but have no home in BPEL.
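To make the "state machine" point concrete, here is a minimal Python sketch of a task lifecycle using the four states named above. The transition table is my own assumption (in particular the "release" edge from claimed back to ready); neither standard BPEL nor the white paper defines it, and the names `TaskState`, `ALLOWED`, and `advance` are purely illustrative.

```python
from enum import Enum


class TaskState(Enum):
    READY = "ready"
    CLAIMED = "claimed"
    COMPLETED = "completed"
    FAILED = "failed"


# Assumed transition table; hypothetical, not taken from any spec.
ALLOWED = {
    TaskState.READY: {TaskState.CLAIMED, TaskState.FAILED},
    TaskState.CLAIMED: {TaskState.READY, TaskState.COMPLETED, TaskState.FAILED},
    TaskState.COMPLETED: set(),  # terminal
    TaskState.FAILED: set(),     # terminal
}


def advance(current: TaskState, target: TaskState) -> TaskState:
    """Return the new state, or raise if the transition is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target
```

None of this logic is visible to the calling process, which sees only an outbound message and, eventually, a callback.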

For that reason, BPMSs today typically provide a task management service -- a single service endpoint and portType for all human tasks -- plus a web application, typically called a Worklist, through which participants view, claim, and perform their assigned tasks. The process Invoke creates a task in the task management service, which performs task assignment and state management based on calls from the Worklist, process-defined deadlines and escalation logic, and other Invoke parameters. When the task is complete (or failed), the task management service calls back the process with the results.
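The shape of that interaction can be sketched in a few lines of Python. This is a toy in-memory stand-in, not any vendor's actual API: the class name, the method names, and the callback signature are all assumptions of mine, and deadlines and escalation are left out.

```python
from typing import Callable, Dict, List, Optional


class TaskManagementService:
    """Toy in-memory stand-in for a vendor's task management service."""

    def __init__(self) -> None:
        self._tasks: Dict[int, dict] = {}
        self._next_id = 1

    def create_task(self, role: str, data: dict,
                    callback: Callable[[int, str, Optional[dict]], None]) -> int:
        """Process side: what the Invoke would trigger."""
        task_id = self._next_id
        self._next_id += 1
        self._tasks[task_id] = {"role": role, "data": data, "state": "ready",
                                "owner": None, "callback": callback}
        return task_id

    def worklist(self, role: str) -> List[int]:
        """Worklist side: unclaimed tasks assigned to a role."""
        return [tid for tid, t in self._tasks.items()
                if t["role"] == role and t["state"] == "ready"]

    def claim(self, task_id: int, user: str) -> None:
        task = self._tasks[task_id]
        if task["state"] != "ready":
            raise ValueError("task is not available to claim")
        task["state"], task["owner"] = "claimed", user

    def complete(self, task_id: int, user: str, result: dict) -> None:
        task = self._tasks[task_id]
        if task["state"] != "claimed" or task["owner"] != user:
            raise ValueError("only the claiming user can complete the task")
        task["state"] = "completed"
        # Stands in for the callback message the process awaits in Receive or Pick.
        task["callback"](task_id, "completed", result)
```

Note that there are two distinct interfaces here: `create_task` and the callback face the process, while `worklist`, `claim`, and `complete` face the Worklist application.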

The "obvious" path of least resistance for standardizing human workflow in BPEL, then, would seem to be standardizing the task management service, or at least its interfaces with both the BPEL process and the Worklist application. Because that would not require a new BPEL activity type, it would work from any BPEL engine. But that's not what BPEL4People does.

The BPEL4People white paper describes five distinct task definition scenarios and their corresponding interaction patterns between process and task [Diagram]. Four of the five are based on the proposed new People activity type. Not obvious from the white paper, but explained to me by the authors, is the assumption that the task interface in all five scenarios is specified by XML input and output parameters, but not necessarily by a WSDL portType. At our recent briefing the authors reiterated that all five are "equally important" and BPEL4People needs to accommodate all of them.

Another point emphasized by the authors in the briefing is that in addition to BPEL engine vendors, task vendors (including application software providers like SAP) hold an important stake in BPEL4People, since the spec is intended to promote interoperability between tasks and engines from different providers.

In Pattern 1, the task is specified inline within a People activity. In Pattern 2, the task is specified inline within the BPEL process, but reusable by multiple activities in that process. In Patterns 1 and 2, the task automatically shares process context and data. The white paper says that with Patterns 1 and 2, processes are fully portable between compliant BPEL4People engines, implying that the task implementation itself (including Worklist, state management, task UI) is a BPEL runtime function.

Context sharing here means that task assignment, for example, has access to names of individual performers of previous tasks in the process, allowing assignment either to the same individual (what the white paper calls "chained execution") or a different individual (what the white paper calls the "four-eyes principle"), depending on the circumstance. Taken for granted in traditional workflow, context sharing is more difficult in BPEL, where tasks are implemented as remote services. BPEL4People provides it in Patterns 1 and 2, but not in Patterns 4-5.
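As a rough illustration of why assignment needs process context, here is a small Python sketch of role resolution under the two policies. The function name, the policy strings, and the idea of passing in a single previous performer are my own simplifications, not anything defined in the white paper.

```python
from typing import List, Optional


def resolve_performers(candidates: List[str], previous: Optional[str],
                       policy: str) -> List[str]:
    """Narrow a role's candidate users using the previous task's performer."""
    if policy == "chained":
        # Chained execution: the same individual continues with the next task.
        return [previous] if previous in candidates else []
    if policy == "four_eyes":
        # Four-eyes principle: anyone in the role except the previous performer.
        return [user for user in candidates if user != previous]
    # No context-dependent policy: every member of the role is eligible.
    return list(candidates)
```

The point is the `previous` argument: without access to who performed the earlier task, neither policy can be evaluated.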

In Pattern 3, the task is defined externally to the process, and callable by multiple BPEL processes. This pattern allows an application vendor (like SAP) to expose its functionality as tasks that can be called from any number of process engines, including third-party. The interaction protocol between process and task is implementation-specific, allowing vendors to add their own state coordination, context-sharing, and other bells and whistles. Pattern 4 is like 3 except that the task interface is explicitly WSDL. That means things like process-task state coordination -- for example, letting the task terminate the process or vice versa -- need additional protocol extension. More on this later.

Pattern 5 is like 4, except that the task is callable from a standard Invoke rather than from the new People activity type. But even Pattern 5 is not really what I described above as the normal way BPMS vendors do it today, using a task management service to mediate the interaction between process and task. In BPEL4People Pattern 5, the process Invokes a WSDL specific to each task, not a common one for the task management service. A motivating use case for Pattern 5, as described by IBM, is the ability to replace a human task with an automated business rule having the same WSDL interface, with no change in the calling BPEL.
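The substitution idea is easy to picture in code. In the Python sketch below, a shared callable type stands in for the common WSDL interface; all names and the 1000 threshold are hypothetical, and the sketch ignores the fact that a real human task is long-running and answered by a callback rather than a return value.

```python
from typing import Callable, Dict

# Stands in for the shared WSDL interface: request document in, decision out.
ApprovalService = Callable[[Dict], Dict]


def rule_approval(request: Dict) -> Dict:
    """Automated business rule (the threshold is made up for illustration)."""
    return {"approved": request["amount"] <= 1000}


def make_human_approval(ask: Callable[[Dict], bool]) -> ApprovalService:
    """Wrap a human decision (simulated by `ask`) behind the same interface."""
    def service(request: Dict) -> Dict:
        return {"approved": ask(request)}
    return service


def run_process(request: Dict, approve: ApprovalService) -> str:
    """The 'calling BPEL': unchanged whichever implementation is bound."""
    decision = approve(request)
    return "ship" if decision["approved"] else "reject"
```

Here `run_process` never learns whether a person or a rule produced the decision, which is exactly the property IBM is after.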

Even in the inline examples, Patterns 1 and 2, the People activity is distinct from the task. The activity sends process data to the task input and receives process data from the task output. People activities assign their tasks to generic human roles, which are resolved at runtime to specific users and groups by a variety of methods, both automated and manual. In the BPEL, this assignment is expressed as peopleLinks, in analogy to partnerLinks that bind Invoke to a service endpoint. Activity parameters include priority, affecting task sorting in the Worklist, and deadlines, which can generate BPEL faults.

The task is the implementation of the activity. As described earlier, it has lifecycle states (ready, claimed, completed, failed), but the white paper does not say where or how this state is managed. To handle all five interaction patterns, however, the BPEL engine must be able both to perform task state management itself (Patterns 1-2) and allow it to be performed externally (Patterns 3-5). That seems complicated!

Task properties include priority, deadline, and user interface. Priority has the same purpose as the corresponding activity property, and in case of conflict the activity property wins. Deadline expiration in a task does not generate a fault, but triggers defined notification or escalation logic. User interface just specifies a default method, such as HTML Forms or XForms, for rendering task data, although the white paper suggests that the ability to retrieve and display documents is also anticipated. The white paper says little about task implementation in inline Patterns 1 and 2, but gives the general impression of a single web form or page rather than a full-fledged web application or screenflow. For Patterns 3-5, the task implementation is external and could be as rich as you like.
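A short Python sketch of how I read the priority and deadline rules: the activity's priority overrides the task's, an expired activity deadline surfaces as a fault to the process, and an expired task deadline only triggers escalation. The function names, the `DeadlineFault` class, and the default priority of 0 are assumptions for illustration only.

```python
from typing import Callable, Optional


class DeadlineFault(Exception):
    """Stands in for a BPEL fault raised to the process."""


def effective_priority(activity_priority: Optional[int],
                       task_priority: Optional[int],
                       default: int = 0) -> int:
    """The activity property wins when both are set."""
    if activity_priority is not None:
        return activity_priority
    if task_priority is not None:
        return task_priority
    return default


def on_deadline_expired(level: str, escalate: Callable[[], None]) -> None:
    """Activity deadlines fault the process; task deadlines only escalate."""
    if level == "activity":
        raise DeadlineFault("activity deadline expired")
    if level == "task":
        escalate()
        return
    raise ValueError(f"unknown deadline level: {level}")
```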

So how does all this differ from the "conventional" approach of, say, Oracle BPEL Process Manager, which uses a task manager service without the need for a new People activity?

With People activities, human tasks are "first-class" process components; in standard BPEL, they're not. As a matter of principle, I like this. Many of the process model parameters supplied by a business analyst using BPMN have no home in standard BPEL (except as comments); they would be part of the task manager logic, technically external to the process. People activities give them a place in the BPEL itself. That's philosophically satisfying, but is it worth adding a new activity type just for that?

Inline tasks (Patterns 1 and 2) share process context and data "for free." For form-based tasks, this probably leads to simpler design, although tools like Oracle (and probably Intalio) provide wizards that shield the designer from the underlying artifacts of the conventional approach. As for the claim that Pattern 1 and 2-based processes will be completely portable, I'm skeptical. The implementation of task management within the process engine, not to mention all the vendor-specific features and programming constructs in regular BPEL, makes true portability unlikely.

With BPEL4People, processes and tasks can coordinate their state, so that one can terminate the other, or at least request termination. Conventional approaches like Oracle don't do this. The white paper mentions this requirement explicitly in the context of Pattern 4, but it seems equally important in all five Patterns. Possibly you get it for free in Patterns 1 and 2 (but is it required?), and the process-task protocol is explicitly "implementation-specific" in 3 (so that's a punt).

To me, this issue is not specific to human tasks, but a general problem with BPEL: BPEL does not have a concept of subprocesses. To standard BPEL, a called subprocess is just like any other invoked service. Any context sharing or state coordination protocol has to be explicitly programmed in both the calling and called process. You certainly don't get it for free by declaring process (or task) B to be a child of process A. But the BPEL4People white paper is silent on what that protocol is. Interestingly, the authors of the white paper submitted another BPEL extension proposal for subprocesses (BPEL-SPE) around the same time. While acknowledging extensive "symmetries" between the subprocess problem and the task state coordination problem, IBM and SAP are still not prepared to say whether BPEL4People will rely on BPEL-SPE or implement its own variant of it.

Pattern 5 is intended to allow a human task to be replaced with a business rule without changing the calling BPEL. When I wrote my report on WebSphere BPM Suite, IBM made a big deal of that capability, and now we see it in BPEL4People. To use it, you have to use standard Invoke, which means the vendor needs to provide all the role assignment, task deadlines, and other human workflow features even without a People activity! But beyond that, it seems this feature only works if the portType of the rule service and the task service are identical. IBM can probably make that happen with its own WebSphere rule engine, but will that work as well with ILOG, Blaze, or Corticon? Somehow I doubt it.

Will BPEL4People succeed? If the spec -- which we still may not see for some time -- insists on embracing all five interaction patterns to be considered "compliant," it probably limits the implementer list to a few well-funded organizations like, well, IBM and SAP and their partners. On the other hand, the vendors who care the most about human tasks are mostly BPEL-haters, and among the BPEL-lovers only IBM and SAP have stepped up to the plate with a proposal.

Obviously there is a need for some standardization of human interaction in BPEL. Standardizing the portType and basic features of the task management service (both its process and Worklist interfaces) seems to me a more practical approach, perhaps adding something along the lines of a lifecycle state coordination protocol.

What do you think about BPEL4People? The IBM and SAP people are still working under wraps, but I think they're ready to hear some critique from readers.