The Portability Dialogues

Assaf:

I want to correct a misconception. I don't want to visually model BPEL; I happen to like BPMN much better, for all the same reasons: higher level, simpler, more business-like. I just want to make sure BPMN has an execution path. And I learned the hard way that we can only do that by picking an execution language. I happen to prefer BPEL over XPDL.

I don't believe in the "either BPMN or BPEL" dichotomy (i.e. you can't do high level and get something to execute), but others apparently do -- or for product reasons need to -- and tried to turn my statement into a polarized viewpoint.

There are two things BPMN without BPEL can't do right. One is execution. If I can't say "order-total > 500" in the process, then having a quick approval path and one that requires extra checks doesn't matter, because without that rule the process can't decide which path to choose. It's stuck. And BPMN has no way of expressing these rules. What you can't express, you can't port...
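As a rough illustration of why the "order-total > 500" rule has to be machine-evaluable, here is a minimal Python sketch of the branch decision. The function name and the path labels are made up for this example; nothing here comes from the BPMN or BPEL specifications.

```python
def choose_approval_path(order_total: float, threshold: float = 500.0) -> str:
    """Pick a branch for an order (hypothetical names, illustration only)."""
    # Without an executable rule like this one, the engine has no way
    # to decide between the two paths drawn in the diagram.
    if order_total > threshold:
        return "extra-checks"
    return "quick-approval"


print(choose_approval_path(120.0))
print(choose_approval_path(750.0))
```

The diagram can show both paths, but only the expression tells the software which one to follow.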

Portability is a good goal. Now, imagine an execution engine that does compensations very well. You design a process to get quick orders in, and allow customers to cancel. But if your execution engine doesn't do compensation at all (or badly), you have two options. Write a lot of complex code to get around that. Or more likely, write a process that asks for second confirmation to make sure the cancellation rate is low...
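For readers unfamiliar with compensation, the idea can be sketched in a few lines of Python: each completed step registers an undo action, and a cancellation runs those actions in reverse order. The class and step names below are hypothetical, and a real engine would also have to deal with persistence, failures inside handlers, and scoping, none of which is shown.

```python
from typing import Callable, List


class CompensableProcess:
    """Runs steps and remembers how to undo each completed one."""

    def __init__(self) -> None:
        self._undo_stack: List[Callable[[], None]] = []
        self.log: List[str] = []

    def run(self, name: str, undo_name: str) -> None:
        # Perform the step, then register its compensation handler.
        self.log.append(name)
        self._undo_stack.append(lambda: self.log.append(undo_name))

    def compensate(self) -> None:
        # Undo completed steps in reverse order of completion.
        while self._undo_stack:
            self._undo_stack.pop()()


order = CompensableProcess()
order.run("reserve-stock", "release-stock")
order.run("charge-card", "refund-card")
order.compensate()  # the customer cancels
print(order.log)
```

If the engine provides this bookkeeping, the quick-order process is cheap to design; if it does not, this is the "complex code" you end up writing by hand.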

How well your business works, and what it does, depends a lot on how it does those things. Implementation details do matter, and do make the difference between Amazon and Borders, Wal Mart and K-Mart. (Guess which has a better IT system?)

Assaf again:

There's one view in the software world that says all dependencies are bad and we should abstract them away. There's another one that's willing to take some risks in order to get better results. I belong to the second one.

If you can build a great notation that's independent of the execution language, then you might as well pick one existing language as the serialization format. Because it wouldn't matter. Let's say we pick BPEL + visual notation extensions as the serialization language. We solved one problem. Since the notation is independent, you use that to serialize (store, retrieve, exchange) your process models, but you can easily deploy XPDL or any other language, because the executables are all equivalent.

Dependency only happens when those execution languages are not equivalent, and so you end up designing your process based on the capabilities of the execution. All of a sudden, what you designed for BPEL doesn't work the same way for XPDL. That's a dependency problem.

And I don't want to solve it...

Circuit City has three different sales processes. You can order on the Web items you'll never find in the store (the Web shelf space is infinite). You can order on the Web, where they have great Web discounts, and pick up at the store. Or you can buy at the store without a discount. The Web discount reflects the cost of the "implementation detail".

The abstraction camp would like to believe that every company can have a single order process that's properly abstracted from implementation details. The dependency camp would like to make it easier to define several processes that are somewhat married to implementation details, but also tuned to those details to get better results.

Bruce:

I may be talking out of my hat, but from the work I've done with BPEL I would not recommend it as the serialization format. My biggest beef is how it handles data. Defining all variables as WSDL message types, all that bpws:getVariableData() etc. Yuk. Plus the whole idea that BPEL doesn't update process variables, it just keeps copies of messages and endlessly maps them to each other... That's fine for BPEL but many non-BPEL languages don't work that way. So that's the part I want abstracted away.
Assaf:
You're right. BPEL 1.1 was designed to be an orchestration language, which is why it acts as a poor man's process language... It's also why Intalio could not ship a BPEL 1.1 engine to its customers: it just didn't solve the problems they had to deal with.

BPEL 2.0 is a different beast. Variables are now real variables: you can use elements, complex types, simple types. You can use variables to calculate, count (e.g. the forEach activity), maintain lists and collections. getVariableData() is gone, in favor of the XPath syntax (one of the first issues I opened), XSLT support is built into the spec, and XQuery is possible although not normalized in the spec due to release deadlines.

You can declare variables with initial values, configure those values at deployment (see latest SCA release). You can scope variables, validate their values against the schema. You can even send an element variable directly, instead of having to create a WSDL message to contain it.

It took a long time to get from 1.1 to 2.0, and there's a reason for that, and it shows in the spec.

Now, I'm the last to say BPEL 2.0 is perfect. It's not. But I can't imagine any vendor or standards group trying to recreate the same specification from scratch and being able to deliver something in a reasonable amount of time. Just dealing with data variables so they can meet real use cases without jumping through hoops is a huge undertaking.

BPEL 1.1 was the lowest common denominator, BPEL 2.0 is not, so abstracting means losing all sorts of interesting capabilities.

Bruce:

OK... but doesn't it still replicate the variable under a new name every time it's used? In BPMN, a variable (let's call it that even though I think they call it something else) like "Customer" should be used throughout the model, not mapped to a different Customer variable name and namespace with every activity that uses it. Does BPEL 2.0 do that? All that Assign mapping should be under the covers, not in the BPMN, unless it's "complex", in which case BPMN should support a way to define it. I think XPath 2.0 is OK (but is that really a "standard"?)
Assaf:
Variables are mutable in BPEL 2.0, so you can keep using the same variable and just change its value.

Most XPath 1.0 libraries have been upgraded to do XPath 2.0, it's not that hard, and products are being released with XPath 2.0 support, so it's just a matter of getting through the normal release/upgrade/install cycle before everyone has it. XQuery is the more interesting alternative, but much further down the road.

Once you get down to details but don't have assignments, how can you move a process definition from one tool to another?

It really makes a difference whether you're basing a decision on quantity or price, whether you're sending billing or shipping information to accounting. In fact, in some cases what you send is more interesting than who you send it to (you decide on the service to use based on the request you want fulfilled).

Bruce:

Of course BPMN needs a way to define gateway conditions, but better to do it "abstractly" than with namespaces and other trappings of BPEL. This inevitably leads back to the question of how much data modeling can/should be done by the "analyst" as opposed to "developer" (analyst being Ismael's definition of someone who doesn't know a while loop from a hula hoop)...
Assaf:
...If we have one modeling language with a clear path to execution, you can start with the high level definition during project planning, then incrementally extend it until eventually you get something that software can manage. And deploy that. And from that level of detail you can always extract a high level view of what's currently happening in the business. If you get data from your servers, you can understand how it relates to your processes.
If you decide to use two modeling languages, you need to break the flow and switch at some point, so you end up with two different models. And they get out of sync, they don't always have the same understanding, change is more complex, so change happens less often. We like to solve that problem with more agility. It's not about teaching people how to program but making sure there's good flow from your organizational decisions down to the flow of bits and bytes. As best as could be.
Bruce:
One more thing... You keep saying BPMN has no way to specify data expressions in things like gateway logic. I'm thinking that can't be true, so I'm now looking at the BPMN 1.1 spec dated 7/31/2005.

On p 83 re data-based exclusive gateways: "The set of Gates for Data-Based Exclusive Decisions is based on the boolean expression contained in the ConditionExpression attribute of the outgoing Sequence Flow of the Gateway.... The conditions for the alternative Gates should be evaluated in a specific order. The first one that evaluates as TRUE will determine the Sequence Flow that will be taken."

But then: "Note: BPMN does not specify the format of the expressions used in Gateways or any other BPMN element that uses expressions." So it would be fairer to say BPMN should specify some standard expression language such as JavaScript or perhaps XPath. I think insisting on BPEL expressions is unnecessary.

Assaf:

[BPMN conditions are] nice for people, we understand what "all orders over 50 dollars" means, but software doesn't. So those expressions don't buy you anything when you're asking software to manage them.

Notice order of evaluation, that's important for software, something you'll need in BPEL. So by doing that, BPMN is going forward some of the way, but not all of the way.
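The "first one that evaluates as TRUE" rule quoted from the BPMN spec can be sketched in Python as follows. This is only an illustration under assumed names (`take_first_true`, the gate labels, the `order_total` key); BPMN itself does not define how the conditions are written, which is the point under discussion.

```python
from typing import Callable, Dict, List, Optional, Tuple

Condition = Callable[[Dict[str, float]], bool]


def take_first_true(
    gates: List[Tuple[str, Condition]],
    data: Dict[str, float],
    default: Optional[str] = None,
) -> str:
    """Evaluate gates in order; the first TRUE condition wins."""
    for target, condition in gates:
        if condition(data):
            return target
    if default is None:
        raise ValueError("no gate condition evaluated to TRUE")
    return default


gates = [
    ("manager-approval", lambda d: d["order_total"] > 5000),
    ("extra-checks", lambda d: d["order_total"] > 500),
]
print(take_first_true(gates, {"order_total": 9000}, default="quick-approval"))
```

Swapping the two gates would send a 9000 order to "extra-checks" instead, which is why the evaluation order has to be part of what gets exchanged between tools.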

BPEL specified XPath. That was the easy part.

Then came months of work to fix a lot of issues that came up because people use different data formats, so there are a lot of issues around context nodes, variable mapping, etc. Not to mention, in a process that can execute different activities in parallel, when can you access the data? What happens if the data is not there? What if you're recovering from or compensating for failure?
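A small Python sketch shows the kind of question involved. It evaluates a path expression against a process variable and has to decide what happens when the variable was never set or the path selects nothing. The `UninitializedVariable` class, `read_number`, and the variable names are assumptions made for this example, and Python's `xml.etree.ElementTree` supports only a limited subset of XPath, so this is not how a BPEL engine actually works.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Optional


class UninitializedVariable(Exception):
    """Hypothetical fault raised when a variable has no value yet."""


def read_number(
    variables: Dict[str, Optional[ET.Element]], name: str, path: str
) -> float:
    """Evaluate a simple path against a named XML variable."""
    root = variables.get(name)
    if root is None:
        # The activity that fills this variable has not run (yet).
        raise UninitializedVariable(name)
    text = root.findtext(path)
    if text is None:
        raise LookupError(f"{path!r} selects nothing in ${name}")
    return float(text)


variables = {
    "order": ET.fromstring("<order><total>750</total></order>"),
    "invoice": None,  # declared but not yet assigned
}
print(read_number(variables, "order", "total") > 500)
```

Choosing an expression language settles the syntax of `total`; it does not by itself settle what the two error branches should do in a running process.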

Then through more months of work to make sure it also applies to XPath 2.0 and XQuery 1.0, so we can use those when the time comes.

So doing "BPEL" is actually doing XPath/XQuery and getting to benefit from all that work on resolving issues that arise when you use XPath in a process that may perform activities in parallel, could have repeating activities, synchronization barriers, failure conditions, etc.

Doing just XPath means having to do additional research to figure out how XPath can be used in a process that may perform activities in parallel, could have repeating activities, ....

If BPMN can do that quickly and easily, I'm impressed, I'll admit to being wrong and take back my arguments.

But meanwhile, I wonder where the value is to justify re-inventing the wheel just so we can call it a "rolling circle", and are customers really asking to wait a few more years so the spec they receive does not leverage one that already exists today?