[This is a re-post of something I wrote yesterday on the SAP Business Process Expert megablog, in case you don’t follow that site.]

At the recent Gartner BPM Summit, I was shocked to see how high a pedestal the Gartner analysts have now given simulation analysis in their gallery of must-have BPM capabilities. Ever obedient, the BPMS and modeling tool vendors now universally throw it into the box. How else to get into that Magic Quadrant?

But have these analysts ever really used these tools, or even scrutinized them closely? I’m not really sure. I haven’t looked at all of them myself, but my sampling to date tells me this is a fake feature if ever there was one.

In my training, Process Modeling with BPMN, I wanted to add a big section on how to leverage the cool parts of that notation — events and exception handling, delays and batching patterns, etc. — to project process performance using simulation. I know simulation isn’t part of BPMN — the spec not only doesn’t standardize simulation parameters but doesn’t even have the barest concepts of resource pool modeling — but it is a nearly universal feature of modeling tools that use BPMN. So that was my starting point.

Since the simulation tutorials of the tools I looked at — where they existed at all — didn’t address anything close to real problems of process analysis, I had to make up my own methodology and then apply it to the notation. In the course of doing that, I concluded that most modeling tools were missing four basic ingredients needed to make simulation real, not fake:

1. the right simulation parameters
2. the right calculation of standard metrics
3. the raw instance data needed to create custom metrics and reports, and
4. a methodology for translating the kind of performance analysis that’s supposedly possible into the parameters and outputs provided.

Let’s go through this, and if you still think your simulation tool isn’t fake, chime in. I should mention that the tool I wound up using in the training, Process Modeler 4 from itp commerce — one I touted here a couple of weeks ago — not only came closest to what I needed, but the vendor was also willing to tweak the features in several places to try to get it right. You can’t ask for more than that.

Let’s start here: what kind of predictive performance analysis are you expecting from simulation? Two things came immediately to my mind.

The first is the kind of analysis most folks have in mind when they document their as-is processes and sit around looking at the diagrams discussing why it takes so long, costs so much, and is so error-prone. Classic BPM “theory” says that the problem is in the handoffs, typically between roles or departments (swimlanes in BPMN). The improved to-be model has fewer yo-yoing handoffs between swimlanes, less unnecessary rework, and fewer pointless reviews and approvals. Thus, simulation use case 1 is to quantify the improvement. How much faster, how much cheaper, and so on? In this kind of analysis, remember, the problem isn’t contention for resources — too few widget-tweakers at step 17 — but inefficiency in the structure of the process itself.
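
To make use case 1 concrete, here is a minimal Monte Carlo sketch (plain Python, not tied to any vendor's engine) that compares an as-is model carrying an extra review step and a 30% rework loop against a to-be model that drops the review and cuts rework to 5%. All the activity names, durations, hourly rates, and probabilities are invented for illustration; the point is that the question is structural, not a resource-contention question.

```python
import random

# A minimal Monte Carlo sketch of use case 1: quantify the structural
# improvement between an as-is and a to-be model. All activity names,
# durations (hours), hourly rates, and rework probabilities are invented
# for illustration -- plug in your own as-is / to-be estimates.

AS_IS = {
    "activities": [            # (name, duration_hours, hourly_rate)
        ("enter order",   0.5, 40),
        ("credit review", 1.0, 60),   # pointless approval in the as-is
        ("fulfill order", 2.0, 40),
    ],
    "rework_prob": 0.30,       # chance the order bounces back for rework
    "rework_hours": 1.5,
    "rework_rate": 40,
}

TO_BE = {
    "activities": [
        ("enter order",   0.5, 40),
        ("fulfill order", 2.0, 40),   # review step eliminated
    ],
    "rework_prob": 0.05,       # fewer handoffs, less rework
    "rework_hours": 1.5,
    "rework_rate": 40,
}

def simulate(model, n_instances=10_000):
    """Return mean cycle time (hours) and mean cost per instance."""
    total_time = total_cost = 0.0
    for _ in range(n_instances):
        t = sum(d for _, d, _ in model["activities"])
        c = sum(d * r for _, d, r in model["activities"])
        if random.random() < model["rework_prob"]:
            t += model["rework_hours"]
            c += model["rework_hours"] * model["rework_rate"]
        total_time += t
        total_cost += c
    return total_time / n_instances, total_cost / n_instances

for label, model in (("as-is", AS_IS), ("to-be", TO_BE)):
    t, c = simulate(model)
    print(f"{label}: mean cycle time {t:.2f} h, mean cost ${c:.2f}")
```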

The second is the kind of analysis that most BPMS vendors promote, particularly those focused on workflow automation. There it’s all about “bottlenecks” — too few widget-tweakers at the widget-tweaking step — possibly exacerbated by the fact that the same pool of widget-tweakers is also supposed to handle the widget-flipping task at step 19. We call this “contention for resources.” There the simulation is about creating various what-if scenarios with different numbers of widget-tweakers to see the bottlenecks disappear.

You don’t need to be a rocket scientist — you could even be a widget-tweaker — to know that if you configure enough widget-tweakers the bottleneck at step 17 will go away. So what? The issue is at what cost. So this analysis is about looking at the cost/time tradeoffs.
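
Here is what that what-if looks like as a minimal hand-rolled sketch, again in plain Python rather than any vendor's engine: a single widget-tweaking step fed by random arrivals, with the number of tweakers varied across scenarios. The arrival rate, work time, and hourly rate are made-up numbers; the output is the cost/time tradeoff, with queue wait falling as labor cost rises.

```python
import heapq
import random

# A minimal what-if sketch of use case 2: resource contention at one step.
# We vary the number of "widget-tweakers" at step 17 and watch queue wait
# shrink while labor cost grows. Arrival rate, work time, and hourly rate
# are invented numbers; a real tool would take these from the model.

def run_scenario(n_tweakers, n_jobs=20_000, arrivals_per_hour=10,
                 mean_work_hours=0.25, hourly_rate=45, seed=1):
    rng = random.Random(seed)
    # min-heap of times at which each tweaker next becomes free
    free_at = [0.0] * n_tweakers
    heapq.heapify(free_at)
    clock = 0.0
    total_wait = busy_hours = 0.0
    for _ in range(n_jobs):
        clock += rng.expovariate(arrivals_per_hour)   # next arrival
        earliest = heapq.heappop(free_at)             # first free tweaker
        start = max(clock, earliest)
        work = rng.expovariate(1.0 / mean_work_hours)
        heapq.heappush(free_at, start + work)
        total_wait += start - clock
        busy_hours += work
    span = max(max(free_at), clock)
    return {
        "mean_wait_min": 60 * total_wait / n_jobs,
        "utilization": busy_hours / (n_tweakers * span),
        "labor_cost_per_hour": n_tweakers * hourly_rate,
    }

for n in (3, 4, 5, 6):
    s = run_scenario(n)
    print(f"{n} tweakers: wait {s['mean_wait_min']:.1f} min, "
          f"utilization {s['utilization']:.0%}, "
          f"labor ${s['labor_cost_per_hour']}/h")
```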

There is a variation of this type of analysis called activity-based costing. Here the idea is that different classes of work running through the process incur different costs. Simulation analysis by itself only gives you the direct or variable costs of each type. Activity-based costing lets you distribute the fixed overhead and other indirect costs as well in a “fair” way. In the newer forms of the analysis, called time-driven ABC, you can get everything you need from the simulation output, so it’s really an extension of the previous resource utilization use case.
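
For the time-driven flavor, the arithmetic is simple enough to show in a few lines. This back-of-the-envelope sketch (all figures invented) spreads a resource pool's fully loaded cost over its practical capacity to get a cost per minute, then charges each work type for the minutes the simulation says it consumed:

```python
# A back-of-the-envelope sketch of time-driven ABC on top of simulation
# output. The capacity cost rate spreads the resource pool's total cost
# (salaries, overhead, tools) over its practical capacity; each work type
# is then charged for the minutes it actually consumed in the simulation.
# All figures here are invented for illustration.

pool_cost_per_month = 120_000          # fully loaded cost of the pool
practical_capacity_min = 80_000        # workable minutes per month
rate = pool_cost_per_month / practical_capacity_min   # $ per minute

# minutes per instance by work type, as reported by the simulation run
minutes_by_type = {"standard order": 22, "rush order": 35, "return": 50}
volumes = {"standard order": 1800, "rush order": 400, "return": 150}

for work_type, minutes in minutes_by_type.items():
    print(f"{work_type}: {minutes} min -> ${minutes * rate:.2f} per instance")

# TDABC also makes the cost of idle capacity visible
consumed = sum(minutes_by_type[t] * volumes[t] for t in volumes)
print(f"cost of unused capacity: ${(practical_capacity_min - consumed) * rate:,.2f}")
```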

So both of these simulation use cases (or all three, if you count ABC separately) are important. But they’re different. The way you have to set up the simulation parameters is different, the standard outputs are different, and the custom reports you need are different as well. I worked through all three of these in my BPMN training, but my post here is about the simulation tools, not the training.

In the interests of time, here are the features I discovered I wanted in order to do all three. We can get into why later, if there is interest.

1. Activity duration parameters. One parameter is not enough. You need at least two: the active time (consumes resources) and the wait time (does not). A small sketch of how these show up, along with items 2 and 10, follows this list.
2. Event parameters. If your diagram has BPMN intermediate events, you need to provide parameters describing both the probability that the event occurs and the time it occurs. Ditto for event-based gateways. And if your tool doesn’t support these events and event gateways, I’d say it’s useless anyway.
3. Cost-per-use parameters for certain activities, in addition to the normal cost-per-hour.
4. Shift calendars for all resources, human and machine.
5. Priority and priority aging logic for resource contention. Which activity gets the resource instance? Which process instance gets the resource? This doesn’t affect the mean as much as the distribution of costs and times.
6. Contention for resources across multiple processes. (My tool didn’t have this.)
7. Breakdown of metrics at process, activity, and resource level.
8. Histograms of cost and time. Means are almost useless. It’s the outliers you need to worry about.
9. Resource utilization rate calculation.
10. Raw simulation data at the activity-instance level, ideally dumped into a tool like Excel that you can manipulate easily. Creating Excel named ranges? Even better! You can’t really do simulation analysis without this data. The standard reports are never exactly what you want.
11. Some rudimentary form of instance-level properties used by the simulation engine to identify types of work, and the ability to define simulation parameters as expressions of these properties. My tool doesn’t have this. It’s a slippery slope from instance properties to a full data model, and a shift from simulation to emulation and debugging. We don’t want to go down that slope. Well, maybe just a little. To do anything with quality metrics (not just cost and time), I think you need instance properties.
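
To give a feel for items 1, 2, and 10 together, here is a minimal sketch of the raw activity-instance log I'm talking about: each activity carries separate active and wait times plus a probability that an intermediate escalation event fires, and every activity instance lands as one row you can pivot in Excel. The activity names, times, and probabilities are all invented for illustration.

```python
import csv
import random

# A minimal sketch of the raw activity-instance log (item 10), driven by
# per-activity parameters that separate active time from wait time (item 1)
# and by an intermediate-event probability (item 2). The activity names,
# times, and probabilities are invented; the point is the shape of the
# output -- one row per activity instance that you can pivot in Excel.

ACTIVITIES = [
    # (name, active_hours, wait_hours, escalation_prob, escalation_delay_hours)
    ("review claim",   0.5, 2.0, 0.00, 0.0),
    ("assess damage",  1.5, 4.0, 0.20, 8.0),   # 20% chance a timer event fires
    ("approve payout", 0.3, 1.0, 0.00, 0.0),
]

rng = random.Random(7)
rows = []
for instance_id in range(1, 501):
    clock = 0.0
    for name, active, wait, p_event, delay in ACTIVITIES:
        start = clock
        duration = wait + active
        event_fired = rng.random() < p_event
        if event_fired:
            duration += delay                  # escalation timer path
        clock = start + duration
        rows.append({
            "instance": instance_id, "activity": name,
            "start_h": round(start, 2), "active_h": active,
            "wait_h": round(duration - active, 2),
            "event_fired": event_fired, "end_h": round(clock, 2),
        })

with open("activity_instances.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=rows[0].keys())
    writer.writeheader()
    writer.writerows(rows)
print(f"wrote {len(rows)} activity-instance rows for Excel")
```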

And once you have all this stuff, you need to show business process analysts how to use it, cookbook-style, for the three simulation use cases I described, maybe others.

I could go on, but you get the point. You say your simulation tool doesn’t provide many of these things? That’s because it’s a fake feature.