A Fresh Look at Simulation

Probably no aspect of BPM has underperformed versus expectations more than simulation. It should be a valuable tool that is commonly used in the course of process analysis... but it's just not. I've been thinking about why that is, and what it would take to make simulation useful in actual practice. It comes down to two basic things: better tools, and better-defined methodologies for deriving useful results from those tools. I haven't tried every simulation tool out there, so there may be some that do what I ask already.

So what's needed in a tool? To me, it goes without saying that the typical user of simulation tools is a business process analyst doing process redesign, not an implementation developer. Right off the bat, that implies several things that disqualify most of today's tools:

1. Simulation should not be embedded in a BPMS. The results of simulation may be what motivates the purchase of a BPMS, but you should not have to buy an automation runtime just to quantify its expected benefits. In fact, I am not sure it should be tied to a particular modeling tool. I'd like a tool that can import any properly serialized BPMN 2.0 XML file in the Analytic subclass. That means it should understand BPMN events, like the probability of a message arrival at a boundary event, and be able to follow error throw-catch. And, as you might expect, I'd like it to recognize my Method and Style conventions, such as matching a subprocess's end states with the gateway that immediately follows it, so that simulated instances that hit the OK end state automatically follow the yes path out of the gateway labeled OK? (The tool I use in my training doesn't do this, and it drives me crazy.)
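
To make the import requirement concrete, here is a minimal Python sketch of reading serialized BPMN 2.0 XML with nothing but the standard library, pulling out the sequence flows a simulator would have to walk. The namespace URI is the real one from the BPMN 2.0 spec; the sample process, its element ids, and the `load_flow_graph` helper are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Real BPMN 2.0 model namespace; the sample below is made up.
BPMN_NS = "{http://www.omg.org/spec/BPMN/20100524/MODEL}"

SAMPLE = """<definitions xmlns="http://www.omg.org/spec/BPMN/20100524/MODEL">
  <process id="loanProcess">
    <startEvent id="start"/>
    <userTask id="review" name="Review Application"/>
    <exclusiveGateway id="ok" name="OK?"/>
    <endEvent id="end"/>
    <sequenceFlow id="f1" sourceRef="start" targetRef="review"/>
    <sequenceFlow id="f2" sourceRef="review" targetRef="ok"/>
    <sequenceFlow id="f3" sourceRef="ok" targetRef="end" name="yes"/>
  </process>
</definitions>"""

def load_flow_graph(xml_text):
    """Return {source node id: [(flow name, target id), ...]}."""
    root = ET.fromstring(xml_text)
    graph = {}
    for proc in root.iter(BPMN_NS + "process"):
        for flow in proc.iter(BPMN_NS + "sequenceFlow"):
            src = flow.get("sourceRef")
            graph.setdefault(src, []).append(
                (flow.get("name"), flow.get("targetRef")))
    return graph

graph = load_flow_graph(SAMPLE)
print(graph["ok"])  # outgoing flows of the OK? gateway
```

A real tool obviously needs far more than this (event semantics, gateway logic, Method and Style end-state matching), but the point stands: the Analytic-subclass serialization is plain XML that any standalone tool can consume, no BPMS required.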

2. Simulation should not require programming, not even JavaScript. Yes, you need expressions to set simulation parameters, but point-and-click expression builders are fine for that.

3. Simulation should address the kinds of structural improvements most common in process redesign, e.g. Lean principles, not just labor utilization efficiency in heads-down workflows. Almost all of the simulation tools I have looked at seem designed to solve the process bottleneck problem: How many widget-tweakers do we need at the widget-twiddling step to relieve the bottleneck? Optimizing that kind of heads-down work is, of course, a possible use of simulation, but it is rarely what process improvement folks are worried about. They are thinking about things like how much checking and rechecking is needed, and should it be up front or at the end of the chain? The tool needs to understand the difference between waiting time (or lead time), which does not consume the resource, and active time, which does. Rework is a key issue; if your model loops back, can simulation allow different parameter values for the second time around? And what is the effect of assigning this task to role A vs role B or some outsourced service provider? There are time, cost, and quality tradeoffs... but most simulation tools cannot expose them.
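
As a toy illustration of the waiting-vs-active distinction and of rework with different second-pass parameters, here is a Monte Carlo sketch in Python; every number in it (queue times, touch times, rework rates) is invented:

```python
import random

random.seed(42)

def simulate_instance():
    wait = 0.0    # queue/lead time: the clock runs but no resource is consumed
    active = 0.0  # touch time: the resource is consumed
    first_pass = True
    while True:
        wait += random.uniform(2, 6)  # hours sitting in a queue
        # rework passes are typically faster than the first pass
        active += random.uniform(1, 3) if first_pass else random.uniform(0.5, 1)
        # ...and less likely to fail review again
        rework_prob = 0.30 if first_pass else 0.10
        if random.random() >= rework_prob:
            return wait, active
        first_pass = False

results = [simulate_instance() for _ in range(10_000)]
avg_wait = sum(w for w, _ in results) / len(results)
avg_active = sum(a for _, a in results) / len(results)
print(f"avg wait {avg_wait:.1f}h, avg active {avg_active:.1f}h")
```

Even this trivial model surfaces the Lean point: most of the cycle time here is waiting, not work, and a tool that only reports resource utilization would never show you that.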

4. Simulation should provide useful cost and quality metrics, in addition to time-based metrics. Time-based metrics come essentially for free with simulation, but cost and quality metrics, if you can get them at all, require processing the simulation output through Excel or database queries. If I have multiple scenarios that differ in terms of process structure and task assignment logic, I'd like to see not only differences in cycle time but differences in activity-based costing (including indirect costs) and differences in quality or risk metrics as well. The notion of process (and subprocess) end states plays a key role here, as it does generally in BPMN Method and Style. Here simulation methodology interacts with modeling methodology.
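
A sketch of the kind of post-processing I mean, rolling simulated per-instance output up into activity-based cost and an end-state quality metric; the roles, rates, overhead multiplier, and instance records here are all hypothetical:

```python
# Hypothetical simulation output: per-instance end state plus active
# hours by role. Rates and the indirect-cost multiplier are invented.
RATE = {"analyst": 60.0, "reviewer": 90.0}  # $/active hour
OVERHEAD = 1.25                             # indirect cost multiplier

instances = [
    ("Approved",  {"analyst": 2.0, "reviewer": 0.5}),
    ("Approved",  {"analyst": 1.5, "reviewer": 0.5}),
    ("Escalated", {"analyst": 3.0, "reviewer": 2.0}),
    ("Rejected",  {"analyst": 1.0, "reviewer": 0.0}),
]

def cost(hours):
    """Activity-based cost of one instance, including indirect costs."""
    return OVERHEAD * sum(RATE[r] * h for r, h in hours.items())

avg_cost = sum(cost(h) for _, h in instances) / len(instances)
ok_rate = sum(s == "Approved" for s, _ in instances) / len(instances)
print(f"avg cost ${avg_cost:.2f}, approved {ok_rate:.0%}")
```

Roll-ups like this, computed per scenario, are what the tool itself should report, not what you should have to rebuild by hand in Excel for every run.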

5. Simulation should actively assist the user with setting model parameters to match known aggregated metrics of the as-is process. One thing almost everyone can agree on about simulation is Garbage In, Garbage Out. So the first thing you MUST do is accurately model the as-is process and make sure that simulation matches known metrics in the aggregate. Knowing the outputs (e.g., 75% of loan applications of a certain type are approved, and it takes 3 days plus or minus 25%) does not mean it is easy for the user to set the branching ratios at every gateway, the probabilities at every event, or even the mean duration of each specific task. The tool should help you do that, at least as a starting point. Once you have a set of reasonable simulation parameters for an as-is model that matches known aggregate results, you need to justify any change to each parameter when you model the to-be.
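
Here is a sketch of what that assistance could look like in the simplest possible case: solving for a single unknown gateway probability so the simulated aggregate matches a known 75% approval rate. The process shape and all the numbers are invented, and a real tool would have to fit many coupled parameters at once rather than bisect on one:

```python
import random

# Known as-is aggregates (invented for illustration)
TARGET_APPROVAL = 0.75
P_ESCALATE = 0.20           # share of cases escalated for extra review
P_ESCALATED_APPROVED = 0.5  # approval rate of escalated cases

def approval_rate(p_direct_approve, n=20_000, seed=1):
    """Simulated aggregate approval rate for a candidate parameter."""
    rng = random.Random(seed)  # common random numbers -> monotone in p
    approved = 0
    for _ in range(n):
        if rng.random() < P_ESCALATE:
            approved += rng.random() < P_ESCALATED_APPROVED
        else:
            approved += rng.random() < p_direct_approve
    return approved / n

# Bisect on the unknown branch probability at the direct-approval gateway
lo, hi = 0.0, 1.0
for _ in range(25):
    mid = (lo + hi) / 2
    if approval_rate(mid) < TARGET_APPROVAL:
        lo = mid
    else:
        hi = mid
print(f"calibrated branch probability ~ {mid:.3f}")
```

Analytically the answer is (0.75 - 0.2 * 0.5) / 0.8 = 0.8125, and the search lands close to it; the value of a tool doing this for you shows up when there are dozens of gateways and events to calibrate, not one.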

There is a lot more detail behind each of these, but this is a good place to stop. I am currently working with a tool in development that tries to do it right. I'll be writing more about this topic as the tool matures.