Fulfilling the Promise of Process Simulation

First, let's deconstruct the promise. As a management discipline, BPM asserts that measures of process improvement should be quantified in the process model and then measured in the implementation. A key assumption of process modeling is that by making reasonable estimates of a few parameters for each step in a business process, a business analyst can estimate the values of key performance indicators even without knowing the technical details of the process implementation.

Those KPIs typically represent one of three types of measurements: times, costs, or some quality metric such as defects. Incremental changes to the process model will generally make some KPIs better and make others worse, and tradeoffs between cycle times, process costs, and defects are to be expected. Thus, the "optimum" process design and configuration is a matter of business judgment.

In process modeling, each activity is assigned to a defined resource, which could be a system or a human task role or workgroup. In BPMS, simulation analysis most often comes into play when a pool of workers performs tasks from a shared queue. In that analysis, the two critical simulation parameters are the average task duration and the hourly cost of the assigned resource. If you configure more resources for the task, backlogs (and hence cycle time) will decrease but costs will increase.

In addition, process models describe branch points in the flow. At each such branch, some of the instances follow one path, and some follow other paths. Thus a third basic simulation parameter is the fraction of instances following each path out of a branch point in the flow. Add to this a fourth parameter, the fixed costs of a process activity (other than those captured in the resource cost), and you've pretty much exhausted the capabilities of most BPMS simulation tools.
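To make those four parameters concrete, here is a minimal sketch of how they combine into an expected cost per instance for a single step followed by one branch point. The `Activity` class, the activity names, and all the numbers are hypothetical illustrations, not taken from any particular BPMS tool.

```python
from dataclasses import dataclass


@dataclass
class Activity:
    """One process step described by the basic simulation parameters."""
    name: str
    minutes: float            # average task duration
    hourly_rate: float        # hourly cost of the assigned resource
    fixed_cost: float = 0.0   # per-execution cost not captured in the resource rate

    def cost(self) -> float:
        # Resource cost for the time spent, plus the fixed cost of the activity.
        return self.minutes * self.hourly_rate / 60 + self.fixed_cost


def expected_cost(first: Activity, branches: list[tuple[float, Activity]]) -> float:
    """Expected cost of `first` followed by a branch point.

    `branches` pairs the fraction of instances taking a path with the
    activity on that path; the fractions must sum to 1.
    """
    total_fraction = sum(fraction for fraction, _ in branches)
    if abs(total_fraction - 1.0) > 1e-9:
        raise ValueError("branch fractions must sum to 1")
    return first.cost() + sum(fraction * act.cost() for fraction, act in branches)


# Hypothetical example values, chosen only for illustration.
review = Activity("Review order", minutes=10, hourly_rate=30.0, fixed_cost=1.0)
approve = Activity("Approve", minutes=5, hourly_rate=30.0)
escalate = Activity("Escalate", minutes=30, hourly_rate=60.0, fixed_cost=5.0)

cost_per_instance = expected_cost(review, [(0.8, approve), (0.2, escalate)])
print(f"Expected cost per instance: {cost_per_instance:.2f}")
```

Note that nothing here needs a simulation engine: with only these four parameters, the result is a weighted average that a spreadsheet could produce.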

The first question to ask is what value a simulation engine adds to simple static analysis, such as you could do with Excel or even pencil and paper. Let's say Task A takes 15 minutes and you have on average 20 instances an hour. It doesn't take simulation analysis to know that, on average, 5 workers can handle the volume and that adding more will decrease occasional backlogs but add some cost and unutilized capacity.
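The static arithmetic for Task A can be written out in a few lines. This is only a sketch of the back-of-the-envelope calculation described above; the function names are made up for the illustration.

```python
import math


def workers_needed(arrivals_per_hour: float, task_minutes: float) -> int:
    """Smallest whole number of workers whose capacity covers the average load."""
    workload_minutes_per_hour = arrivals_per_hour * task_minutes
    return math.ceil(workload_minutes_per_hour / 60)


def utilization(arrivals_per_hour: float, task_minutes: float, workers: int) -> float:
    """Average fraction of the workers' time spent on the task."""
    return arrivals_per_hour * task_minutes / (60 * workers)


# Task A: 15 minutes per instance, 20 instances an hour on average.
needed = workers_needed(20, 15)
print("Workers needed on average:", needed)
print("Utilization with one extra worker:", round(utilization(20, 15, needed + 1), 3))
```

Twenty instances at 15 minutes each is 300 minutes of work per hour, which is exactly five workers at full utilization. A sixth worker drops average utilization to about 83%, which is the unutilized capacity you pay for to absorb occasional backlogs.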

But if you know not only the average volume of process instances but their profile (for example, the fact that 40% of the weekly volume is generated at 8am on Monday morning), you can get a more detailed picture using simulation analysis. For example, what is the maximum expected backlog at Task A, when does it occur, and what are its effects on service level agreements for the process as a whole? These results require dynamic analysis that only a simulation engine can provide.
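A toy version of that dynamic analysis is sketched below: a single shared queue served by identical workers, with fixed task durations. It is a deliberately simplified, deterministic model, and the two arrival profiles (a level stream and a front-loaded burst over one 8-hour day) are assumptions chosen to keep the same average volume as the Task A example. A real simulation engine would add random durations, shifts, and calendars.

```python
import heapq


def simulate_queue(arrival_times, workers, task_minutes):
    """Simulate a FIFO shared queue served by `workers` identical workers.

    Returns (max_backlog, time_of_max_backlog, max_wait), where backlog is
    the number of instances that have arrived but not yet been started.
    Times are in minutes; time_of_max_backlog is None if nothing ever waits.
    """
    free_at = [0.0] * workers      # min-heap: when each worker next becomes free
    heapq.heapify(free_at)
    starts = []                    # start time of each instance, in arrival order
    max_backlog, peak_time, max_wait = 0, None, 0.0

    for t in sorted(arrival_times):
        earliest = heapq.heappop(free_at)
        start = max(t, earliest)   # wait if no worker is free yet
        heapq.heappush(free_at, start + task_minutes)
        starts.append(start)
        max_wait = max(max_wait, start - t)

        # Instances that have arrived by time t but start later are waiting.
        waiting = sum(1 for s in starts if s > t)
        if waiting > max_backlog:
            max_backlog, peak_time = waiting, t

    return max_backlog, peak_time, max_wait


# Same average volume (160 instances in 480 minutes), two different profiles.
level = [3 * k for k in range(160)]                  # one arrival every 3 minutes
burst = [0] * 64 + [5 * k for k in range(1, 97)]     # 40% arrive at opening time

print("level profile:", simulate_queue(level, workers=5, task_minutes=15))
print("burst profile:", simulate_queue(burst, workers=5, task_minutes=15))
```

Static analysis sees the same 20 instances an hour in both cases and reports that five workers suffice. Only the time-ordered run shows that the burst profile builds a substantial backlog and long waits, while the level profile never queues at all.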

Unfortunately, that added value requires simulation capabilities that frequently do not exist in BPMS modeling tools, such as the ability to model instance creation profiles (not just volume) and resource schedules (shifts, holiday calendars, time required for other activities). Also, the simulation model should be able to account for resource contention among multiple concurrent processes; only a few BPMS tools can.

Thus, problem number one for simulation in BPMS is the inability to adequately describe the dynamic nature of instance creation and task performance, making it often little better than simple static analysis.

The second problem is that simulation in BPMS does not account for differences among process instances unless they result in different paths in the flow diagram. This is perhaps the fundamental distinction between real discrete event simulation engines and their "lite" counterparts in BPMS.

Consider, for example, an order handling process. Even if a step in the process is independent of who the customer is or the content of the order, the pool of resources available to handle it (and their associated costs), as well as the task's duration and fixed costs, may be customer- or order-dependent. Moreover, the path followed at any branch point in the flow may be customer- or order-dependent as well, not, as BPMS assumes, a random probability.

This can be a slippery slope, since you don't want to require defining an entire data model (plus business rules) to do process modeling and simulation. But simulation models should provide at least a few user-defined instance properties and allow them to be referenced in simulation parameters. Instance properties in the simulation would also allow the results to be used for activity-based costing as well as for modeling the tradeoff between time, cost, and defects. For example, an instance might be rerouted to a secondary resource pool with a higher defect rate in order to meet a service level agreement. Instance properties allow simulation to capture quality metrics as well as the costs of correcting defects downstream.
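As a rough sketch of what a few user-defined instance properties buy you, the example below lets duration, resource cost, and routing depend on the order rather than on fixed averages and random branch fractions. The property names (`tier`, `line_items`, `order_value`), the rules, and the rates are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Instance:
    """A process instance carrying a few user-defined properties."""
    properties: dict


# Simulation parameters expressed as functions of the instance,
# rather than as one constant per activity. All rules are illustrative.
def task_minutes(inst: Instance) -> float:
    return 10 + 2 * inst.properties["line_items"]


def hourly_rate(inst: Instance) -> float:
    # Assumption: premium customers are handled by a more expensive pool.
    return 60.0 if inst.properties["tier"] == "premium" else 30.0


def next_step(inst: Instance) -> str:
    # The branch is decided by the order itself, not by a random draw.
    return "manual_review" if inst.properties["order_value"] > 1000 else "auto_approve"


def cost_by(instances, key: str) -> dict:
    """Activity-based costing: total task cost grouped by an instance property."""
    totals = defaultdict(float)
    for inst in instances:
        totals[inst.properties[key]] += task_minutes(inst) * hourly_rate(inst) / 60
    return dict(totals)


orders = [
    Instance({"tier": "premium", "line_items": 5, "order_value": 2500}),
    Instance({"tier": "standard", "line_items": 1, "order_value": 80}),
    Instance({"tier": "standard", "line_items": 10, "order_value": 1500}),
]

print("routes:", [next_step(o) for o in orders])
print("cost by tier:", cost_by(orders, "tier"))
```

The point of the design is that the modeler defines only the handful of properties that the parameters actually reference, not a full data model.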

A third problem, usually connected to the process models themselves, concerns incorporation of exception handling into the simulation. What happens, for example, when a customer cancels or changes the order while it is being processed? What happens when a system error occurs and a "business transaction" has to be unwound through an exception process and compensation? In traditional process modeling, these conditions were generally omitted from the process diagram, often considered implementation details. But if you believe the old saw that 80% of the cost comes from 20% of the work (the exceptions), you see that approach cannot lead to an accurate simulation result.

New modeling standards such as BPMN include exception conditions as first-class constructs in the model, represented by so-called intermediate events. Today, few BPMS vendors support these constructs, and even fewer attempt to include them in simulation. An intermediate event attached to Task A means that if the event occurs, Task A is terminated immediately and the exception flow emanating from the BPMN event symbol is triggered. The event could be attached to a subprocess or even to the process as a whole. The simulation parameters for intermediate events would need to include a probability of occurrence, and perhaps a time of occurrence (mean and distribution). Moreover, the simulation engine would need to follow the event-handling logic of the BPMN model, which most simulation models cannot do today.
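Here is one way the simulation parameters for an attached intermediate event might look in practice. It is a sketch under stated assumptions: the event either occurs or not with a given probability, and its time of occurrence is drawn from an exponential distribution with a given mean. Both the choice of distribution and the function names are assumptions for illustration, not something BPMN prescribes.

```python
import random


def run_task_with_event(task_minutes, event_probability, event_mean_minutes, rng):
    """Simulate one execution of a task with an attached intermediate event.

    Returns ("exception", elapsed_minutes) if the event fires before the task
    completes, terminating the task and triggering the exception flow;
    otherwise ("normal", task_minutes).
    """
    if rng.random() < event_probability:
        # Assumed distribution: exponential with the given mean time of occurrence.
        event_time = rng.expovariate(1.0 / event_mean_minutes)
        if event_time < task_minutes:
            return ("exception", event_time)
    # Either no event occurred, or it would have occurred after completion.
    return ("normal", task_minutes)


def exception_rate(runs, task_minutes, event_probability, event_mean_minutes, seed=0):
    """Fraction of simulated executions that end on the exception flow."""
    rng = random.Random(seed)
    outcomes = [
        run_task_with_event(task_minutes, event_probability, event_mean_minutes, rng)
        for _ in range(runs)
    ]
    return sum(1 for kind, _ in outcomes if kind == "exception") / runs


# Hypothetical parameters: a 15-minute task, 10% chance of a cancellation event,
# occurring on average 5 minutes into the task when it does occur.
rate = exception_rate(10_000, task_minutes=15, event_probability=0.10, event_mean_minutes=5)
print(f"Share of instances diverted to the exception flow: {rate:.3f}")
```

A full engine would also have to charge the partial work done before termination and then simulate the exception and compensation flow itself, which is where most of the cost in the 80/20 argument lives.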

For most BPMS vendors, modeling and simulation were an afterthought, since they are primarily in the runtime business. To be fair, striking the right balance is hard. Process modeling is supposed to be one of those anyone-can-do-it steps, and BPMS vendors don't want to create a huge learning curve to get started. But to fulfill the core promise of simulation, which is projecting the KPIs that result from implementing the to-be process, BPMS needs to beef up the tool in these three areas: instance volume and resource availability profiles, instance properties, and handling of events.