Improving Process Model Validation

I spend a lot of my time working on tools to validate BPMN process models. You might ask, don't BPMN tools already do that themselves? The good ones do, but only according to their own interpretation of the rules in the BPMN specification. Unforgivably, in the 7+ years since the publication of BPMN 1.0, the spec has never enumerated its rules in one place, such as an appendix. Instead, a tool vendor must comb through 500 pages of text and tables, plus a UML metamodel and an XML schema, to glean the rules. These four sources of truth are supposed to agree, but they do not in all cases. But this is not the real issue. On most rules of the BPMN spec - for example, a sequence flow may not cross a process boundary - all tools agree, and their validation buttons can check them.
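To make the idea of a statically checkable rule concrete, here is a minimal sketch of the boundary rule just mentioned. The `Node` and `SequenceFlow` types and the function name are hypothetical, invented for illustration; they are not taken from the BPMN schema or from any vendor's tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    id: str
    process: str  # id of the process that contains this node (assumed field)


@dataclass(frozen=True)
class SequenceFlow:
    id: str
    source: str  # id of the source node
    target: str  # id of the target node


def flows_crossing_process_boundary(nodes, flows):
    """Return ids of sequence flows whose two ends lie in different processes."""
    process_of = {n.id: n.process for n in nodes}
    return [f.id for f in flows if process_of[f.source] != process_of[f.target]]


# Tiny made-up model: f2 connects a node in P1 to a node in P2.
nodes = [Node("a", "P1"), Node("b", "P1"), Node("c", "P2")]
flows = [SequenceFlow("f1", "a", "b"), SequenceFlow("f2", "b", "c")]
violations = flows_crossing_process_boundary(nodes, flows)
```

A check like this looks at each flow in isolation, which is why rules of this kind are the easy ones.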

A bigger problem, in my view, is that a model can be perfectly valid according to the rules of the spec without clearly revealing the process logic from the BPMN diagram. Fixing that problem requires what I call "Method and Style", and my BPMN training shows you how to do that. A key part of the training is the post-class certification that tests the student's ability to create "good BPMN" - not only correct, but clear, complete, and consistently structured diagrams. The certification involves, among other things, submission of a process model to me for review. As the training volume has grown, the amount of time I spend grading student certification exercises has, too... so much so, in fact, that I had to develop a tool to validate models against my own Method and Style rules - and require students to use it! - just to make the grading effort manageable.

Originally, the tool was a little cumbersome. You had to save the model in Visio's XML (.vdx) format, upload it to my website, and click to see a list of validation errors. But a year ago, itp commerce coded the rules inside their BPMN Visio add-in, the tool I use in my BPMessentials training, and that made a huge difference. Now, it's one click inside the BPMN editor to validate, so fixing and revalidating is quick and seamless. The percentage of students seeking certification tripled, the volume of errors in the student exercises dropped by about 80%, and my time to review each submission declined by an equal amount. More important, student learning improved dramatically. After the tool tells you two or three times about the same mistake, you stop making that mistake. The instant feedback has dramatic results.

OK, problem solved, right? I still want more. There are still categories of errors that the tool does not catch:

  1. Incorrect merge semantics. An AND gateway join should be OR, or a multi-merge should instead be an AND or OR gateway join.
  2. Deadlocks. An exception on one path in a parallel split-join block causes a deadlock at the join. Usually the fix is to make the exception end state Terminate or Error.
  3. Loop within a loop. A subprocess marked as standard loop contains an explicit gateway loopback in the child level. This is not incorrect per se, but usually is not the modeler's intent.
  4. Labeling errors. My tool today can check whether a label is present, but without understanding English grammar (not to mention other languages) it cannot check that the label is appropriate for an activity (verb-object), message flow (noun), or end event (adjective or noun-adjective phrase).

All four types of errors share a problem: they are hard to catch, and I don't know of any tools that even try. Validation tools typically perform only static analysis, but items 1-3 on the list above require dynamic analysis. That means the validator must trace through the flow, step by step, from start event to end event. Without doing that, it cannot tell what the proper merge gateway should be, or whether an end event should be Terminate, or whether an incoming sequence flow represents a loopback or not. Item 4 requires a tool that can apply grammar rules to node labels. Gero Decker of Signavio says he knows someone doing research in this area, and that is very interesting to me. But I have been working on items 1-3 myself.
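As a rough illustration of what tracing through the flow can mean, the sketch below enumerates every possible run of a small acyclic model and counts how many incoming flows of a merge point carry a token in each run. This is not my tool's algorithm, just one simple way to frame the question. The node kinds (`"xor"`, `"and"`, `"or"`, and so on), the function names, and the sample model are all assumptions made for the example. It deliberately ignores the join behavior that was modeled, because the goal is to find out what the join should be.

```python
from graphlib import TopologicalSorter
from itertools import combinations


def branch_choices(kind, outgoing):
    """All ways tokens can leave a node of the given kind."""
    if kind == "xor" and len(outgoing) > 1:
        return [(f,) for f in outgoing]  # exactly one branch
    if kind == "or" and len(outgoing) > 1:
        return [c for r in range(1, len(outgoing) + 1)
                for c in combinations(outgoing, r)]  # any non-empty subset
    return [tuple(outgoing)]  # AND split, task, event: every outgoing flow


def scenarios(node_type, flows):
    """Yield the set of token-carrying flows for each possible run (acyclic models only)."""
    preds = {n: set() for n in node_type}
    for source, target in flows:
        preds[target].add(source)
    order = list(TopologicalSorter(preds).static_order())

    def walk(i, active):
        if i == len(order):
            yield active
            return
        node = order[i]
        incoming = [f for f in flows if f[1] == node]
        outgoing = [f for f in flows if f[0] == node]
        reached = not incoming or any(f in active for f in incoming)
        if not reached:
            yield from walk(i + 1, active)
            return
        for choice in branch_choices(node_type[node], outgoing):
            yield from walk(i + 1, active | set(choice))

    yield from walk(0, frozenset())


def required_join(node_type, flows, node):
    """Classify the merge behavior that 'node' needs, based on all runs."""
    incoming = [f for f in flows if f[1] == node]
    counts = {sum(f in run for f in incoming) for run in scenarios(node_type, flows)}
    counts.discard(0)  # runs that never reach the node tell us nothing
    if not counts:
        return None
    if counts == {1}:
        return "xor"  # only one path ever arrives: no synchronization needed
    if counts == {len(incoming)}:
        return "and"  # every path always arrives
    return "or"       # the number of arriving paths varies from run to run


# Parallel block in which one branch may leave through an exception end event.
node_type = {"start": "start", "split": "and", "A": "task", "B": "task",
             "check": "xor", "failed": "end", "join": "and", "done": "end"}
flows = [("start", "split"), ("split", "A"), ("split", "B"), ("A", "check"),
         ("check", "join"), ("check", "failed"), ("B", "join"), ("join", "done")]
needed = required_join(node_type, flows, "join")
```

In this sample the join is modeled as AND, but in the run where the exception path is taken only one of its two incoming flows ever receives a token, so the sketch reports that AND is not safe there. That is the deadlock of item 2. The sketch does not model Terminate or Error end events, so it cannot confirm the usual fix.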

Actually, I am really close. I can find the proper merge type required, but I am still working on how that would change after an end event that should be Terminate is fixed. Also, certain merge patterns are just plain illegal (ambiguous semantics) in BPMN, e.g. looping back from one branch of a split-join block to a point before the split, and I have not finished identifying all of them.
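One ingredient of this analysis, deciding whether a sequence flow is a loopback, can be sketched with a standard depth-first search: a flow that points back to a node still on the current path closes a cycle. The function name and the sample model below are hypothetical, and this is only the textbook back-edge test, not a detector for the illegal merge patterns themselves.

```python
def loopback_flows(flows, start):
    """Return the flows that close a cycle when walking depth-first from start."""
    successors = {}
    for source, target in flows:
        successors.setdefault(source, []).append(target)

    on_path = set()   # nodes on the path currently being explored
    done = set()      # nodes whose successors are fully explored
    loopbacks = []

    def visit(node):
        on_path.add(node)
        for nxt in successors.get(node, []):
            if nxt in on_path:
                loopbacks.append((node, nxt))  # back edge: flow returns upstream
            elif nxt not in done:
                visit(nxt)
        on_path.discard(node)
        done.add(node)

    visit(start)
    return loopbacks


# Made-up model: the gateway "gw" either proceeds to "end" or loops back to "A".
flows = [("start", "A"), ("A", "gw"), ("gw", "end"), ("gw", "A")]
found = loopback_flows(flows, "start")
```

The recursion is fine for diagrams of ordinary size. A very deep model would call for an explicit stack instead.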

I fantasize about the day when process model validation can be 99% automated: when BPMN spec errors and all Method and Style errors, including those requiring dynamic analysis and grammar rules, are not only identified but (if the user accepts) automatically corrected in the tool. That's what I'm working on (and I'm not an IBMer). The only thing left then is the business logic... which, at this level, is much less often a major problem.

This work parallels the application of machine scoring to student essays. As reported by USA Today, in 2011 "the Educational Testing Service presented evidence that a pilot test of automated grading of freshman writing placement tests at the New Jersey Institute of Technology showed that computer programs can be trusted with the job. The NJIT results represent the first 'validity testing' — in which a series of tests are conducted to make sure that the scoring was accurate — that ETS has conducted of automated grading of college students' essays. Based on the positive results, ETS plans to sign up more colleges to grade placement tests in this way — and is already doing so." The machine scoring emphasizes sentence structure more than coherent meaning, but since the human and machine scores mostly agree, it seems that simple things like spelling and grammar are the main problems with student essays... just as I have found with student BPMN exercises.

You don't need to wait for my next generation tools. Most problems are caught by today's version. If you are interested in learning how to create "good BPMN" - and get access to automated Method and Style validation - there is still space in my next live-online class, June 26-28 from 11am to 4pm ET each day. Here's the link for more info, and the link to sign up.