Visio Premium's BPMN template is a drawing tool for BPMN diagrams, not a true modeling tool that internally understands the BPMN metamodel. It does provide a validation feature that uses Visio 2010's new Validation API to check the diagram against the rules of the BPMN 1.2 spec, and that is helpful. But deep down, the "model" is just shapes on pages. Nevertheless, it is possible to generate from the diagram a true BPMN 2.0 model structure and serialize it appropriately in XML. This post explains the issues involved.
Generating the model structure and BPMN 2.0 XML export requires two things: 1. Determining the BPMN element represented by each shape, and 2. Determining the relationships between the shapes. Those relationships include connections (e.g., sequence flows), attachments (e.g., boundary events), containment (e.g., pools and lanes), and hyperlinks (e.g., collapsed subprocess to child-level page).
Both of those sound straightforward, but neither is as simple as it seems.
The template provides several BPMN stencils, each a set of masters, BPMN element shapes you can drag out onto the page. Some masters are generic, like gateway or event, and others are specific, like AND gateway or Message start event. Actually, the BPMN element represented by a shape on the page is dynamic, changeable after placement on the page.
Masters are arranged in families called categories, and the modeler can reconfigure the shape to mean any BPMN element in the category at any time. For example, you could drag out a Timer start event from the stencil and later right-click the shape and change it to a Message end event. You can even change a task to a subprocess, even though they are different categories. The shape's master does not change, but some internal properties do. Discovering those properties and how they are exposed in the Visio XML (.vdx) file was not easy. Microsoft might dispute this, but I would call it completely undocumented. (The Visio 2010 object model and API are documented, but not so much the BPMN template. Besides, my interest was in creating XSLT 2.0 rather than a VBA plug-in to Visio, so I focused on the vdx rather than the binary object model.)
Underneath each shape in any Visio diagram is a ShapeSheet, an Excel-like worksheet filled with shape properties and formulas for their values. Any edit to the Visio diagram executes all the ShapeSheets, and this is what gives Visio its power and dynamic character. ShapeSheet property values are serialized in the vdx, in addition to being accessible through the programming API.
BPMN Element Determination
To determine the BPMN element represented by a shape, you need to do the following:
1. Get the shape's master.
2. Get that master's category. This determines the family of BPMN elements and the Visio properties available to determine the specific BPMN element. In the Event category, for example, such properties include BpmnEventType and BpmnTriggerOrResult.
3. Look for the values of the BPMN properties in the shape element. If they are missing there, you need to look for them in the master element.
4. Most of the BPMN-related properties you need are enumerated values, what Visio calls fixed list type. The value is an index to the list of text values allowed for that property, and this allows the text value for a particular index to change with the user's language. If you want your BPMN export to work with Visio in different languages, you need to use the index value (an integer) not the text value. Both are available in the vdx.
5. In some cases you also need to consider relationships to other shapes in order to determine the BPMN element. Examples are intermediate catch event vs boundary event, or pool vs lane.
Connections
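The property lookup described above (shape first, then master, comparing list indexes rather than text) can be sketched in a few lines of Python. This is only an illustration, not the XSLT used for the actual export. The sample XML is a simplified, namespace-free stand-in for a vdx fragment, the master name and values are made up, and the idea that the index appears inside an INDEX(n, ...) formula on the value is an assumption that should be checked against a real file. The category lookup from step 2 is left out.

```python
import re
import xml.etree.ElementTree as ET

# Simplified, namespace-free stand-in for a vdx fragment. The element and
# attribute layout here is an assumption for illustration, not the real schema.
SAMPLE = """
<VisioDocument>
  <Masters>
    <Master ID="2" NameU="Start Event">
      <Prop NameU="BpmnEventType">
        <Value F="INDEX(0,Prop.BpmnEventType.Format)">Start</Value>
      </Prop>
      <Prop NameU="BpmnTriggerOrResult">
        <Value F="INDEX(0,Prop.BpmnTriggerOrResult.Format)">None</Value>
      </Prop>
    </Master>
  </Masters>
  <Shapes>
    <Shape ID="7" Master="2">
      <Prop NameU="BpmnTriggerOrResult">
        <Value F="INDEX(1,Prop.BpmnTriggerOrResult.Format)">Message</Value>
      </Prop>
    </Shape>
  </Shapes>
</VisioDocument>
"""

# Assumption: the list index is the first argument of an INDEX() formula.
INDEX_RE = re.compile(r"INDEX\((\d+),")


def prop_index(element, name):
    """Return the list index of property `name` on `element`, or None if not found."""
    for prop in element.findall("Prop"):
        if prop.get("NameU") == name:
            value = prop.find("Value")
            if value is None:
                return None
            match = INDEX_RE.match(value.get("F", ""))
            return int(match.group(1)) if match else None
    return None


def bpmn_property(doc, shape, name):
    """Look on the shape first, then fall back to the shape's master."""
    index = prop_index(shape, name)
    if index is None:
        master = doc.find(f"Masters/Master[@ID='{shape.get('Master')}']")
        if master is not None:
            index = prop_index(master, name)
    return index


doc = ET.fromstring(SAMPLE)
shape = doc.find("Shapes/Shape[@ID='7']")
event_type = bpmn_property(doc, shape, "BpmnEventType")        # inherited from the master
trigger = bpmn_property(doc, shape, "BpmnTriggerOrResult")     # overridden on the shape
```

Working with the integer index keeps the mapping to BPMN element names independent of the language Visio is running in.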
Sequence flows, message flows, and associations are Visio shapes in the same master category, so you can change one type into another by right-clicking. To determine the type of connector, you need to follow the procedure described earlier for determining the BPMN element type. All connectors also require references to the BPMN elements at their source (tail) and target (head). To get that, you need to look at a table called Connects, which provides the ID of the shapes attached to the connector head and tail.
Containment
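Before getting into containment, here is a minimal sketch of the Connects lookup described above. The attribute names (FromSheet for the connector, ToSheet for the attached shape, FromCell of BeginX or EndX for tail and head) are my understanding of how Visio serializes the table and should be treated as assumptions. The sample is a namespace-free stand-in, not a real vdx page.

```python
import xml.etree.ElementTree as ET

# Simplified stand-in for one page's Connects table (assumed attribute names).
SAMPLE = """
<Page>
  <Connects>
    <Connect FromSheet="5" FromCell="BeginX" ToSheet="1"/>
    <Connect FromSheet="5" FromCell="EndX" ToSheet="3"/>
    <Connect FromSheet="6" FromCell="BeginX" ToSheet="3"/>
  </Connects>
</Page>
"""


def connector_ends(page):
    """Map each connector's shape ID to the IDs of its source and target shapes."""
    ends = {}
    for connect in page.findall("Connects/Connect"):
        entry = ends.setdefault(
            connect.get("FromSheet"), {"source": None, "target": None}
        )
        if connect.get("FromCell") == "BeginX":    # tail of the connector
            entry["source"] = connect.get("ToSheet")
        elif connect.get("FromCell") == "EndX":    # head of the connector
            entry["target"] = connect.get("ToSheet")
    return ends


ends = connector_ends(ET.fromstring(SAMPLE))

# Connectors with a missing end are the ones Visio did not recognize as attached.
dangling = [cid for cid, entry in ends.items() if None in entry.values()]
```

In this sample, connector 5 runs from shape 1 to shape 3, while connector 6 has no recorded head and would need to be flagged rather than exported as a flow.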
Visio 2010 introduced the notion of container shapes. Containment relationships are critical to generating the BPMN structure - lanes in a pool, shapes in a lane, etc. The vdx exposes them through the Layout/Relationships element, new in Visio 2010. To get a list of shapes in a container, you need to do the following:
1. Get the container shape element.
2. Get the formula (@F) attribute of the shape's Layout/Relationships element.
3. Parse the formula string to get a list of shape references. The shape reference could either be [masterNameU] or [masterNameU.n], where masterNameU is the Universal Name of the shape's master, and n is the shape ID. Shape ID is unique on a page, but not unique in the Visio document. Just to make it nasty, some masterNameU values themselves have .n suffix. Each shape does have a true globally unique ID, but Visio doesn't appear to use it internally.
4. For each shape reference, get the shape element.
Pools and Lanes
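Before turning to pools and lanes, the reference-parsing step above can be sketched as follows. The exact formula syntax is not spelled out in this post, so the sketch assumes each reference appears as `Name!SheetRef()` inside the formula; the master names are hypothetical. The one point it is meant to show is how to handle a master name that itself ends in .n: check the whole reference against the known master names before splitting off a numeric suffix.

```python
import re

# Assumption: each shape reference in the formula looks like  Name!SheetRef()
REF_RE = re.compile(r"([^,()!]+)!SheetRef\(\)", re.IGNORECASE)


def split_reference(ref, master_names):
    """Split a reference into (masterNameU, shape ID or None)."""
    if ref in master_names:
        # The whole string is a master name, even if it happens to end in .n
        return ref, None
    base, dot, suffix = ref.rpartition(".")
    if dot and suffix.isdigit() and base in master_names:
        return base, int(suffix)
    raise ValueError(f"unrecognized shape reference: {ref!r}")


def contained_references(formula, master_names):
    """Return the (masterNameU, shape ID) pairs referenced by a container formula."""
    return [
        split_reference(ref.strip(), master_names)
        for ref in REF_RE.findall(formula)
    ]


# Hypothetical master names; "Lane.2" stands for a master whose name ends in .n
masters = {"Task", "Lane.2"}
formula = (
    "SUM(DEPENDSON(1,Task!SheetRef(),Task.12!SheetRef(),"
    "Lane.2!SheetRef(),Lane.2.15!SheetRef()))"
)
refs = contained_references(formula, masters)
```

Because shape IDs are unique only within a page, the resulting pairs have to be resolved against the shapes of the same page as the container, never against the whole document.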
Pools and lanes pose a special challenge. The BPMN structure for a page with two pools is very different from one with two lanes. But in the stencil, pool and lane share a common shape. That stencil shape represents 5 different masters - CFFContainer, Pool/Lane, SwimlaneList, PhaseList, and Separator. When you drag it onto an empty page, it generates all 5 shape elements, even though it looks like a single shape. Those shape elements are related to each other by the aforementioned containment formulas and shape references. But when you drag the pool/lane stencil shape onto an existing pool, it just adds a single lane to the pool. Combining pool and lane in this way allows Visio to leverage some existing drawing behavior of Cross Functional Flowcharts, but it complicates the BPMN export. For example, what looks like the pool label in the diagram could be the name of the pool shape or the name of the lane shape, depending on how many lanes are drawn. Also, containment of elements in a pool is indirect, via the SwimlaneList shape.
To resolve the ambiguities, certain conventions must be observed. While BPMN allows lanes without enclosing them in a pool, that is not allowed in Visio Premium BPMN. What looks like a pool with no lanes in Visio will generate a pool and a single lane in the BPMN, as will a single visible lane inside a pool shape. (The latter is tricky to draw in Visio.) The modeler must take care to put all the flow nodes inside the pool shape. If one is drawn carelessly beyond the edge of the pool, such that Visio does not recognize the containment, it will generate an invisible "main pool" (in BPMN 2.0, that is another process and participant)... and probably also validation errors.
Hierarchical Modeling
One of the nice things about Visio Premium BPMN is its support for the hierarchical modeling style, in which a collapsed subprocess is expanded on a separate hyperlinked page. In the vdx, the collapsed subprocess has a special hyperlink property pointing to the Name of the child page (tab name in Visio). Unambiguous serialization from the diagram depends on strict distinction between child-level and top-level pages in the Visio file. A child-level page - one pointed to by a subprocess hyperlink - may not contain flow nodes of any other process. Visio lets you draw them, but it will not serialize properly. A child page may contain black-box pools, but not, for example, another white-box pool exchanging message flows with the child-level elements. Elements of such a second process create ambiguity in the serialization: which process is the hyperlink pointing to as the child level? Note: In the BPMN 2.0 XML, the pointer goes the other way, from the child page (in the graphical model) to the subprocess (in the semantic model). But the ambiguity is the same.
Dealing with Modeler Errors
Most modeler errors do not affect the model structure or serialization, and can be flagged by validation of the generated BPMN 2.0 XML. I have created an XSLT 2.0-based tool that does the BPMN 2.0 XML export and then validation (including my Method and Style rules) in a single operation. However, Visio lets you draw certain BPMN models that are not serialized properly and result in possibly spurious validation errors:
1. When a child page contains elements of a second process, or when the link between parent and child level pages is broken (I see the latter sometimes, which seems like a Visio bug).
2. When multiple Visio Pool/Lane shapes, not enclosed in CFF Container, are intended to mean multiple lanes in a single invisible pool.
3. When a Link event pair is used as an off-page connector. This is legal in BPMN but not supported in my serialization. (I'm not sure if Visio supports it, either, since it is doubtful that a subprocess in Visio could point to multiple sibling child-level pages connected by Link events. In BPMN 2.0, the link is the other way, from each page to the subprocess, so conceptually possible there.)
4. When a Visio page is used as documentation and any BPMN shapes are used on that page.
5. When the modeler's intended containment, connection, and attachment relationships are not recognized by Visio. Improper connection or containment can generate strange model structures.
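The broken-link part of the first case lends itself to a mechanical check before serialization. The sketch below is a minimal illustration in Python, not the XSLT tool itself. It assumes the page names and the subprocess hyperlink targets have already been extracted from the vdx; the function name, the tuple layout, and the sample names are all hypothetical.

```python
def classify_pages(page_names, subprocess_links):
    """Separate top-level pages from child-level pages and report broken links.

    subprocess_links is a list of (parent page, subprocess name, target page name)
    tuples, one per collapsed subprocess that carries a hyperlink.
    """
    child_pages = set()
    broken = []
    for parent, subprocess, target in subprocess_links:
        if target in page_names:
            child_pages.add(target)       # some subprocess points at this page
        else:
            broken.append((parent, subprocess, target))
    top_level = [name for name in page_names if name not in child_pages]
    return top_level, sorted(child_pages), broken


# Hypothetical document: three pages, two collapsed subprocesses on "Main"
page_names = ["Main", "Fulfill Order", "Notes"]
links = [
    ("Main", "Fulfill Order", "Fulfill Order"),
    ("Main", "Ship", "Shipping"),          # no page has this name
]
top_level, child_pages, broken = classify_pages(page_names, links)
```

Here the "Ship" subprocess points to a page name that does not exist, so it is reported instead of silently producing a subprocess with no child-level content. A page that no subprocess points to, like "Notes", is treated as top-level.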
I have created an online tool that performs the BPMN 2.0 serialization (of the Analytic subclass elements supported by Visio Premium) and, optionally, Method and Style validation as well. The transform operates on an uploaded vdx from the Visio Premium BPMN template. It uses the Altova XSLT 2.0 engine through an iFrame. I am not too happy with the web app implementation, which appears to fail without warning if the user has any popups blocked. (If you know C#, maybe you could help me with that.) With that disclaimer, if you are interested in trying it out, let me know. It needs special permission to access at the moment.