As a result of some spirited back-and-forth on my previous post with FTF member Camunda, plus a response from Oracle (an original BPMN 2.0 Examples team member), I have a bit more information on the intent and usage of process dataInput in BPMN 2.0, and on whether it can have incoming and/or outgoing data associations.
To recap:
- The BPMN 2.0 metamodel (UML) allows dataAssociation between any item-aware elements, which include dataInput, dataOutput, dataObject, property, and dataStore. Camunda uses this to justify dataInput of a process as the source of dataInputAssociation to the dataInput of a process activity. However…
- The metamodel UML is further constrained by statements in the narrative, for example that a subprocess dataInput cannot have a dataInputAssociation (although the UML and XSD allow it). Camunda says the UML (“for implementers”) overrides the spec narrative (“for end users”). Hmmm… that last bit is doubtful.
- Scope of a data association was intended to be local to an activity or event, with the dataInput or dataOutput of the activity or event on one end, and a stored element – either a dataObject, dataStore, or property (or references to same) – at the other end. The statement allowing data association between “any” pair of item-aware elements was supposed to eliminate verbose enumeration of all the combinations in the narrative… although that has obviously introduced ambiguity.
- According to Oracle, dataInput of a process was intended only to support invocation by callActivity. If a process is instantiated by a message start event, the data association should be from the start event to a dataObject and from there to an activity dataInput. But…
- Spec p213 says: “If the Process is being called from a Call Activity, the Data Associations that target the Data Inputs of the Call Activity in the underlying model MAY be visualized such that they connect to the corresponding Data Inputs of the called Process, visually crossing the Call Activity boundary. But note that this is visualization only. In the underlying model, the Data Associations target the Data Inputs of the Call Activity and not the Data Inputs of the called Process.” In other words, if you represent a called process using the expanded callActivity shape, you can draw a data association [e.g., from a dataObject in the calling process] to the dataInput shape of the called process, but in the XML the connection is actually to the dataInput of the callActivity, not the dataInput of the process.
- And p218: “The DataInputs and DataOutputs of the Call Activity are mapped to the corresponding elements in the CallableElement [i.e., the process] without any explicit DataAssociation.” In other words, the dataInputs of the callActivity and the dataInputs of the called process are identical, by definition.
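To make the p213 and p218 quotes concrete, here is a minimal sketch in Python that parses a small BPMN 2.0 XML fragment and checks where the data association actually points. The fragment and all ids (caller, callee, order, call_in, proc_in) are made up for illustration; it is one plausible serialization, not taken from the spec examples or from any tool.

```python
import xml.etree.ElementTree as ET

# BPMN 2.0 model namespace; "b" is just a local prefix for the lookups below.
NS = {"b": "http://www.omg.org/spec/BPMN/20100524/MODEL"}

# Hypothetical fragment: a calling process with a dataObject and a callActivity,
# plus the called process with its own dataInput. All ids are invented.
BPMN_XML = """<definitions xmlns="http://www.omg.org/spec/BPMN/20100524/MODEL"
             targetNamespace="http://example.org/sketch">
  <process id="caller">
    <dataObject id="order"/>
    <callActivity id="call" calledElement="callee">
      <ioSpecification>
        <dataInput id="call_in"/>
        <inputSet><dataInputRefs>call_in</dataInputRefs></inputSet>
        <outputSet/>
      </ioSpecification>
      <dataInputAssociation id="dia1">
        <sourceRef>order</sourceRef>
        <targetRef>call_in</targetRef>
      </dataInputAssociation>
    </callActivity>
  </process>
  <process id="callee">
    <ioSpecification>
      <dataInput id="proc_in"/>
      <inputSet><dataInputRefs>proc_in</dataInputRefs></inputSet>
      <outputSet/>
    </ioSpecification>
  </process>
</definitions>"""

root = ET.fromstring(BPMN_XML)


def data_input_ids(owner):
    """Return the ids of the dataInputs declared in owner's ioSpecification."""
    return {d.get("id") for d in owner.findall("b:ioSpecification/b:dataInput", NS)}


call = root.find(".//b:callActivity", NS)
callee = root.find("b:process[@id='callee']", NS)

# Targets of every dataInputAssociation owned by the callActivity.
targets = {
    a.findtext("b:targetRef", namespaces=NS)
    for a in call.findall("b:dataInputAssociation", NS)
}

print("association targets:", targets)
print("callActivity dataInputs:", data_input_ids(call))
print("called process dataInputs:", data_input_ids(callee))
```

In this sketch the association targets call_in, which belongs to the callActivity, and nothing in the XML references proc_in. The correspondence between call_in and proc_in is implied by the spec text rather than written down as a data association.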
Bottom line: Camunda/Activiti and Oracle don’t do data flow the same way, and models that include data flow probably could not be interchanged between them. This is not a BPMN-I issue, but a BPMN spec issue. Section 10.3 of the spec seems a bit of a mess, and they ought to clean it up. But I’m not sure there is a “they” any more…
I think “they” ran for the hills Bruce 😉
This is the classic problem with all specifications: in the abstract, without a widely acknowledged reference implementation, it’s just too hard to agree on what the requirements really mean. It must be comforting to the business folks to know that technical folks are no better at writing requirements than they are 😉
If I was a betting man, I would bet that the open-source Activiti is probably going to become the “reference platform” (at least for Open Source BPM). If it becomes widely adopted, then maybe, just maybe Oracle and IBM will bend to conform.
What do you think of the following proposal for section “10.3.1 Data Modeling”:
(a) p213 (PDF 243) sub-section “Data Input”:
Add the following bullet points to the list before Figure 10.58:
* Process Data Inputs MAY have outgoing Data Associations.
* If a Data Input is directly contained by the top-level Process, it MAY be the source of Data Associations within the underlying model. This way Activities or Events can read the data that was passed into the Process when it was started, e.g., by a Call Activity inside another Process or a Message Start Event.
(b) p215 (PDF 245) sub-section “Data Output”:
Add the following bullet points to the list before Figure 10.60:
* Process Data Outputs MAY have incoming Data Associations.
* If a Data Output is directly contained by the top-level Process, it MAY be the target of Data Associations within the underlying model. This way Activities or Events can write the data that will be passed out of the Process when it ends, e.g., by a Call Activity inside another Process or a Message End Event.
(c) p224 (PDF 254) sub-section “DataInputAssociation”:
Add Process Data Inputs to the list of examples at the end of the sentence “The source of such a DataAssociation can be every ItemAwareElement accessible in the current scope, e.g., a Data Object, a Property, or an Expression.”
(d) p224 (PDF 254) sub-section “DataOutputAssociation”:
Add Process Data Outputs to the list of examples at the end of the sentence “The target of such a DataAssociation can be every ItemAwareElement accessible in the current scope, e.g., a Data Object, a Property, or an Expression.”
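As a rough illustration of what changes (a) and (c) would mean for a tool, the following Python sketch contrasts two readings of "which elements may be the source of a DataInputAssociation". It is a toy model, not the BPMN metamodel: the Element class, the allow_process_input flag, and the ids are all hypothetical, and the "strict" reading is only a paraphrase of the intent described in the post above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    """Toy stand-in for an item-aware element (not the real metamodel class)."""
    id: str
    kind: str   # "dataInput", "dataOutput", "dataObject", "property" or "dataStore"
    owner: str  # "process" or "activity"


# Stored elements that both readings accept as association sources.
STORED_KINDS = {"dataObject", "property", "dataStore"}


def allowed_source(src: Element, allow_process_input: bool) -> bool:
    """Decide whether src may be the source of an activity's DataInputAssociation.

    allow_process_input=False models the strict reading (stored elements only);
    allow_process_input=True models the proposed wording, which also admits a
    dataInput owned directly by the process.
    """
    if src.kind in STORED_KINDS:
        return True
    return allow_process_input and src.kind == "dataInput" and src.owner == "process"


process_in = Element("proc_in", "dataInput", "process")
order = Element("order", "dataObject", "process")

for label, flag in (("strict reading", False), ("proposed wording", True)):
    print(label, "->",
          "dataObject:", allowed_source(order, flag),
          "| process dataInput:", allowed_source(process_in, flag))
```

Under these assumptions, a model that wires a process dataInput straight to a task would pass a checker built on the proposed wording and fail one built on the strict reading, which is the interchange concern in a nutshell.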
Writing this proposal reminded me of a model, which I created during the last face-to-face meeting of the BPMN 2.0 Finalization Task Force as a visualization of what we discussed there. Maybe a more complete version of this should be added as an example to the spec:
http://oryx-project.org/oryx/editor;bpmn2.0#/model/7637
http://oryx-project.org/backend/poem/model/7637/png
Falko,
I like it from the perspective of clarity in the spec. I have concerns from the perspective of interoperability between tools, as Oracle says (I think) that they could not handle data association from process dataInput to activity dataInput. I don’t mean to be taking their side in this, but with so few BPMN 2.0 engines in existence or in development, it would be best if the key players could agree. So before putting this forward (would that be BPMN 2.1?), I’d like Oracle, IBM, SAP, BonitaSoft, etc to comment… assuming they are not on the 2.1 committee.
–Bruce
Bruce,
Interesting discussion! I tried to add my view as a reply here, but I couldn’t find a way to include a picture here. So I’ve written my answer as a blog entry myself. I’ve included it here as well, but you’ll have to follow the link to find the screenshot I’m referring to in the last paragraph.
It seems to me, from reading the specification, that data input and output associations are meant to be “local”, by which I mean they are not intended to be referenced from outside the element in which they are defined. On top of that, it also seems to me that they are intended to be “immediate” or “instantaneous” (not sure this is the best term but I couldn’t come up with a better one at this point), meaning they are “executed” when they are reached, but they don’t exist anymore after that.
Therefore, I would agree with your statement that process data inputs should not be the source of a data input association, as the process data input association only exists when the process is invoked and at that point, that data input association is executed. [I also believe that using a process data input when the process is actually started by a timer start event is confusing, as it’s unclear where this information would come from, but that’s another issue.]
So how would you use your process data inputs in the remainder of your bpmn2 process then? I think the ioSpecification of the process, that contains these process data inputs, should also contain data output associations that map these data inputs to more persistent (as opposed to immediate or instantaneous) item-aware elements like a property or a data object. These can then serve as the source of other data input associations. So you don’t directly use a process data input as the source of a data input association of (for example) a task, but you rather first map the process data input to a property using a data output association (as part of the process ioSpecification) and then use this property as the source of the data input association of the same task.
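The two-step mapping described above can be sketched as a small Python simulation. This is only an illustration of the idea, assuming that the process data input is consumed once at start and that tasks afterwards read from properties; the function names and ids (start_process, task_inputs, proc_in, prop_order, task_in) are invented and do not correspond to any engine's API.

```python
def start_process(payload, start_mappings):
    """Copy each process data input into a property when the process starts.

    payload:        values passed in by the caller, keyed by process dataInput id
    start_mappings: process dataInput id -> property id
    Only the properties are returned; the payload itself is not retained,
    which models the "instantaneous" nature of the process data input.
    """
    return {prop: payload[data_input] for data_input, prop in start_mappings.items()}


def task_inputs(properties, input_mappings):
    """Resolve a task's data inputs from properties (property id -> task dataInput id)."""
    return {task_in: properties[prop] for prop, task_in in input_mappings.items()}


# Step 1: process dataInput -> property, executed once at process start.
properties = start_process({"proc_in": "order-42"}, {"proc_in": "prop_order"})

# Step 2: property -> task dataInput, executed whenever the task is reached.
inputs = task_inputs(properties, {"prop_order": "task_in"})

print(properties)
print(inputs)
```

The task never refers to proc_in directly. It only sees prop_order, so the same task definition works regardless of how the process was started, provided something populated the property.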
A screenshot of this in a real process is shown below [follow the link to the original blog in this case to see the image] (note that I deliberately chose a slightly different example to avoid using a start timer event in combination with a process data input as this makes it only more confusing).
Kris