DMN and XML: Lists of Lists of Lists... Oh My!

When we create a decision model, we imagine we can design the input data in a way that is most convenient for our logic. But that is not always the case. It could be that the input to our model is required to be XML in some industry-defined format, and we need to convert that to FEEL. Trisotech Decision Modeler is great at that. Importing an XML schema automatically generates equivalent FEEL item definitions, and assigning a DMN input data element to the top-level item definition allows the input data to be submitted as XML and converted automatically to FEEL. No programming is required! That's amazing.

But some of these industry-defined schemas present a challenge nevertheless, particularly those organized as deeply nested structures containing multiple repeating container elements. That is the case, for example, with the schemas defined by MISMO, the data standards organization of the Mortgage Bankers Association. Designed to act as a sort of enterprise logical data dictionary for anything related to mortgage processing, they require some care when used with DMN.

Three years ago, I posted about Fannie Mae's white paper "Without Data Standards, the Mortgage Industry Doesn’t Go Digital", explaining the role of MISMO data in spurring modernization of the historically slow-moving mortgage industry and the potential of DMN in that effort. It's taken a while, but DMN implementations are now starting to happen. My role in that, while limited so far, is already teaching me some new tricks. This post will explain.

In uses ranging from mortgage loan applications to appraisal reports, organizations like Fannie Mae require the data to be delivered as XML in MISMO schema format. As mentioned earlier, this schema is not organized as relational tables linked via keys, but as tables nested inside tables nested inside other tables. Love it or hate it, that's how the data is organized.

In a follow-up post to the previously mentioned one, I showed how to simplify MISMO data for DMN processing by first putting it through a decision service using XSLT to create a more FEEL-friendly data structure. I worried, however, that injecting a foreign language like XSLT into the mix might negate the whole DMN value proposition: if you're writing XSLT anyway, why not do the whole thing in XSLT? After all, the key value of DMN is logic transparency, particularly to subject matter experts who are not programmers.

And that worry has proven correct. My client wants the whole thing to use DMN.

The tricky part is handling the deeply nested tables. That's because FEEL handles lists of lists differently from its XML counterpart, XPath. XPath has sequences, not lists. All sequences are flat: there are no sequences of sequences, and a sequence containing one item is identical to that item. That's not the case with FEEL. FEEL has lists of lists, potentially lists of lists of lists of lists...

In XPath, a filter returning a sequence of one item actually returns that item. In FEEL, a filter always returns a list, even if it is known to contain a single item. And, significantly, a list containing one item is not identical to the item.
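
A tiny example with literal values makes the difference concrete:

[1, 2, 3][item > 2] returns [3], a singleton list, not the number 3
[1, 2, 3][item > 2][1] returns 3, the item itself
The equivalent XPath filter, (1, 2, 3)[. gt 2], returns the atomic value 3 directly.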

Let's see how that applies to MISMO data. Let's say we receive an Appraisal Report in MISMO format and we want to know about the rooms in the house. The FEEL path to that data element is something like this:

MESSAGE...SERVICE[VALUATION != null]...PROPERTY[ValuationUseType="SubjectProperty"]
where the ellipses represent chains of non-repeating container elements of no interest to this discussion. Following the root MESSAGE, the path to INTERIOR_ROOM_SUMMARY, itself a repeating container, passes through three repeating containers, i.e. lists: SERVICE, PROPERTY, and IMPROVEMENT. Even after filtering to select a single item, they are still lists.

To deal with lists of lists, FEEL includes the flatten() function. It turns a list of lists into a flat list. I always thought that it turned a list of lists of lists into a flat list as well, but apparently that is not the case: applied to a list nested three levels deep, it returns null. The limit is lists of lists. So, in fact, this does not work:

flatten(MESSAGE...SERVICE[VALUATION != null]...PROPERTY[ValuationUseType="SubjectProperty"])
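In miniature, with literal lists, the behavior looks like this (the second result is what I observed in Trisotech, not something the FEEL specification advertises):

flatten([[1, 2], [3, 4]]) returns [1, 2, 3, 4]
flatten([[[1, 2], [3]], [[4]]]) returns null, not [1, 2, 3, 4]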
My next attempt was to break up the chain into pieces no more than two lists deep and apply flatten() to each, resulting in an expression with a bunch of nested flatten() functions. That works... but it's nasty, and it defeats the objective of logic transparency. After a couple of days chasing down the flatten() rabbit hole, I came upon a much better solution. It is simple and, in hindsight, obvious.

In real situations, it is extremely rare that you actually want to flatten a list of lists in which both lists contain multiple items. In almost all cases, all but one of the lists are singletons, i.e. contain only one item. But remember, in FEEL a singleton list is not the same as the item. You need to extract the item using [1], a filter with an integer argument. So the correct way to get the table INTERIOR_ROOM_SUMMARY starting from MESSAGE is this:

MESSAGE...SERVICE[VALUATION != null][1]...PROPERTY[ValuationUseType="SubjectProperty"][1]
Here in fact there is only one Valuation Service, only one Subject Property, and only one Dwelling, so simply appending [1] to those filters turns each singleton list into an item, and the whole chained expression returns the expected result. No flatten() is required. In a real decision model, not every decision would start from the input data MESSAGE. More likely you would have supporting decisions like Subject Property or Dwelling Improvement, shortening the FEEL expressions for their details.
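
The pattern in miniature, using a hypothetical two-level structure rather than the real MISMO names: suppose deal.SERVICE is a list of services, each containing a PROPERTY list.

deal.SERVICE[VALUATION != null] returns a list of matching services
deal.SERVICE[VALUATION != null][1] returns the one matching service, an item
deal.SERVICE[VALUATION != null][1].PROPERTY[ValuationUseType = "SubjectProperty"][1] returns the one Subject Property, an item
Because each filter is immediately reduced to an item with [1], the chain never builds up nested lists, and no flatten() is needed.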

One other tidbit I learned from this experience: With XML inputs like MESSAGE, in Execution/Test you want to use the File input feature, pointing to the XML file on your desktop, not the HTML form. When using File input, do not save the result as a Test Case. That maps the XML to HTML, and when you try to Load and run the Test Case it will fail. Instead, you need to resubmit the file.

Now that we know how to extract elements from MISMO XML, using DMN for purposes ranging from data validation to risk analysis is a piece of cake!