2009-04-01

On UML

Over the last few weeks I've been working on kin and hanging around Stack Overflow. For one question on the usefulness of UML I ended up writing a rather long answer, so as I haven't blogged in a while I thought I'd post it here too.


There's a difference between modelling and models.

Initially in the design process, the value in producing a model is that you have to get to a representation of the system concrete enough that it can be written down. The actual models can be, and probably should be, temporary artefacts such as whiteboards, paper sketches or post-it notes and string.

In businesses where there is a requirement to record the design process for auditing, these artefacts need to be captured. You can encode these sketchy models in a UML tool, but you rarely get much value from that over just scanning the sketches. Here we see UML tools used as fussy documentation repositories, and they add little for that use.

I've also seen UML tools used to convert freehand sketches to graphics for presentations. This is rarely a good idea, for two reasons:

1. most model-based UML tools don't produce high-quality diagrams. They often don't anti-alias correctly, and have appalling 'autorouting' implementations.
2. understandable presentations don't have complicated diagrams; they show an abstraction. The abstraction mechanism in UML is packages, but every UML tool also has an option to hide the internals of classes. Getting into the habit of presenting UML models with the details missing hides complexity rather than managing it. It means that a simple diagram of four classes with 400 members gets through code review, while one based on a better division of responsibilities looks more complicated.

During the elaboration of large systems (more than a handful of developers), it's common to break the system into sub-systems and map these sub-systems to packages (functionally) and components (structurally). These mappings are again fairly broad-brush, but they are more formal than the initial sketch. You can put them into a tool, and then you have a structure in the tool which you can later populate. A good tool will also warn you of circular dependencies (there's a sketch of that check below), and if you have recorded mappings from use cases to requirements to the packages to which the requirements are assigned, you also have useful dependency graphs and can generate Gantt charts showing what you need for a feature and when you can expect that feature to ship. (AFAIK the state of the art is dependency modelling with time attributes added; I haven't seen anything which goes as far as Gantt.)
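As a rough sketch of that circular-dependency check - the package names and edges here are invented for illustration - a depth-first search over the package dependency graph is enough:

    # Sketch of the circular-dependency check a modelling tool might run
    # over package dependencies. Package names and edges are invented.
    deps = {
        'ui': ['domain'],
        'domain': ['persistence'],
        'persistence': ['domain'],  # domain <-> persistence is a cycle
    }

    def find_cycle(deps):
        """Depth-first search; returns the first dependency cycle found, or None."""
        visiting, done = set(), set()

        def visit(node, path):
            if node in visiting:
                # node is already on the current path, so we've closed a loop
                return path[path.index(node):] + [node]
            if node in done:
                return None
            visiting.add(node)
            for dep in deps.get(node, []):
                cycle = visit(dep, path + [node])
                if cycle:
                    return cycle
            visiting.discard(node)
            done.add(node)
            return None

        for node in deps:
            cycle = visit(node, [])
            if cycle:
                return cycle
        return None

    print(find_cycle(deps))  # ['domain', 'persistence', 'domain']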

So if you are in a project which has to record requirements capture and assignment, you can do that in a UML tool, and you may get some extra benefit on top: being able to check the dependencies and extract the information to plan work breakdown schedules.

Most of that doesn't help in small, agile shops which don't care about CMMI or ISO-9001 compliance.

(There are also some COTS tools which provide executable UML and BPML models. These claim to provide a rapid means to de-risk a design. I haven't used them myself so won't go into details.)

At the design stage, you can model software down to the level of classes and methods, and capture the procedural aspects of methods with sequence diagrams, state models and action languages. I've tended not to, and prefer to think in code rather than in the model at that stage. That's partly because the code generators in the tools I've used have been either poor or too inflexible to create high-quality implementations.

OTOH I have written simulation frameworks which take SysML models of components and systems and simulate their behaviour using those techniques. In that case there is a gain: such a model of a system doesn't assume an execution model, whereas the code generation tools assume a fixed one.

For a model to be useful, I've found it important to be able to decouple the domain model from execution semantics. You can't represent the relation f = m * a in action semantics; you can only represent the evaluation followed by the assignment, f := m * a, so to get a general-purpose model that has three bidirectional ports f, m and a you'd have to write three actions: f := m * a, m := f / a, a := f / m. So in a model where a single constraint of a 7-ary relation will suffice, if your tool requires you to express it in action semantics you have to rewrite the relation 7 times. I haven't seen a COTS UML tool which can process constraint network models well enough to give a sevenfold gain over coding it yourself, but that sort of reuse can be had with a bespoke engine processing a standard UML model. If you have a rapidly changing domain model and build your own interpreter/compiler against the meta-model for that domain, you can have a big win. I believe some BPML tools work in a similar way, but I haven't used them, as that isn't a domain I've worked in.
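To make that concrete, here's a toy sketch (mine, not any tool's output) of the bidirectional constraint: a single definition of the product relation which solves for whichever port is unknown, where action semantics would force the three separate directed assignments:

    # Toy bidirectional product constraint f = m * a, written once and
    # solved for whichever port is unknown (None). Names are illustrative.
    def product_constraint(f, m, a):
        """Given any two of f, m, a, return (f, m, a) with the third solved."""
        if f is None:
            return (m * a, m, a)
        if m is None:
            return (f, f / a, a)
        if a is None:
            return (f, m, f / m)
        assert f == m * a, "over-constrained: f != m * a"
        return (f, m, a)

    print(product_constraint(None, 2.0, 9.81))    # f from m and a: (19.62, 2.0, 9.81)
    print(product_constraint(19.62, None, 9.81))  # m from f and a: roughly 2.0

A constraint engine proper would propagate through a network of such relations until it reaches a fixed point, but the point here is just that the relation is written once rather than three times.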

Where the model is decoupled from the execution language, this process is called model driven development, and Matlab is the most common example; if you're generating software from a model which matches the execution semantics of the target language, it's called model driven architecture. In MDA you have both a domain and an implementation model; in MDD you have a domain model and a specialised transformation to map the domain to multiple executable implementations. I'm an MDD fan, and MDA seems to offer little gain - you're restricting yourself to whatever subset of the implementation language your tool supports and your model can represent, you can't tune to your environment, and graphical models are often much harder to understand than linear ones. We've had a million years of evolution constructing complex relationships between individuals from linear narratives (who was Pooh's youngest friend's mother?), whereas constructing an execution flow from several disjoint graphs is something we've only had to do in the last century or so.

I've also created domain-specific profiles of UML and used it as a component description language. It's very good for that, and by processing the model you can create custom configuration and installation scripts for a complicated system. That's most useful where you have a system or systems comprising stock components with some parametrisation.
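As a minimal sketch of that kind of model processing - assuming the component model has already been exported to plain records, and with the deployment commands invented for illustration - generating an installation script is little more than a walk over the parameterised components:

    # Sketch of generating an installation script from a component model.
    # The component records, 'deploy' and 'wait_for_service' are invented.
    components = [
        {'name': 'broker', 'host': 'node1', 'port': 5672, 'depends': []},
        {'name': 'ingest', 'host': 'node2', 'port': 8080, 'depends': ['broker']},
    ]

    def install_script(components):
        lines = ['#!/bin/sh', 'set -e']
        for c in components:
            for dep in c['depends']:
                # wait for each dependency before deploying the component
                lines.append(f"wait_for_service {dep}")
            lines.append(f"deploy --name {c['name']} --host {c['host']} --port {c['port']}")
        return '\n'.join(lines)

    print(install_script(components))

The win is that the topology lives in one model: regenerate the script and nothing gets out of step with the diagram.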

When working in environments which require UML documentation of the implementation of a software product, I tend to reverse engineer the model from the code rather than the other way round.

When there's some compression of information to be had by using a machine-processable detailed model, and the cost of setting up code generation of sufficient quality is amortized across multiple uses of the model or by reuse across multiple models, then I use UML modelling tools. If I can spend a week setting up a tool which stamps out parameterised components like a cookie-cutter in a day, it takes three days to do a component by hand, and I have ten such components in my system, then I'll spend that week tooling up.
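With those numbers the tooling pays for itself by the third component - five days of tooling plus three days of stamping is eight days, against nine by hand - and over all ten components it's fifteen days against thirty.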

Apply the rules 'Once and Once Only' and 'You Aren't Gonna Need It' to the tools as much as to the rest of your programming.

So the short answer is yes, I've found modelling useful, but models less so. Unless you're creating families of similar systems, you don't gain sufficient benefit from detailed models to amortize the cost of creating them.


1 Comment:

Michael Duffy said...

Pete, I just found your blog today by clicking on the link you have in your SO bio. It's always been good for me to read what you have to say on the Java Forum and Stack Overflow, but your blog takes it to another level. Really terrific stuff, as always.

6:05 PM  
