Transforming Text to Diagrams. Part 2: Automated Generation of System Architecture Diagrams


Alex Masycheff, Intuillion

Suppose you are part of a documentation team that documents a complex product. The product consists of multiple components, and each component is assigned to a certain technical writer or a group of writers.

Each writer knows everything about the components assigned to her and has general knowledge of the other components. Someone (perhaps a team lead) needs to put together the individual pieces of information written by different writers and describe how the system works as a whole. At this point, the team lead needs to write an overview that, at the very least, includes a system diagram and a brief description of the components depicted in the diagram.

As long as the diagram is created just once and never needs updating, it’s not a problem. The problem arises when the diagram needs to be updated after a component is changed, a new component is introduced, or an existing component becomes obsolete. It also arises when there are multiple flavors of the product and each flavor includes a slightly different set of components, or the components interact differently in each flavor.

In this case, re-drawing the diagram or creating multiple variations of the diagram might become a nightmare.

Generating Content Automatically

We’ve built an application for our content automation platform to address this challenge. The application searches the content repository for the pieces of information required for a specific product and then automatically generates three things: a system diagram that shows how the major components interact with each other, a table with a brief description of each component, and a topic that includes both the diagram and the table.

When something changes in the components, you just need to re-run the generation process.

To help you get a general idea of how it works, let’s say you need to document a system that automatically recognizes sock pairs after the laundry is done. The system consists of several components:

  • Image Capturing, which takes a photo of every single sock and saves the photo to the Image Database.
  • Image Database, which stores photos of all the socks to be sorted out.
  • Image Matching, which searches for photos stored in the Image Database and finds two photos in which the image patterns match.
  • Automatic Hand, which receives instructions from the Image Matching component. If the pair is found, the Automatic Hand picks up two paired socks and puts them aside. If the pair is not found, the Automatic Hand puts unpaired socks in a separate pile.
  • Burner, which burns up the pile of unpaired socks.

For the sake of simplicity, let’s assume that each component is assigned to an individual writer. The writer documents the component as usual and is expected to do just two additional things:

  • In the main topic about the component (for example, in the one that provides a general description of the component), the writer adds a short description of the component at the beginning of the topic. For example, in DITA, the <shortdesc> element can be used for this purpose.
  • The writer fills out a very simple table to specify from which component the data is received and to which component the data is sent. For example, Image Database receives data from the Image Capturing component and provides data to the Image Matching component. The writer just needs to select a component from a pre-defined list, so there is no need to type anything manually (see Figure 1).


Figure 1: Image Database component I/O definition
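The input/output table is, in effect, a small data structure. The following Python sketch shows one hypothetical way to model it and to derive the diagram's arrows from it, following the component descriptions above. The names ComponentIO and derive_edges are illustrative only and are not part of the application described in this article. The link from Automatic Hand to Burner is an assumption made for the example, because the article does not say how the Burner receives its data.

```python
from dataclasses import dataclass, field


@dataclass
class ComponentIO:
    """One writer's I/O table: where data comes from and where it goes."""
    name: str
    receives_from: list = field(default_factory=list)
    sends_to: list = field(default_factory=list)


def derive_edges(definitions):
    """Return a sorted list of (source, target) pairs implied by all tables.

    A set is used so that a link declared twice (as an output of one
    component and as an input of another) produces a single arrow.
    """
    edges = set()
    for definition in definitions:
        for source in definition.receives_from:
            edges.add((source, definition.name))
        for target in definition.sends_to:
            edges.add((definition.name, target))
    return sorted(edges)


definitions = [
    ComponentIO("Image Capturing", sends_to=["Image Database"]),
    ComponentIO("Image Database",
                receives_from=["Image Capturing"],
                sends_to=["Image Matching"]),
    ComponentIO("Image Matching",
                receives_from=["Image Database"],
                sends_to=["Automatic Hand"]),
    # Assumption for illustration: the Automatic Hand notifies the Burner.
    ComponentIO("Automatic Hand",
                receives_from=["Image Matching"],
                sends_to=["Burner"]),
    ComponentIO("Burner", receives_from=["Automatic Hand"]),
]

edges = derive_edges(definitions)
for source, target in edges:
    print(f"{source} -> {target}")
```

Because every writer describes only her own component, most links are declared from both ends. Collecting them in a set keeps the diagram free of duplicate arrows.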

If a component interacts with other components differently in different product versions or flavors, several possible inputs or outputs can be added and profiled. Then inputs and outputs that are irrelevant for a particular product will just be filtered out, and the diagram will be generated accordingly.
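Profiling can be pictured as a filter applied to the links before the diagram is drawn. The sketch below is a minimal illustration in Python, not the application's actual mechanism. The Link type, the links_for_product function, and the flavor names "basic" and "deluxe" are all hypothetical.

```python
from typing import NamedTuple, Optional


class Link(NamedTuple):
    """A directed data flow between two components.

    products is the set of product flavors the link applies to;
    None means the link is not profiled and applies to every flavor.
    """
    source: str
    target: str
    products: Optional[frozenset] = None


def links_for_product(links, product):
    """Keep only the links that are relevant for the given product flavor."""
    return [
        (link.source, link.target)
        for link in links
        if link.products is None or product in link.products
    ]


links = [
    Link("Image Capturing", "Image Database"),
    Link("Image Database", "Image Matching"),
    Link("Image Matching", "Automatic Hand"),
    # Hypothetical: only the "deluxe" flavor ships with a Burner.
    Link("Automatic Hand", "Burner", frozenset({"deluxe"})),
]

basic = links_for_product(links, "basic")
deluxe = links_for_product(links, "deluxe")
print(len(basic), len(deluxe))
```

With this approach, one set of I/O definitions serves every flavor, and each generated diagram contains only the arrows that survive the filter.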

The application finds the topics required for a specific product based on metadata, retrieves the text from <shortdesc> (or any other element used for the short description), generates the system architecture diagram based on the input/output tables, and puts the diagram followed by the table with short component descriptions into an overview topic (see Figure 2).
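The article does not say how the application is implemented, so the following Python sketch only illustrates the general idea under stated assumptions. It reads the short description from a DITA-like topic with the standard library's xml.etree.ElementTree and writes the diagram as Graphviz DOT text, which is one common way to describe a diagram in plain text. The function names and the sample topics are hypothetical.

```python
import xml.etree.ElementTree as ET


def short_description(topic_xml):
    """Return the text of the first <shortdesc> element, or "" if absent."""
    root = ET.fromstring(topic_xml)
    node = root.find(".//shortdesc")
    if node is None:
        return ""
    # Join text from nested inline elements and normalize whitespace.
    return " ".join("".join(node.itertext()).split())


def to_dot(edges):
    """Render (source, target) pairs as a Graphviz DOT digraph."""
    lines = ["digraph architecture {", "  rankdir=LR;", "  node [shape=box];"]
    for source, target in edges:
        lines.append(f'  "{source}" -> "{target}";')
    lines.append("}")
    return "\n".join(lines)


def description_rows(topics):
    """Build the rows of the component table: (component, description)."""
    return [(name, short_description(xml)) for name, xml in topics.items()]


# Hypothetical sample topics, reduced to the elements used here.
topics = {
    "Image Capturing": (
        "<concept id='image_capturing'><title>Image Capturing</title>"
        "<shortdesc>Takes a photo of every single sock.</shortdesc></concept>"
    ),
    "Image Database": (
        "<concept id='image_database'><title>Image Database</title>"
        "<shortdesc>Stores photos of all the socks to be\n"
        "   sorted out.</shortdesc></concept>"
    ),
}
sample_edges = [("Image Capturing", "Image Database")]

rows = description_rows(topics)
dot_text = to_dot(sample_edges)
print(dot_text)
for name, description in rows:
    print(f"{name}: {description}")
```

The DOT text and the table rows are the two ingredients of the overview topic. A real pipeline would still need to render the diagram to an image and wrap both in a topic template.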


Figure 2: Automatically generated topic

See http://intuillion.com/samples/architecture2diagram/system_architecture_overview.html for an HTML output of the automatically assembled topic (the text at the beginning of the topic and between the image and the table comes from the template).

This example shows what can be done. Everything is configurable. For example, the image can be generated as SVG, it can be clickable (that is, clicking a component takes you to the topic that describes this component), the look and feel of the diagram blocks and arrows can be changed, the components can be re-arranged, and so on.
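As one possible illustration of these options, the Python sketch below emits Graphviz DOT text in which each component carries a URL attribute and the node style is passed in as a parameter. Graphviz turns the URL attribute into a hyperlink when the output format is SVG. Whether the application uses Graphviz is not stated in this article, and the function name, topic file names, and style values are assumptions.

```python
def to_clickable_dot(edges, topic_urls, node_style=None):
    """Render edges as DOT with per-node links and configurable styling.

    topic_urls maps a component name to the topic that describes it.
    node_style overrides or extends the default node attributes.
    """
    style = {"shape": "box"}
    style.update(node_style or {})
    attrs = ", ".join(f'{key}="{value}"' for key, value in sorted(style.items()))
    lines = ["digraph architecture {", f"  node [{attrs}];"]

    # Declare every node once so that its link is attached to it.
    names = sorted({name for edge in edges for name in edge})
    for name in names:
        url = topic_urls.get(name)
        if url:
            lines.append(f'  "{name}" [URL="{url}"];')
        else:
            lines.append(f'  "{name}";')

    for source, target in edges:
        lines.append(f'  "{source}" -> "{target}";')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical topic file names.
topic_urls = {
    "Image Capturing": "image_capturing.html",
    "Image Database": "image_database.html",
}
clickable = to_clickable_dot(
    [("Image Capturing", "Image Database")],
    topic_urls,
    node_style={"style": "filled", "fillcolor": "lightyellow"},
)
print(clickable)
```

If the printed text is saved as architecture.dot, running dot -Tsvg architecture.dot -o architecture.svg with Graphviz installed produces an SVG in which each block links to its topic.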

The following two videos showcase the ideas presented in this article:
https://youtu.be/l2drJxXWdU8 – generating troubleshooting flowcharts from textual troubleshooting topics
https://youtu.be/Z_MHx8M_39A – generating a system architecture diagram from topics and aggregating short descriptions of system components