Cracking OIL: A Formal Perspective on an Industrial DSL for Modelling Control Software

Olav Bunte

Promotors: dr. Tim A.C. Willemse (TU/e)
Co-promotors: dr. Louis C.M. van Gool (Canon Production Printing)
Eindhoven University of Technology
Date: 5 September 2024
Thesis: PDF

Summary

In the past decades, the amount of software in the world has grown immensely. Not only has the number of software systems grown, but software systems themselves have also grown in size, increasing their complexity. The complexer a software system is, the more it costs to maintain them and the more prone they are for introducing bugs. To combat this complexity, many companies have already adopted model driven software engineering, where the systems are first modelled with higher level modelling languages, often known as Domain-Specific Languages.

In this thesis, we focus on a specific DSL called OIL, developed at Canon Production Printing. OIL is a language originally developed for analysing the behaviour of control software, but it has since been extended to enable the modelling of control-software components, which is our focus. We explain the concepts of OIL informally and give its syntax.

To define exactly the behaviour of a component modelled with OIL, we define the formal operational semantics of OIL in terms of a Labelled Transition System (LTS). With such a formal semantics, we open the way to model checking, which is the act of automatically proving the truth of a property on a system. To use the model checking capabilities of the mCRL2 toolset, we define a translation from OIL to mCRL2, and we prove that mCRL2 specifications that result from this translation are behaviourally equivalent to the original OIL specification. We test the application of model checking by checking some validity requirements on two OIL component specifications that have been used in practice at Canon production printing.

OIL also allows one to define OIL systems, which are systems of asynchronously communicating instances of OIL components. Like for individual components, we define the formal operational semantics for OIL systems and define translations from an OIL system to mCRL2. We discuss a number of properties that are relevant for OIL systems and we test the feasibility of model checking OIL systems by exploring the state space of a configurable example.

Asynchronously communicating systems typically use FIFO channels to store the messages that are still in transit. These channels can be laid out in the system in many ways, which affects the behaviour of the system. In this thesis, we explore exactly how the behaviour of the system changes when the channel layout changes. Also, we show for safety properties and for the properties reachability of state, deadlock freedom and confluence, under what conditions their truth is guaranteed to be preserved when changing the channel layout.

In the first phase of the project, OIL was migrated from its original implementation in Python to the language workbench Spoofax. Language workbenches are tools that are specifically designed for DSL engineering, as many of the papers on language workbenches emphasize. However, most of such papers only show the benefits of a language workbench with a toy example, which leaves the question whether these benefits still apply for an industrial case such as OIL. We evaluate this by comparing both the Python and Spoofax implementations on code volume necessary to define software artifacts such as parsers, type checkers and transformations, as a proxy for productivity. Based on our analysis, we provide some lessons learned and an engineering agenda for Spoofax.