Break the Code? Breaking Changes and Their Impact on Software Evolution

Lina María Ochoa Venegas

Promotors: prof.dr. J.J. Vinju (TU/e) and prof.dr. M.G.J. van den Brand (TU/e)
Co-promotor: dr. T.F. Degueule (Université de Bordeaux, CNRS, LaBRI)
Eindhoven University of Technology
Date: 29 March 2023
Thesis: PDF

Summary

Software seldom lives in isolation. Instead, projects dwell in software ecosystems where they depend on each other to favour reuse. Software projects have a dual role: (i) the library role, when exposing a set of services to other projects; and (ii) the client role, when depending on other libraries to leverage their functionality.

As time passes, library developers introduce changes to include functional and extra-functional enhancements. Although these changes aim at increasing the library's value, they might propagate to client projects and result in broken code. Versioning schemes, such as semantic versioning, are often used to communicate the nature of the introduced changes, namely whether or not they can break client code. Nevertheless, library developers still face a dilemma: whether to introduce changes at the cost of increasing the technical lag of their clients or even losing them, or to avoid change at the cost of increasing technical debt. This thesis states that breaking changes are not harmful by themselves. However, their impact should first be assessed, so that developers can make informed decisions about their introduction and the subsequent coping strategies.
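
To make the role of semantic versioning concrete, the following minimal sketch checks whether a release complies with the scheme: a release that contains breaking changes should increment the major version. It is an illustration only and not part of the thesis tooling. The class `SemVerCheck` and its methods are hypothetical names, versions are assumed to be plain `MAJOR.MINOR.PATCH` strings without pre-release or build suffixes, and the new version is assumed to be later than the old one.

```java
public class SemVerCheck {
    // Kind of version increment between two releases.
    enum Bump { MAJOR, MINOR, PATCH, NONE }

    // Parses a plain "MAJOR.MINOR.PATCH" string into three integers.
    // Assumption: no pre-release or build metadata suffixes.
    static int[] parse(String version) {
        String[] parts = version.split("\\.");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Expected MAJOR.MINOR.PATCH: " + version);
        }
        return new int[] {
            Integer.parseInt(parts[0]),
            Integer.parseInt(parts[1]),
            Integer.parseInt(parts[2])
        };
    }

    // Reports the most significant version component that differs.
    static Bump bump(String oldVersion, String newVersion) {
        int[] o = parse(oldVersion);
        int[] n = parse(newVersion);
        if (n[0] != o[0]) return Bump.MAJOR;
        if (n[1] != o[1]) return Bump.MINOR;
        if (n[2] != o[2]) return Bump.PATCH;
        return Bump.NONE;
    }

    // A release complies if it has no breaking changes or if it bumps the
    // major version. Releases from a 0.y.z version are always accepted, since
    // semantic versioning treats major version zero as initial development.
    static boolean complies(String oldVersion, String newVersion, boolean hasBreakingChanges) {
        if (parse(oldVersion)[0] == 0) {
            return true;
        }
        return !hasBreakingChanges || bump(oldVersion, newVersion) == Bump.MAJOR;
    }

    public static void main(String[] args) {
        // A minor release that breaks clients violates the scheme.
        System.out.println(complies("1.4.2", "1.5.0", true));
        // The same changes shipped in a major release comply.
        System.out.println(complies("1.4.2", "2.0.0", true));
    }
}
```

Whether a release actually contains breaking changes is the hard part, which is why the flag is passed in rather than computed here.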

In this thesis, we address the library-client co-evolution problem from the nounal and the verbal views. On the one hand, the nounal view allows us to empirically understand the nature of the library-client co-evolution phenomenon. In particular, we study (i) best practices to define dependencies as a way of preventing the propagation of breaking changes; and (ii) syntactic breaking changes and their impact on client projects in relation to semantic versioning.

On the other hand, the verbal view encourages us to provide new processes, methods, and tools that can better support the library-client co-evolution process. Concretely, we introduce (i) the static impact analysis approach to detect the introduction of breaking changes and their impact on client code; and (ii) the static reverse dependency compatibility testing approach to perform static impact analysis as part of a pull-based development workflow. The former is implemented in Maracas, a static analysis tool for Java projects, and the latter is implemented in BreakBot, a GitHub bot that assists library evolution.
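
The idea behind static impact analysis can be illustrated with a deliberately simplified sketch: compare the declarations exposed by two library versions, then check which client locations use a declaration that disappeared. This is not Maracas's API or algorithm. The class `ImpactSketch`, its methods, and the string-based representation of declarations and client locations are all assumptions made for illustration, and only one kind of syntactic breaking change (removal) is covered.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

public class ImpactSketch {
    // Declarations exposed by the old library version but absent from the new one.
    // Assumption: each declaration is identified by a signature string.
    static Set<String> removedDeclarations(Set<String> oldApi, Set<String> newApi) {
        Set<String> removed = new TreeSet<>(oldApi);
        removed.removeAll(newApi);
        return removed;
    }

    // Client locations (key) whose used declaration (value) was removed.
    static Map<String, String> brokenUses(Set<String> removed, Map<String, String> clientUses) {
        Map<String, String> broken = new TreeMap<>();
        for (Map.Entry<String, String> use : clientUses.entrySet()) {
            if (removed.contains(use.getValue())) {
                broken.put(use.getKey(), use.getValue());
            }
        }
        return broken;
    }

    public static void main(String[] args) {
        // Hypothetical API surfaces of two versions of a library.
        Set<String> oldApi = Set.of("lib.Parser.parse(String)", "lib.Parser.reset()");
        Set<String> newApi = Set.of("lib.Parser.reset()", "lib.Parser.parse(Reader)");

        // Hypothetical client code locations and the declarations they use.
        Map<String, String> clientUses = Map.of(
            "Client.java:12", "lib.Parser.parse(String)",
            "Client.java:30", "lib.Parser.reset()");

        Set<String> removed = removedDeclarations(oldApi, newApi);
        System.out.println(brokenUses(removed, clientUses));
    }
}
```

The sketch separates the two questions the approach answers: which changes are breaking, and which clients are actually affected by them. A breaking change that no client uses has no impact.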

As the main conclusions of the thesis, we find that: (i) practitioners do not widely follow best practices when defining dependencies; (ii) libraries tend to comply with semantic versioning when introducing syntactic breaking changes, adherence to this scheme has increased over time, and only a few clients are impacted by these changes; and (iii) tooling to support software evolution is accurate, applicable, and relevant for pull-based development workflows.