Fall Days on System and Software Analysis
Villa Vennendal, Nunspeet, (see Google maps)
November 6-10, 2017
The IPA Fall Days are an annual multi-day event, dedicated to a specific theme of current interest to the research community of IPA. This year’s Fall Days are dedicated to System and Software Analysis, fitting in the Cyber Physical Systems, Software Analysis and Model-Driven Software Engineering focus areas of the research school for the period 2013-2018.
The programme of the Fall Days is composed with the help of Jurriaan Hage (UU/e), Alfons Laarman (UL) and Erik de Vink (TU/e and CWI).
Some slides of the Fall Days are available online. Click here for the repository of these slides.
Monday 6 November is reserved for the IPA PhD workshop. The remainder of the days are devoted to various themes and topics in System and Software Analysis; details on the programme can be found below. Our keynote speaker for the PhD Workshop is Sung-Shik Jongmans, winner of the IPA Dissertation Award of 2016.
The PhD workshop aims to promote the interaction and cohesion with PhD students from other universities on both a social and technical level, provide a hospitable setting in which presentation skills can be demonstrated and improved through constructive feedback by peers, and be fun. This workshop is open to all IPA PhD students alike. If you are interested in giving a 20-25 minute presentation, send an email to firstname.lastname@example.org.
Registration closes on 27 October. Follow this link to register. As always, IPA PhD students can attend for free. For others, details on pricing will be made available upon request.
To make maximal use of the available capacity, we process applications on the following basis: Registrations are treated “first come, first served”. In principle, all PhD students (IPA and non-IPA) will share a room. Others may also be asked to share if we run out of rooms. Until the registration deadline, we will accept participants until we run out of space at the hotel!
MONDAY 6 NOVEMBER: PHD WORKSHOP
|13:30-13:35||Opening Fall Days and PhD Workshop|
|13:35-14:05||Sangeeth Kochanthara, TU/e
Monitor Synthesis for Automotive systems
The automotive world has transformed from one that combined moving mechanical parts to one that develops systems that are controlled using multiple small computers, namely electronic control units (ECUs). A typical high-end car today contains sixty to a hundred ECUs. Software in such a car consists of more than hundred million lines of code (LOC); in contrast, software in an airplane like a Boeing 787 Dreamliner contains only six and a half million LOC.
The vast amount of software causes safety in an automotive setting no longer exclusively depend on structural designs, materials, and mechanical parts, but increasingly on the software inside.
Considering the scale, verification and validation of such automotive software is challenging. The complexity and million LOC scale make classical solutions like testing, model checking, and static program analysis infeasible. Also, such methods suffer from difficulty in capturing crucial information such as timing that is available only at runtime. Runtime monitoring is a natural candidate that can complement the classical validation and verification solutions by capturing information available at runtime and can act as a safety net if something goes wrong. The large scale and rapid changes in automotive systems makes manual specification of runtime monitors, tedious, time consuming, costly, error-prone, and no-more an option, hence automated synthesis of runtime monitors is needed. The talk will explore synthesis, uses and challenges of runtime monitors for automotive systems.
|14:05-14:35||Petra van den Bos, RU
The Holy Grail of Model-Based Testing: a Complete Test Suite
Model-based testing is a technique for automatically deriving test cases from a model that describes the behaviour of a system. A big challenge in applying this technique is test selection: deriving as few as possible test cases that uncover as many bugs as possible in the system. In my presentation, I will explain how to find a complete test suite, i.e. a set of test cases that can detect any fault, for systems smaller than a predetermined size.
|15:00-15:30||Fei Yang, TU/e
An Integration of Computability and Concurrency
In this talk, I will introduce the theory of executability, which is an integration of computability and concurrency. In this theory, a notion of reactive Turing machine (RTM) is introduced as an extension of the Turing machine. We shall use the transition systems associated with RTMs as a universal model of executability. Two classes of problems are investigated in this theory, namely, the comparison between the concepts in computability and concurrency; and the evaluation of expressiveness of a process calculi with respect to executable transition systems. I shall discuss some challenges and results in this topic.
|15:30-16:00||Omar Alzuhaibi, TU/e
Automata Learning in an Engineering Context
Implementing the theory of automata learning on practical systems comes with many challenges, more so when learning legacy systems. In this talk, I will present some of the challenges I faced while learning a legacy component in an industrial setting, how I managed to overcome some of those challenges, and how I plan to solve the rest. Such challenges range from initially setting up the learning environment to dealing with asynchronous calls and parallelism to scalability and efficiency.
|16:30-17:30||Sung-Shik Jongmans, OU
Automata-Theoretic Protocol Programming
In the early 2000s, hardware manufacturers shifted their attention from manufacturing faster (yet purely sequential) unicore processors to manufacturing slower (yet increasingly parallel) multicore processors. In the wake of this shift, parallel programming became essential for writing scalable programs on general hardware. Conceptually, every parallel program consists of workers, which implement primary units of sequential computation, and protocols, which implement the rules of interaction that workers must abide by. As programmers have been writing sequential code for decades, programming workers poses no new fundamental challenges. What is new, and notoriously difficult, is programming of protocols. In this talk, I present a domain-specific language for protocols based on a theory of formal automata.
TUESDAY 7 NOVEMBER
|9:00-9:45||Bart Jacobs, KU Leuven
Separation logic-based symbolic execution for verification and analysis of pointer-manipulating programs
I introduce separation logic, a successful extension of Hoare logic for reasoning about single-threaded and multithreaded programs involving pointers and other forms of aliasing.
I explain how separation logic has been used successfully for effective symbolic execution of programs and program modules, both for modular formal verification and for whole-program analysis.
|9:45-10:30||Anton Wijs, TU/e
Verified Model-Driven Development of Multi-threaded Java Code
|11:00-12:00||Marieke Huisman, UT
Verification of Deterministic Parallel Programs
The use of separation logic has greatly advanced the verification of concurrent software. Much of this work focuses on verification of heterogeneous threads, as used e.g. in Java. In this presentation, I show how separation logic also can be used to reason about programs using homogeneous threading, i.e., where all threads execute the same instructions. Concretely, I discuss how separation logic is used to verify OpenCL kernels (for GPU architectures), as well as OpenMP compiler directives for parallelisation. All verifications are supported in the VerCors tool set for verification of concurrent software.
|14:00-14:45||Frits Vaandrager, RU
Learning Mealy Machines with Timers
(joint work with Bengt Jonsson)Active automata learning is emerging as a highly effective bug finding technique, with applications in areas such as banking cards, network protocols and legacy software. Timing often plays a crucial role in these applications, but cannot be handled adequately by existing algorithms. Even though there has been significant progress on algorithms for active learning of timed models, these approaches are not yet practical due to limited expressivity and/or high complexity.In order to address this problem, we introduce a simple model of Mealy machines with timers that is able to model the timing behavior of a large class of practical systems, and present a new and efficient algorithm for active learning of this type of automata.
|14:45-15:30||Wil Michiels, NXP and TU/e
More and more functionality in electronic devices is being implemented in software instead of hardware. Software has the advantage of being less costly, better scalable, easier to personalize, and easier to update. However, on the downside we also have that it is much more vulnerable to attacks like cloning, reverse-engineering, and code modifications. Despite some negative theoretical results on the feasibility of solving these problems, many techniques have been developed over the years that, for practical use case, often provide sufficient protection. In this talk we give an overview of such techniques.
|16:00-17:00||Roberto Giacobazzi, University of Verona and IMDEA
Securing Code — Hacking the precision of program analysis
The talk concerns the design of code protecting transformations for anti reverse engineering applications. These technologies are widely used in code protection (e.g., IP protection or key protection), malware design, anti tampering, code watermarking and birth-marking of code. The battle scenario involves attackers intended to extract information by reverse engineering the code, and protecting code transformations modeled as distorted compilers devoted to inhibit these attacks. Attacks are inhibited by maximizing imprecision in all attempts made by the attacker to exploit control and data-flow of the obscured code. After a brief survey on the state of the art in the field, we introduce a model for code obfuscation which is general enough to include generic automated static and dynamic attacks. Protecting transformations are then systematically and formally derived as distorted compilers, obtained by specializing a suitably distorted interpreter for the given programming language with respect to the source code to protect. Interestingly this distortion corresponds precisely to defeat the potency of the expected attacker, which is itself an interpreter and whose potency consists in its ability to extract a complete and precise view of program’s execution.
WEDNESDAY 8 NOVEMBER
|09:00-10:00||Frank Takes, UL
Large-Scale Network Visualization using Force-directed Algorithms
Understanding complex systems is often done through network analysis. One step is this process is network visualization, which deals with mapping a graph structure to the 2D plane. As datasets become larger, it becomes more and more common to visualize millions of nodes and links. This results in two problems: hours of computation time to obtain a readable graph layout, and hairball layouts which cannot be understood through manual network inspection. We discuss algorithms and techniques to reduce this complexity and the total running time, allowing large-scale networks to be visualized in a real-time setting. Through a use case from the field of economic network analysis we furthermore demonstrate how community detection algorithms can help to derive meaningful insights from the complex structure of the underlying network.
|10:30-11:15||Jan Martijn van der Werf, UU
Architecture Mining: Fostering Evolution through Software Operation Data Analysis
|11:15-12:00||Bram Cappers, TU/e
Network Traffic Analysis using Deep Packet Inspection and Data Visualization
Many domains nowadays try to gain insight in complex systems by logging their behavior. Telecom companies for instance analyze their communication networks for the presence of fraud, hospitals analyze patient treatments to discover bottlenecks in the process, and companies study their work flows to improve customer satisfaction. The common ground here is that these domains are interested in the analysis of events.In this presentation we will discuss the role of data visualization for the detection and explanation of anomalous patterns in event data. In particular, we will show how these visualization techniques can be used for the protection of critical infrastructures against complex virus attacks (a.k.a. Advanced Persistent Threats). In our project SpySpot, we will demonstrate the effectiveness of the system on real-world traffic including VoIP communication and Ransomware activity in file systems.
|14:00-15:00||Joost Visser, RU and SIG
Building Maintainable Software
Maintainability is a core aspect of software quality and a key driver for successful software projects and products. Though the importance of maintainability is recognised almost universally by software engineers, industrial practice shows that maintainability is still rarely managed in a rigorous way. In this presentation, I will relate some of the practical lessons and more theoretical insights accumulated over a decade of helping a wide range of organisations to manage software quality and maintainability. In particular, I will cover the initial development of a practical model for rating maintainability, the use of this model in assessment and certification of software products, the annual recalibration and evolution of the model, a scalable infrastructure for continuous monitoring of maintainability, common pitfalls when using maintainability measurements for project management, and some of the insights gained from probably the largest industrial software analysis benchmark database worldwide.Joost is CTO at the Software Improvement Group and professor of Large-scale Software Systems at Radboud University Nijmegen.
|15:30-16:15||Alexander Serebrenik, TU/e
Telling Stories about Software Developers
Understanding the ways people communicate and collaborate in software projects is an essential prerequisite to understanding, assessing and improving software. In this talk we touch upon two related topics: gender diversity in software engineering and emotions expressed by engineering teams. To study these topics we have empirically analysed a broad spectrum of software repositories, including issue trackers, mailing lists, StackOverflow questions and answers, GitHub repositories, and slack channels; furthermore, we have conducted surveys and interviews. This talk is based on series of papers we have published in 2014-2017.
|16:15-17:00||Fabio Palomba, TUD
Not Only Maintainability: Revisiting Test Smells as a Measure of Test Code Effectiveness
Test smells are poor design or implementation choices applied by programmers during the development of test cases. The research community studied the problem under different angles, showing that the presence of test smells might worsen the maintainability of source code making developers less able to evolve test code. However, maintainability is only one of the issues caused by test smells: indeed, they might have a non-negligible impact on test code effectiveness, e.g., the ability of tests to find real bugs. This talk will present test smells, their nature, and how they impact test code effectiveness. It is based on series of papers that we have published between 2015 and 2017.
THURSDAY 9 NOVEMBER
|09:00-10:00||Jurgen Vinju, CWI
Analyzing Java and C++ source code using Rascal
This is a short introduction into the Rascal metaprogramming language and its libraries for analyzing Java and C++ source code. This “lab” infra-structure is used in a wide variety of academic and industrial projects, ranging from metric suites to static analyses and large scale empirical (field) studies on open-source projects. In this talk we introduce the language and these two libraries and take your through a small number of examples. We discuss the quality of the resulting analyses in terms of accuracy (false positive and false negative rates).
|10:30-11:15||Jurriaan Hage, UU
Type-Based Static Analysis
Typed languages have a type system that is allows only programs that do not go wrong. The mechanism for specifying such a type system and their implementation and concepts have also inspired the formulation of non-standard type systems (also known as type and effect system), in that aside from the ordinary types like Int and Bool, also other properties of the program are computed like how often is a value used. In the talk I will introduce you to the basics of type systems and type and effect systems by doing a simple control flow analysis for a small functional language.
|11:15-12:00||Markus Klinik, RU
Predicting Resource Consumption of Higher-Order Workflows
We present a type and effect system for the static analysis of programs written in a simplified version of iTasks.
iTasks is a workflow specification language embedded in Clean, a general-purpose functional programming language.
Given costs for basic tasks, our analysis calculates an upper bound of the total cost of a workflow.
The analysis has to deal with the domain-specific features of iTasks, in particular parallel and sequential composition of tasks, as well as the general-purpose features of Clean, in particular let-polymorphism, higher-order functions, recursion and lazy evaluation.
Costs are vectors of natural numbers where every element represents some resource, either consumable or reusable.
|14:00-15:00||John Hughes, Chalmers University of Technology, Sweden and Quviq AB
Testing the hard stuff and staying sane
Even the best test suites can’t entirely prevent nasty surprises: race conditions, unexpected interactions, faults in distributed protocols and so on, still slip past them into production. Yet writing even more tests of the same kind quickly runs into diminishing returns. I’ll talk about new automated techniques that can dramatically improve your testing, letting you focus on what your code should do, rather than which cases should be tested–with plenty of war stories from the likes of Ericsson, Volvo Cars, and Basho Technologies, to show how these new techniques really enable us to nail the hard stuff.
|15:30-16:15||Jan Tretmans, RU and TNO-ESI
Model-Based Testing: From Theory to Practice
We build ever larger and more complex embedded systems. Systematic testing plays an important role in assessing the quality of such systems. Testing, however, turns out to be error-prone, expensive, and laborious. Consequently, better testing methods are needed that can detect more bugs faster and cheaper. Classical test automation helps but only for test execution. Model-based testing (MBT) is a promising technology that enables the next step in test automation by combining automatic test generation and test result analysis with test execution, and providing more, longer, and more diversified test cases with less effort.In the presentation, we first discuss the basic ideas and principles of MBT, with its advantages, pitfalls, and perspectives. Second, we discuss a couple of trends, issues, and challenges for testing high-tech embedded systems, and their implications for testing, such as complexity, distribution, concurrency, uncertainty, and systems-of-systems. Third, we show an MBT approach and an MBT tool dealing with these issues. The MBT approach combines the latest theoretical insights on test generation with high practical applicability and strong modelling capabilities. The MBT tool has been applied to various systems, ranging from smart-cards to large, high-tech embedded systems.
|16:15-17:00||Marcus Gerhold, UT
Model-based testing of Probabilistic Systems
FRIDAY 10 NOVEMBER
|09:30-10:30||Arend Rensink, UT
Abstract Graphs and their Transformation
The data structures built up in pointer programs can for many purposes be viewed as graphs, where the nodes are records and the edges are pointers. The manipulation of that data by a program then corresponds to the transformation of such a graph.By regarding portions of a data graph that are “similar enough” (in a sense to be defined precisely but dependent on the application) as identical, and merely recording how many of each such portion there are rather than their individual interconnections, we can arrive at a finite model that captures the essential characteristics of pointer data. The transformations can then be lifted from concrete graphs to this abstract level, giving rise to an over- or under-approximation of the reachable states that allows a partial prediction of the behaviour of the original pointer program.
In this presentation I given an overview of graph abstraction techniques that have been studied for this purpose, and identify the most promising approaches.
|11:00-11:45||Marko van Eekelen (RU and OU) and Bernard van Gastel (OU)
Towards Practical, Precise and Parametric Energy Analysis of IT Controlled Systems
Energy consumption analysis of software controlled systems can play a major role in minimising the overall energy consumption of such systems both during the development phase and later in the field. We propose such an energy analysis, analysing both software and hardware together, to derive the energy consumption of the system when executing.
The energy analysis has the property of being adjustable both to the required precision and concerning the hardware used.
In principle, this creates the opportunity to analyse which is the best software implementation for given hardware, or the other way around: choose the best hardware for a given algorithm.The precise analysis is introduced for a high level language ECA, that covers the essentials for control systems. Hardware is modelled as a finite state machine, in which the transitions are function calls that are made explicit in the source code. In these model state changes correspond to energy consumption level. For that reason timing is added to the finite state machines. All the transitions and states in the hardware models are annotated with energy consumptions, to account both for time-dependent and for incidental energy consumptions. A prototype of the ECA system has been developed. It has been applied to a small case study. A LUA version is under development.
|11:45-12:30||Rody Kersten (Carnegy Mellon University)
Complexity Vulnerability Analysis using Symbolic Execution
If the worst-case time/space complexity of an algorithm is significantly higher than the respective average case, the algorithm may be vulnerable to a Denial-of-Service attack. In this presentation, a technique for analyzing the algorithmic complexity of programs is presented, based on symbolic execution. The technique uses an efficient guided analysis to compute bounds on the worst-case complexity (for increasing input sizes) and to generate test values that trigger the worst-case behaviors. The resulting bounds are fitted to a function to obtain a prediction of the worst-case program behavior at any input sizes. Comparing these predictions to the programmers’ expectations or to theoretical asymptotic bounds can reveal vulnerabilities or confirm that a program behaves as expected.To achieve scalability, path policies are used to guide symbolic execution towards worst-case paths. The policies are learned from the worst-case results obtained with exhaustive exploration at small input sizes and are applied to guide exploration at larger input sizes, where unguided exhaustive exploration is no longer possible. To achieve precision, we use path policies that take into account the history of choices made along the path when deciding which branch to execute next in the program. Furthermore, the history
computation is context-preserving, meaning that the decision for each branch depends on the history computed with respect to the enclosing method.
The technique is implemented in the Symbolic PathFinder tool. We show experimentally that it can find vulnerabilities in complex Java programs and can outperform established symbolic techniques.